Always good to use the latest 🙂
Hi @<1572032849320611840:profile|HurtRaccoon43> , I'd suggest trying this docker image: nvcr.io/nvidia/pytorch:23.03-py3
Hi @<1734020208089108480:profile|WickedHare16> , you mean for viewing S3 images in the web UI for example?
You can export it in the same shell you run the agent in and that should work for example
export FOO=bar
clearml-agent daemon ...
Hi @<1731483438642368512:profile|LoosePigeon2> , can you add the full log of the run please?
Again, I'm telling you, please look at the documentation and what it says specifically about MinIO-like solutions.
The host should be host: "our-host.com:<PORT>"
And NOT host: "s3.our-host.com"
Maybe you don't require a port, I don't know your setup, but as I said, in the host settings you need to remove the s3. prefix as this is reserved only for AWS S3.
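For reference, a minimal clearml.conf sketch (assuming a MinIO endpoint at our-host.com on port 9000; the key/secret values are placeholders):

sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "our-host.com:9000"  # non-AWS endpoint, no s3. prefix
                    key: "<ACCESS_KEY>"        # placeholder
                    secret: "<SECRET_KEY>"     # placeholder
                    multipart: false
                    secure: false              # set true if the endpoint serves https
                }
            ]
        }
    }
}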
The highlighted line is exactly that. Instead of client.tasks.get_all(), I think it would be along the lines of client.debug.ping()
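Something like this, as a hedged sketch (assuming the APIClient from the clearml package and that the debug.ping endpoint is exposed by your server):

from clearml.backend_api.session.client import APIClient

client = APIClient()
# lightweight server health check instead of listing all tasks
client.debug.ping()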
Then these should be by default killed by the ClearML server after a few hours. How long was it stuck?
Hi @<1743079866477056000:profile|ItchySeahorse7> , there is no such capability currently but it sounds like a good idea, maybe open a GitHub feature request for it?
Hi @<1523701092473376768:profile|SuperiorPanda77> , I think a PR would always be appreciated. I don't see any issues with using task.reset()
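For example, a minimal sketch (the task ID is a placeholder):

from clearml import Task

task = Task.get_task(task_id="<your-task-id>")  # placeholder ID
task.reset()  # clears logs/outputs; the task goes back to a draft-like state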
Is Ubuntu on the client side, or did you change the OS on the server side?
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , what is the version of your ClearML server?
Hi @<1736919317200506880:profile|NastyStarfish19> , the services queue is for running the pipeline controller itself. I guess you are self-hosting the open source server?
@<1706116294329241600:profile|MinuteMouse44> , did you set this in the clearml.conf of the agent as well as the machine of the SDK?
GrievingTurkey78 , do you have iterations stated explicitly somewhere in the script?
And in what section are you setting the environment?
I'm sorry. I think I wrote something wrong. I'll elaborate:
The SDK detects all the packages that are used during the run, and the agent will create a venv and install those packages.
I think there is also an option to specify a requirements file directly in the agent.
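If it helps, a hedged sketch of that agent-side option in clearml.conf (assuming the agent.package_manager.force_repo_requirements_txt setting, which tells the agent to install from the repository's requirements.txt instead of the auto-detected packages):

agent {
    package_manager {
        # install from the repo's requirements.txt rather than the detected packages
        force_repo_requirements_txt: true
    }
}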
Is there a reason you want to install packages from a requirements file instead of just using the automatic detection + agent?
Hi @<1523701842515595264:profile|PleasantOwl46> , I think that is what's happening. If the server is down, the code continues running as if nothing happened, and ClearML will simply cache all results and flush them once the server is back up.
It seems like a networking issue on your side. ClearML isn't blocking anything. Most likely it's unrelated to connection speed and instead related to DNS or something similar.
What if you connect using your phone hotspot or another provider?
Also I think it should start with None
Hi @<1724235687256920064:profile|LonelyFly9> , what is the reason you're getting 503 from the service?
Hi @<1752501940488507392:profile|SquareMoth4> , you have to bring your own compute. ClearML only acts as a control plane allowing you to manage your compute. Why not use AWS for example as a simple solution?
JuicyFox94 , can you please assist? 🙂
Hi @<1743079861380976640:profile|HighKitten20> , what if you try to set the extra flags like this ["-e", "."]
Which container version though?
What is the combination of --storage and configuration that worked in the end?
Hi @<1715900788393381888:profile|BitingSpider17> , you can run the agent in --debug mode and this should pass it over to the internal agent running the code
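For example, something along these lines (a hedged sketch, assuming the top-level --debug flag and a queue named default):

clearml-agent --debug daemon --queue default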
It is part of the Scale/Enterprise SDK; metadata is registered with built-in functions of the SDK. Long story short, you can visualize everything and register whatever format you need. The system is agnostic to any particular format.
Reproduces for me as well. Taking a look at what can be done 🙂
Hi @<1695969549783928832:profile|ObedientTurkey46> , this capability is only covered in the Hyperdatasets feature. There you can both chunk and query specific metadata.