Hi, can you give the error that is printed out?
From the looks of this example, this should actually be connected automatically:
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
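The gist of it is roughly this (a minimal sketch based on that example, project/task names are placeholders):
```python
import hydra
from omegaconf import OmegaConf
from clearml import Task


@hydra.main(config_path="config_files", config_name="config")
def my_app(cfg):
    # Calling Task.init() inside the Hydra entry point lets ClearML hook Hydra
    # and log the composed OmegaConf config automatically (CONFIGURATION tab)
    task = Task.init(project_name="examples", task_name="hydra example")
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    my_app()
```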
@<1715538373919117312:profile|FoolishToad2> , I think you're missing something. ClearML backend only holds references (links) to artifacts. Actual interaction with storage is done directly via the SDK, aka on the machine running the code
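For example (a sketch only, the bucket path and file name are placeholders):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="artifact upload",
    output_uri="s3://my-bucket/artifacts",  # placeholder destination bucket
)
# The SDK uploads the file directly from this machine to the storage;
# the ClearML backend only keeps the resulting URL as a reference
task.upload_artifact(name="dataset", artifact_object="data.csv")
```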
Hi @<1845635622748819456:profile|PetiteBat98> , metrics/scalars/console logs are not stored on the files server. They are all stored in Elastic/Mongo, so the files server is not required at all. default_output_uri will point all artifacts to your Azure blob.
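Something along these lines (a sketch, account/container are placeholders; setting sdk.development.default_output_uri in clearml.conf does the same thing globally):
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="azure artifacts",
    # placeholder Azure blob destination
    output_uri="azure://<account>.blob.core.windows.net/<container>",
)
```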
Hi @<1547028074090991616:profile|ShaggySwan64> , how are you currently saving models? What framework are you using? Usually the output models are listed in the 'artifacts' section of a task, and on the model side there is the 'lineage' tab to see which task created the model and what other tasks are using it as input.
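If your framework isn't one of the auto-logged ones, you can also register the model explicitly, roughly like this (a sketch, the file name is a placeholder):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="manual model registration")
# Frameworks like PyTorch/TF/scikit-learn are usually captured automatically on save;
# for custom frameworks you can register the weights file yourself
output_model = OutputModel(task=task, framework="custom")
output_model.update_weights(weights_filename="my_model.pkl")
```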
Hi @<1853608151669018624:profile|ColossalSquid53> , if there is no connectivity to the ClearML server, your python script will run regardless. ClearML will cache all logs/events and then flush them once connectivity to the server is restored.
Check the queue, do you have step 1 enqueued?
Hi @<1547028074090991616:profile|ShaggySwan64> , you can try this. However, Elastic takes up space according to the amount of metrics you're saving, so clearing out some older experiments would free up space. What do you think?
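A rough sketch of that kind of cleanup (project name and filter are placeholders, adjust them to whatever you want to keep):
```python
from clearml import Task

# Hypothetical example: remove completed experiments from an old project
# to free up the space taken by their metrics/events
old_tasks = Task.get_tasks(
    project_name="old_experiments",
    task_filter={"status": ["completed"]},
)
for t in old_tasks:
    t.delete()  # by default this also removes the task's artifacts/models
```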
It means that there is an issue with the drivers. I suggest trying this docker image - nvcr.io/nvidia/pytorch:23.04-py3
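If the task runs through an agent in docker mode, you can set that image on the task itself, something like this (a sketch):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="gpu training")
# When executed by an agent in docker mode, the task will run inside this image,
# which ships with matching CUDA libraries
task.set_base_docker(docker_image="nvcr.io/nvidia/pytorch:23.04-py3")
```
or start the agent with `clearml-agent daemon --queue default --docker nvcr.io/nvidia/pytorch:23.04-py3`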
Is it the services docker that comes with the docker compose or did you run your own agent?
Hi @<1523701083040387072:profile|UnevenDolphin73> , I'm not sure I follow: are you trying to set the credentials through the code itself, before calling Task.init()?
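If so, Task.set_credentials() is the way to do it, and it needs to run before Task.init() (values below are placeholders):
```python
from clearml import Task

# Must be called before Task.init(), otherwise the credentials from
# clearml.conf / environment variables are used
Task.set_credentials(
    api_host="https://api.clear.ml",    # placeholder hosts
    web_host="https://app.clear.ml",
    files_host="https://files.clear.ml",
    key="<access_key>",
    secret="<secret_key>",
)
task = Task.init(project_name="examples", task_name="credentials from code")
```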
Hi UnevenDolphin73 ,
I think you need to launch multiple instances to use multiple credentials.
Hi @<1856144871656525824:profile|SparklingFly7> , can you describe the issue you're experiencing? I saw there is a new response in github - None
That's the controller. I would guess that if you fetch the controller you can get its ID as well.
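Something like this, for example (project/task names are placeholders):
```python
from clearml import Task

# Hypothetical sketch: fetch the pipeline controller task and read its ID
controller = Task.get_task(project_name="my_project", task_name="my pipeline controller")
print(controller.id)
```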
Hi DelightfulElephant81 , you mean whether you can self-host ClearML?
Results -> Scalars 🙂
Hi @<1597762318140182528:profile|EnchantingPenguin77> , I don't see any errors related to CUDA in the log
@<1597762318140182528:profile|EnchantingPenguin77> , are you sure you added the correct log? I don't see any errors related to CUDA
I meant that maybe you ran it with a newer version of the SDK
I think this is what you're looking for - None
Hi PerplexedElk26 , It seems you are correct. This capability will be added in the next version of the server.
Hi @<1648134232087728128:profile|AlertFrog99> , I don't think there is an automatic way to do this out of the box but I guess you could write some automation that does that via the API
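For instance, a skeleton along these lines with the APIClient (just a sketch, what it actually does depends on what you're automating; the project ID is a placeholder):
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()
# Hypothetical example: iterate over completed tasks in a project and act on each one
tasks = client.tasks.get_all(project=["<project_id>"], status=["completed"])
for t in tasks:
    print(t.id, t.name)
```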
Hi TartBear70 ,
Did you run the experiment locally first? What versions of clearml/clearml-agent are you using?
Hi @<1673863775326834688:profile|SucculentMole19> , unfortunately the open source version does not support any authentication beyond the user/pass option.
If authentication is key, the enterprise version has full SSO integration, including RBAC.
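For reference, the user/pass option in the open source server is the fixed users setting, roughly like this (in apiserver.conf, values are placeholders):
```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "jane"
                password: "12345678"
                name: "Jane Doe"
            }
        ]
    }
}
```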
Hi @<1855782479961526272:profile|CleanBee5> , I think you're using the old repository.
None is what you need 🙂
Hi @<1523701083040387072:profile|UnevenDolphin73> , can you please elaborate?
Hi @<1523701083040387072:profile|UnevenDolphin73> , not in the open source
Hi @<1736556881964437504:profile|HelplessFly7> , I don't think there is such an integration. Currently Poetry, pip and Conda are supported. I think you could make a PR on this for the clearml-agent
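For reference, switching between the supported ones is just a config value on the agent side, e.g. in clearml.conf:
```
agent {
    package_manager {
        # one of: pip / conda / poetry
        type: poetry
    }
}
```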
Hi RoundMosquito25 , how are you building the pipeline? Is the pipeline controller run locally or on the services queue?
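i.e. roughly the difference between these two (a sketch, assuming the PipelineController interface; names are placeholders):
```python
from clearml.automation import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")
pipe.add_step(name="step_1", base_task_project="examples", base_task_name="step 1 task")

# Run the controller logic on this machine (steps are still sent to their queues)
pipe.start_locally()

# Or enqueue the controller itself so an agent listening on the services queue runs it:
# pipe.start(queue="services")
```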