Hi FranticLobster32, what versions of ClearML, Agent & Hydra are you using?
I see. When you're working with CatBoost, what type of object is being passed?
Hi @<1524922424720625664:profile|TartLeopard58> , projects and many other internals like tasks are all saved in the internal databases of the ClearML server, specifically in MongoDB & Elasticsearch
How did you install your clearml server?
MelancholyElk85, if you're using add_function_step(), it has a 'docker' parameter. You can read more here:
https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller#add_function_step
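A minimal sketch of what that could look like (the project/step names, docker image, and step body below are placeholder assumptions, not from your setup):

```python
def preprocess(dataset_id):
    # hypothetical step body; runs inside the container chosen below
    print(f"processing {dataset_id}")

def build_pipeline():
    # lazy import so this sketch only touches clearml when actually called
    from clearml import PipelineController

    pipe = PipelineController(name="demo-pipeline", project="examples", version="1.0.0")
    pipe.add_function_step(
        name="preprocess",
        function=preprocess,
        function_kwargs=dict(dataset_id="my-dataset"),
        docker="python:3.10-slim",  # per-step docker image
    )
    return pipe
```

Calling build_pipeline() and then pipe.start() would run the step in that image (assuming a configured ClearML environment).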
SubstantialElk6 , do you mean the dataset task version?
Hi @<1638712150060961792:profile|SilkyCrocodile89> , can you please add the log and an example of a failure and of it working?
I mean, what Python version did you initially run it with locally?
Can you please provide a stand alone snippet that reproduces this behavior? Can you provide a log of the run?
I would suggest structuring everything around the Task object. After you clone and enqueue, the agent can handle all the required packages / environment. You can even set environment variables so it won't try to create a new env but instead use the existing one in the docker container.
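For example, when running the agent in docker mode (the image name and queue are placeholders; the two variables are, to the best of my knowledge, clearml-agent settings for skipping env creation):

```shell
# Reuse the python environment already baked into the container instead of
# building a fresh venv / reinstalling packages:
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/bin/python3

# then start the agent in docker mode, e.g.:
# clearml-agent daemon --queue default --docker my-image:latest
```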
Hi @<1526371965655322624:profile|NuttyCamel41> , can you add the full log?
Hi @<1639799308809146368:profile|TritePigeon86> , do you mean that in order to initialize machines in EC2 you need to provide some external IP, or that you need to pass the external IP as a parameter in order for the job to run?
By applications I mean the applications (HPO, Autoscalers, ...). Regarding the web UI - it's sending API calls as you browse. You can open the dev tools (F12) to see the requests going out (filter by XHR in the Network tab)
The worker by default checks the backend every 5 seconds for new tasks in the queue. While running a task, I think it basically sends whatever API calls a regular local task sends
For example artifacts or debug samples
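Rough sketch of that polling behaviour (not the agent's actual code; `fetch_next_task` here stands in for the queue API call):

```python
import time

def poll_queue(fetch_next_task, run_task, interval=5.0, max_polls=None):
    """Toy model of an agent worker: ask the backend for the next queued
    task every `interval` seconds, and run it when one appears."""
    polls = 0
    while max_polls is None or polls < max_polls:
        task = fetch_next_task()
        if task is not None:
            run_task(task)
        else:
            time.sleep(interval)  # nothing queued; wait before polling again
        polls += 1

# usage with a fake in-memory queue
queue = ["task-1", "task-2"]
done = []
poll_queue(lambda: queue.pop(0) if queue else None, done.append,
           interval=0.01, max_polls=3)
print(done)  # → ['task-1', 'task-2']
```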
VexedCat68 I think this will be right up your alley 🙂
https://github.com/allegroai/clearml/blob/master/examples/reporting/hyper_parameters.py#L43
StaleButterfly40 , alternatively you could use auto_connect_frameworks=False
https://clear.ml/docs/latest/docs/references/sdk/task#taskinit
So torch.save() won't automatically save the model; however, you will not get the scalars/metrics automatically either.
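For example, disabling only the PyTorch binding (project/task names are placeholders; the import is lazy so this sketch doesn't require a configured server at import time):

```python
def init_without_pytorch_logging():
    # requires clearml installed and a configured server when actually called
    from clearml import Task

    # Disable only the PyTorch binding so torch.save() checkpoints are not
    # captured; passing auto_connect_frameworks=False disables all of them.
    return Task.init(
        project_name="examples",
        task_name="manual-logging",
        auto_connect_frameworks={"pytorch": False},
    )
```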
Hi @<1557174909573009408:profile|LargeOctopus32> , I suggest you go through the introduction videos on ClearML's YouTube channel
@<1546303277010784256:profile|LivelyBadger26> , it is Nathan Belmore's thread just above yours in the community channel 🙂
Hi @<1535793988726951936:profile|YummyElephant76> , did you use Task.add_requirements?
Hi @<1523701062857396224:profile|AttractiveShrimp45> , I think this is currently by design. How would you suggest doing multi-metric optimization - prioritizing between metrics after a certain threshold is met?
RotundHedgehog76,
What do you mean regarding language? If I'm not mistaken, ClearML should include Optuna args as well.
Also, what do you mean by commit hash? ClearML logs the commit itself, but this can be changed by editing
Yeah, but how are iterations marked in the script?
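For context, iterations are usually marked explicitly via the `iteration` argument of Logger.report_scalar; a sketch (titles/values are placeholders):

```python
def report_losses(task, losses):
    # `task` is a clearml Task; each explicit `iteration` value sets the
    # x-axis position of the reported point in the scalars plot
    logger = task.get_logger()
    for i, loss in enumerate(losses):
        logger.report_scalar(title="train", series="loss", value=loss, iteration=i)
```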
Hi ImmenseMole41, so your issue is specifically when trying to download compressed CSV files? You mentioned that the values are correct when downloading via the StorageManager. Do you get corrupted values somewhere?
Also, how are you saving these CSV files?
Hi @<1595587997728772096:profile|MuddyRobin9> , does the step fail or does it just print this error?
Hi RobustRat47 , what if you run them as sub-processes?
You can use Task.set_base_docker
To specify arguments, there is an example there 🙂
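A small sketch of set_base_docker (the image and extra flags below are just example values):

```python
def configure_container(task):
    # `task` is a clearml Task; choose the docker image and extra
    # `docker run` arguments the agent should use when executing it
    task.set_base_docker(
        docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
        docker_arguments="--ipc=host",
    )
```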