Sounds like an issue with queueing the experiment. Can you provide a log of the pipeline?
Hi EcstaticMouse10, what was your previous version?
Also, what if you try to specify IDs instead of names?
Hi @<1706116294329241600:profile|MinuteMouse44>, is there any worker listening to the queue?
Hi @<1787653566505160704:profile|EnchantingOctopus35>, what are you running?
Hi @<1533257407419912192:profile|DilapidatedDucks58>, how long did it persist? Can you try upgrading again and see the apiserver logs?
The latest clearml version on GitHub appears to be around 1.16.3.
Hi ConvolutedSealion94, can you please elaborate on what exactly you're trying to do? Also, I'm not sure Preprocess is part of ClearML.
Hi @<1774245260931633152:profile|GloriousGoldfish63>, you can configure it in the volumes section of the fileserver service in the docker-compose.
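For reference, a sketch of the relevant section of the default docker-compose (host paths here follow the standard ClearML deployment; adjust to yours):

```yaml
services:
  fileserver:
    volumes:
      # host path : container path - the left side is what you'd change
      - /opt/clearml/logs:/var/log/clearml
      - /opt/clearml/data/fileserver:/mnt/fileserver
```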
The question is how ClearML knows to create the env, and which files it copies to the task.
Either by automatically detecting the packages in requirements.txt, or by using the packages listed on the task itself.
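If you want to control that explicitly, a minimal sketch (package names/versions are illustrative):

```python
from clearml import Task

# Explicitly record packages on the task instead of relying on auto-detection.
# Note: add_requirements() must be called before Task.init().
Task.add_requirements("torch", "==2.1.0")
Task.add_requirements("pandas")

task = Task.init(project_name="examples", task_name="explicit packages")
```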
Hi AbruptHedgehog21, what are you trying to do when you're getting this message? Are you running a self-hosted server?
Hi FancyTurkey50 , how did you run the agent command?
How did you setup the ClearML server?
You need to follow the instructions here - None
Can you try running with the latest version of clearml?
Also, what if you try using only one GPU with pytorch-lightning? Still nothing is reported - i.e. console/scalars?
Hi @<1706116294329241600:profile|MinuteMouse44>, you can use CLEARML_AGENT_SKIP_PIP_VENV_INSTALL & CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL
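For example, something like this when launching the agent (queue name is illustrative; check the agent docs for the exact semantics on your version):

```bash
# Point the agent at an existing python binary instead of building a venv
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/bin/python3
# Or skip python environment installation entirely
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
clearml-agent daemon --queue default
```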
Regular pytorch - you mean single GPU (I'm not familiar with torch distributed)?
Also, just to give it a try, can you test with only 2 GPUs for example?
GiganticTurtle0, does it pose some sort of problem? What version are you using?
Hi @<1706116294329241600:profile|MinuteMouse44>, are you running in docker mode?
What if you reduce the chunk size using --chunk-size? Does it affect it?
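Assuming this is the clearml-data CLI, something along these lines (dataset ID is a placeholder; chunk size is in MB as far as I recall):

```bash
clearml-data upload --id <dataset_id> --chunk-size 128
```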
Hi NaughtyHorse47 🙂
I guess it really depends on what you want to do.
Specifically for you, if you want the duration of a task you can use Task.get_by_id and look in the response for that info.
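A minimal sketch (the started/completed fields come from the server's task object, so availability may vary with your server version):

```python
from clearml import Task

# Fetch the task and compute duration from the server-side timestamps
task = Task.get_by_id("<task_id>")
if task.data.started and task.data.completed:
    print("Duration:", task.data.completed - task.data.started)
```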
Also, please expand the 500 errors you're seeing; there should be some log.
Hi @<1681111528419364864:profile|SmoothGoldfish52>, it looks like there is a connectivity issue and it's failing to connect to the server. Is there something in between?
Hi CheerfulGorilla72,
What are you logging? Can you provide a small snippet or a screenshot?
Hi AbruptWorm50,
You can check in the UI: is the model missing any data there? Can you download it properly?
Hi David,
What version of ClearML server & SDK are you using?
Hi DistressedKoala73, if you hover your mouse over the plot there should be a download button at the top right of the plot. Also, can you provide a small code snippet to play around with, to see if it reproduces?
Have you run experiments with the SDK, i.e. added Task.init()?
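i.e. something like:

```python
from clearml import Task

# Standard SDK entry point: registers the run on the server and starts
# auto-logging (project/task names here are illustrative)
task = Task.init(project_name="examples", task_name="my experiment")
```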
Hi RoundMosquito25, if you already have an instance of the agent running on a queue and you run clearml-agent execute --id <task_id>, this will create another instance of the agent and run the experiment regardless of what is happening with the other agent.
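For reference, the two modes side by side (queue name is illustrative):

```bash
# Long-lived agent pulling tasks from a queue
clearml-agent daemon --queue default

# One-shot execution of a specific task, independent of any daemon
clearml-agent execute --id <task_id>
```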