Hi GiganticTurtle0,
My favorite is ps -ef | grep clearml-agent
and then kill -9 <agent pid>
After doing so, the agent is removed from the list provided by ps -ef | grep clearml-agent, but it is still visible in the ClearML UI and also when I run clearml-agent list
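A minimal sketch of the lookup step described above, for anyone who wants the PIDs without reading the full ps listing. The helper name agent_pids is hypothetical and not part of clearml-agent; it only filters ps -ef style text, assuming the PID is in the second column.

```shell
# Sketch: print the PIDs of clearml-agent processes found in `ps -ef` output.
# agent_pids is a hypothetical helper name, not part of clearml-agent.
agent_pids() {
    # In `ps -ef` output the PID is the second column; skip the grep/awk
    # processes that match only because of their own command line.
    awk '/clearml-agent/ && !/grep/ && !/awk/ { print $2 }'
}

# Usage (commented out so that nothing is killed by accident):
# ps -ef | agent_pids
# kill <agent pid>        # sends SIGTERM, which a process can handle
# kill -9 <agent pid>     # sends SIGKILL, which cannot be handled
```

Whether the agent unregisters itself from the server on SIGTERM is not something this thread confirms, so treat the plain kill as something to try, not a guaranteed fix for the stale entry in clearml-agent list.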
GiganticTurtle0 adding --stop to the exact daemon execution will stop it (meaning that if you have multiple agents on the same machine launched with different parameters, just add --stop to the matching command to retire that specific one)
But how can I reference that exact daemon execution? I tried with the ID but it fails:
clearml-agent daemon AGENT_ID --stop
Hmmm, that is a good use case to have (maybe we should have --stop take an argument?)
Meanwhile you can do:
$ clearml-agent daemon --gpus 0 --queue default
$ clearml-agent daemon --gpus 1 --queue default
then to stop only the second one:
$ clearml-agent daemon --gpus 1 --queue default --stop
wdyt?
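Since the stop command has to repeat the launch arguments exactly, one way to avoid typos is to generate it from those arguments. This is only a sketch: stop_cmd is a hypothetical helper name, not part of clearml-agent, and it just prints the command so it can be reviewed before running.

```shell
# Sketch: derive the stop command from the arguments an agent was launched with.
# stop_cmd is a hypothetical helper name, not part of clearml-agent.
stop_cmd() {
    # "$*" joins all arguments with spaces; --stop is appended at the end.
    printf 'clearml-agent daemon %s --stop\n' "$*"
}

# Print the stop command for the second agent from the example above.
stop_cmd --gpus 1 --queue default
# prints: clearml-agent daemon --gpus 1 --queue default --stop
```

Keeping the launch arguments in one place (a variable or a small script) and reusing them for both start and stop follows the same idea.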
Sure, it would be very intuitive if the command to stop an agent were as easy as: clearml-agent daemon --stop AGENT_PID
GiganticTurtle0 can you please open a GitHub issue with a feature request for clearml-agent? I think this is a great use case!
That would be a very useful feature.
What is the status of that issue? I haven't found it on GitHub.
Maybe this one?
https://github.com/allegroai/clearml/issues/448
I think it is already available (i.e. in 1.1.1)