Hi OutrageousSheep60, do you mean to make it disappear from the UI?
yes - and removed from clearml-agent list
When you stop a daemon service, it will stop reporting to the server. There's a timeout of 10 min, after which the daemon is no longer displayed in the server
by the way, if you stop a daemon in an orderly way, it should remove itself, I think...
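To picture the timeout described above, here is a toy model in Python of a server that hides workers whose last report is older than a cutoff. It is only a sketch under assumptions, not the ClearML server's actual code: `WorkerRegistry`, `report`, `unregister` and `visible` are hypothetical names, and the 600-second default simply mirrors the "10 min" quoted in this thread rather than a verified server setting.

```python
import time
from typing import Dict, List, Optional

TIMEOUT_SEC = 10 * 60  # assumption: mirrors the "10 min" mentioned in this thread


class WorkerRegistry:
    """Toy model: a worker is listed only while its last report is recent."""

    def __init__(self, timeout_sec: float = TIMEOUT_SEC) -> None:
        self.timeout_sec = timeout_sec
        self._last_seen: Dict[str, float] = {}

    def report(self, worker_id: str, now: Optional[float] = None) -> None:
        # A running daemon keeps calling this; a stopped one goes silent.
        self._last_seen[worker_id] = time.time() if now is None else now

    def unregister(self, worker_id: str) -> None:
        # Orderly shutdown: the worker removes itself right away.
        self._last_seen.pop(worker_id, None)

    def visible(self, now: Optional[float] = None) -> List[str]:
        # Only workers that reported within the timeout window are listed.
        now = time.time() if now is None else now
        return sorted(
            w for w, seen in self._last_seen.items()
            if now - seen < self.timeout_sec
        )
```

In this model, a worker that still shows up long after the cutoff must still be reporting from somewhere.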
Strange
I ran clearml-agent daemon --stop
and after 10 min I ran clearml-agent list
and I still see a worker
OutrageousSheep60, what version of ClearML-Agent are you using?
Do you have any other workers running?
not sure I understand
running clearml-agent list
I get
`
workers:
- company:
    id: d1bd92...1e52b
    name: clearml
  id: clearml-server-...wdh:0
  ip: x.x.x.x
  ...
`
Also, what version are you on?
of what?
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
Can I assume you're running the agent (in daemon mode) on the same machine where you're running the clearml-agent daemon --stop command?
Can you try upgrading to the latest agent version? pip install -U clearml-agent
Also, can you verify that you still have the clearml-agent process running? top / htop
we reinstalled the clearml-agent
$ clearml-agent --version
CLEARML-AGENT version 1.2.3
running top | grep clearml
we can see the agent running
running clearml-agent list
we can see 2 workers
before running clearml-agent daemon --stop
We updated the clearml.conf and updated the worker_id and worker_name with the relevant name/id that we can see from clearml-agent list
and we get: Could not find a running clearml-agent instance with worker_name=<clearml_worker_name> worker_id=<clearml_worker_id:0>
As we understand it, --stop without any IDs should stop all the workers.
waited 10 min
running top | grep clearml
we can see the clearml-agent running
running clearml-agent list
we can see the 2 workers
Please advise on how to remove a worker
Can you try with blank worker_id/worker_name in your clearml.conf (basically how it was before)?
You can force kill the agent using kill -9 <process_id>, but clearml-agent daemon --stop should work.
Also, can you verify that one of the daemons is the clearml-services daemon? This one should be running from inside a docker on your server machine (I'm guessing you're self hosting - correct?).
updated the clearml.conf with empty worker_id/name
ran clearml-agent daemon --stop
ran top | grep clearml
Killed the pids
ran clearml-agent list
still both of the workers are listed
Did you wait ~10-15 mins for it to time out?
If you killed all processes directly, there can't be any workers on that machine. It means that these two workers are running somewhere else...
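One way to double-check the claim above is to scan the local process table rather than eyeballing top. The sketch below is a hedged illustration in Python: `find_agent_pids` and `local_agent_pids` are hypothetical helper names, the substring "clearml" mirrors the `grep clearml` used in this thread, and it assumes a Linux-style `ps` that accepts `-eo pid,args`.

```python
import os
import subprocess
from typing import List, Optional


def find_agent_pids(ps_output: str, needle: str = "clearml",
                    skip_pid: Optional[int] = None) -> List[int]:
    """Return PIDs whose command line contains `needle`.

    Expects text in the shape produced by `ps -eo pid,args`:
    a header line, then one "<pid> <command line>" row per process.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # ignore blank or malformed rows
        pid, args = int(parts[0]), parts[1]
        if needle in args and pid != skip_pid:
            pids.append(pid)
    return pids


def local_agent_pids() -> List[int]:
    # Assumption: a Linux-style `ps` is on PATH. This script's own PID is
    # excluded in case its command line happens to contain the needle.
    out = subprocess.run(["ps", "-eo", "pid,args"], capture_output=True,
                         text=True, check=True).stdout
    return find_agent_pids(out, skip_pid=os.getpid())
```

If `local_agent_pids()` returns an empty list while clearml-agent list still shows workers, that would support the point above: those workers are reporting from another machine or container.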
Well, it seems that we have a similar issue: https://github.com/allegroai/clearml-agent/issues/86
currently we are just creating a new worker on a separate queue