Do we even have an option to assign an ID to each agent? https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_daemon
OK, I think what you need to do is scale up the number of apiserver worker processes: pass the
CLEARML_USE_GUNICORN=1 environment variable to the apiserver service. This should start 8 processes (by default) instead of one; see if it helps. By the way, while the number of processes can be set even higher, I assume that at some point you'll start having issues with load on the elasticsearch service, which is not that easy to scale up.
SuccessfulKoala55 We are encountering a strange problem. We are spinning up N agents from a script, in a loop,
but not all agents are visible as workers (we check this both in the UI and by running
workers_list = client.workers.get_all() ).
Do you think it is possible that too many of them are connecting at once, and that we could solve this by adding a delay between starting subsequent agents?
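For reference, a minimal sketch of what such a staggered launch plus a registration check could look like. It is not our exact script: the helper names (agent_command, agent_env, launch_agents, missing_workers, registered_worker_ids), the "hpo-agent" prefix and the 2-second delay are made up for illustration, and it assumes that clearml-agent picks up an explicit worker ID from the CLEARML_WORKER_ID environment variable, which should be verified against the agent documentation.

```python
import os
import subprocess
import time


def agent_command(queue):
    """Command line for one detached agent listening on `queue`."""
    return ["clearml-agent", "daemon", "--queue", queue, "--detached"]


def agent_env(index, prefix="hpo-agent"):
    """Environment for agent number `index`, carrying an explicit worker ID.

    Assumption: clearml-agent reads the worker ID from CLEARML_WORKER_ID.
    The "hpo-agent" prefix is a hypothetical naming scheme.
    """
    env = dict(os.environ)
    env["CLEARML_WORKER_ID"] = f"{prefix}:{index}"
    return env


def launch_agents(n, queue, delay_sec=2.0, run=subprocess.Popen, sleep=time.sleep):
    """Start `n` agents one by one, pausing `delay_sec` between launches.

    Returns the worker IDs that were requested, in launch order.
    """
    expected = []
    for i in range(n):
        env = agent_env(i)
        run(agent_command(queue), env=env)
        expected.append(env["CLEARML_WORKER_ID"])
        if i < n - 1:  # no need to wait after the last agent
            sleep(delay_sec)
    return expected


def missing_workers(expected_ids, registered_ids):
    """Worker IDs that were launched but are not registered on the server."""
    return sorted(set(expected_ids) - set(registered_ids))


def registered_worker_ids():
    """IDs of the workers the server currently knows about.

    Needs the clearml package and a configured clearml.conf.
    """
    from clearml.backend_api.session.client import APIClient

    client = APIClient()
    return [worker.id for worker in client.workers.get_all()]
```

With something like this, calling launch_agents(N, "default") and then missing_workers(expected, registered_worker_ids()) after a short wait would show exactly which agents never registered, which might help tell a connection burst apart from an ID collision.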
SuccessfulKoala55 hmm, we are trying to do something like that and we are running into problems. We are doing a big hyperparameter optimization on 200 workers and some tasks are failing (while with fewer workers they do not fail). The UI also has some problems with that. Maybe there are some settings that should be changed compared to the default configuration?