AgitatedDove14 while playing with (and documenting) how to run clearml dockerized on the local machine, I noticed that the yml file https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml contains
CLEARML_API_HOST: http://apiserver:8008
I duplicated this configuration section (agent-services) and adapted it to run the default queue agent with the image allegroai/clearml-agent:latest
I hoped this would give me GPU support, but so far I haven't seen the GPU usage line plot ...
I see an error in the results page when cloning an experiment:
2021-01-24 13:17:18,557 - clearml.metrics - WARNING - Failed uploading to (HTTPConnectionPool(host='apiserver', port=8081): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7f0386f98>: Failed to establish a new connection: [Errno 111] Connection refused',)))
2021-01-24 13:17:18,557 - clearml.metrics - ERROR - Not uploading 1/10 events because the data upload failed
Test set: Average loss: 0.1259, Accuracy: 9599/10000 (96%)
9920512it [00:38, 257096.64it/s]
Might it be that the configuration in the yml is wrong, since it refers to an unknown apiserver URL?
Should it be CLEARML_API_HOST: http://${CLEARML_HOST_IP}:8008 ?
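One way to narrow this down would be to check, from inside the agent container, whether the host and ports that appear in the yml and in the log can be reached at all. Below is a minimal Python sketch using only the standard library. The helper names can_connect and report are hypothetical (they are not part of clearml), and the host/port pairs in the usage note are simply the ones quoted above.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Name resolution failures, refused connections and timeouts all
    raise OSError (or a subclass), so they are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(targets, timeout=2.0):
    """Build one human-readable status line per (host, port) pair."""
    lines = []
    for host, port in targets:
        ok = can_connect(host, port, timeout=timeout)
        status = "reachable" if ok else "NOT reachable"
        lines.append(f"{host}:{port} is {status}")
    return lines
```

Running print("\n".join(report([("apiserver", 8008), ("apiserver", 8081)]))) inside the container would show whether the name apiserver resolves there and which of the two ports accepts connections. If 8008 is reachable but 8081 is not, the API host setting itself is probably fine and the problem is more likely with where uploads are being sent.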