I looked through agent-services logs and found new error I haven't seen before:clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
@<1523701070390366208:profile|CostlyOstrich36>
What agent-services is doing on start up? Seems like something is preventing it from properly working. I already added a command to entrypoint to configure pip.conf since we have to use a trusted mirror to download python packages. Also I managed to connect local agent to ClearML server by using 127.0.0.1 host in credentials. Still no luck with remote agent
@<1526734383564722176:profile|BoredBat47> the agent-services is probably not configured (it needs key and secret to the clearml server to be configured in the docker-compose)
But from what you're saying it seems like the agent simply cannot communicate with the server and what you see is simply the agent waiting indefinitely
Can you try running clearml-agent --debug daemon --foreground
?
@<1523701087100473344:profile|SuccessfulKoala55>
So, I did it with debug and got this stacktrace error:type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
Also, previous problem was in incorrect proxy configuration on agent machine
@<1523701087100473344:profile|SuccessfulKoala55>
I managed to create clearml.conf file with clearml-agent init
after fixing proxy problem. And now trying to run daemon with this conf file. I suspect something is missing from it since request validator fails with missing attribute
@<1523701070390366208:profile|CostlyOstrich36>
Should I leave as is or fill the values in docker-compose for agent-services? I set it to localhost since agent-services is running together with other clearml containers on one machine. Not sure why do you have to fill those values.
CLEARML_HOST_IP: "<my_clearml_server_ip>"
CLEARML_WEB_HOST: " None "
CLEARML_API_HOST: " None "
CLEARML_FILES_HOST: " None "