Does clearml-init also have to connect to the ClearML server to finish successfully?
Yes, it verifies the credentials in the same way, and creates a clearml.conf file when done
Also, the previous problem was an incorrect proxy configuration on the agent machine.
Hi BoredBat47 , use the --foreground flag to see the logs 🙂
CostlyOstrich36 Yep, it seems that was the case. I had not provided API credentials in docker compose. I have now, but agent-services just keeps restarting. I looked into the container's logs and it seems to be a proxy error. Why is this container trying to connect somewhere?
What version of clearml and clearml-agent are you using, what OS? Can you add the line you're running for the agent?
CostlyOstrich36 It seems the agent-services container is missing on my server. It's not running. Could that be the issue?
clearml 1.9.0
clearml-agent 1.5.1
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
Console output of clearml-agent init with no clearml.conf:
...
ClearML Hosts configuration:
Web App: None
API: None
File Store: None
Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
Console output of clearml-agent daemon --foreground with clearml.conf created by clearml-init is missing. No output.
...
It behaves as I mentioned before: the terminal jumps to a new line and sits there, with no output after that and nothing happening in the console. But if you go to the UI, you can see that "Last used" keeps updating.
Sorry for bothering but I am really lost, I think I exhausted all my options. I really have no clue what is going on.
@<1523701087100473344:profile|SuccessfulKoala55> I provided following env vars:
CLEARML_HOST_IP: "<my_ip>"
CLEARML_WEB_HOST: "http://<my_ip>:8080"
CLEARML_API_HOST: "http://<my_ip>:8008"
CLEARML_FILES_HOST: "http://<my_ip>:8081"
CLEARML_API_ACCESS_KEY: <my_access_key>
CLEARML_API_SECRET_KEY: <my_secret_key>
Also, I changed the address in the entrypoint from apiserver:8008 to <my_ip>:8008.
Yes, I run both commands from the same place: a dedicated user on my worker machine. Does clearml-init also have to connect to the ClearML server to finish successfully?
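Values like the ones above can be sanity-checked with a few lines of Python. This is only a sketch, not part of ClearML: the function name `check_host_vars` and the rules it applies (no surrounding whitespace, not the literal string None, an http or https scheme, an explicit port) are assumptions chosen for illustration.

```python
from urllib.parse import urlparse

# The three host variables mentioned in this thread.
HOST_VARS = ("CLEARML_WEB_HOST", "CLEARML_API_HOST", "CLEARML_FILES_HOST")


def check_host_vars(env):
    """Return a list of human-readable problems found in the host variables.

    `env` is any mapping of variable name to value, for example os.environ.
    The checks are illustrative assumptions, not rules enforced by ClearML.
    """
    problems = []
    for name in HOST_VARS:
        raw = env.get(name)
        if raw is None:
            problems.append(f"{name} is not set")
            continue
        value = raw.strip()
        if value != raw:
            problems.append(f"{name} has leading or trailing whitespace")
        if value in ("", "None"):
            problems.append(f"{name} has no usable value")
            continue
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https"):
            problems.append(f"{name} is missing an http(s) scheme")
            continue
        try:
            port = parsed.port
        except ValueError:
            problems.append(f"{name} has an invalid port")
            continue
        if port is None:
            problems.append(f"{name} has no explicit port")
    return problems


if __name__ == "__main__":
    import os

    for line in check_host_vars(os.environ) or ["host variables look fine"]:
        print(line)
```

Run it inside the container or shell where the agent starts, so that it sees the same environment the agent does.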
@<1523701087100473344:profile|SuccessfulKoala55>
When I run clearml-agent init, I don't have a clearml.conf file beforehand. I tried running the agent daemon with the clearml.conf created by clearml-init, but that doesn't work since it has no agent section, right? I know I can add it myself, but I think clearml-agent init should work too.
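Whether a given clearml.conf has an agent section can be checked roughly with a text search. The sketch below is an assumption-laden shortcut: it does not parse the file properly, the helper name `has_section` is made up for illustration, and it assumes the file lives at ~/clearml.conf.

```python
import re
from pathlib import Path


def has_section(conf_text, name):
    """Crude check: does a top-level-looking key called `name` appear?

    Matches lines such as `agent {`, `agent: {`, `agent = {` or dotted keys
    like `agent.something = ...`. Commented-out lines are ignored because
    the key must be the first thing on the line.
    """
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s*[.{:=]", re.MULTILINE)
    return bool(pattern.search(conf_text))


if __name__ == "__main__":
    conf_path = Path.home() / "clearml.conf"  # assumed location
    if conf_path.exists():
        text = conf_path.read_text()
        for section in ("api", "sdk", "agent"):
            state = "found" if has_section(text, section) else "missing"
            print(f"{section}: {state}")
    else:
        print(f"{conf_path} does not exist")
```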
@<1523701087100473344:profile|SuccessfulKoala55>
I managed to create the clearml.conf file with clearml-agent init after fixing the proxy problem, and I'm now trying to run the daemon with this conf file. I suspect something is missing from it, since the request validator fails with a missing attribute.
I looked through the agent-services logs and found a new error I haven't seen before:
clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://<my_ip>:8008 ?
Can you try running clearml-agent --debug daemon --foreground ?
Do you have this file in your home folder?
Also, the services agent is not related to regular agent executions.
But from what you're saying it seems like the agent simply cannot communicate with the server and what you see is simply the agent waiting indefinitely
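One way to test that suspicion from the worker machine is a plain TCP connection attempt to the API server's host and port. This is a minimal sketch under assumptions: `can_reach` is a made-up helper, and the URL to test is read from CLEARML_API_HOST. It connects directly and ignores any HTTP proxy, so it only shows whether the port is reachable without one. The proxy settings Python picks up from the environment are printed separately for comparison.

```python
import os
import socket
import urllib.request
from urllib.parse import urlparse


def can_reach(url, timeout=5.0):
    """Return True if a direct TCP connection to the URL's host:port succeeds."""
    parsed = urlparse(url)
    if parsed.hostname is None:
        return False  # no scheme or host in the URL
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    api_url = os.environ.get("CLEARML_API_HOST", "")
    print("API host:", api_url or "(CLEARML_API_HOST not set)")
    print("reachable over TCP:", can_reach(api_url))
    # Proxy settings that Python's urllib reads from the environment.
    print("proxies seen by Python:", urllib.request.getproxies())
```

If the port is reachable directly but the agent still hangs, the proxy settings printed at the end are the next thing to compare between the working local machine and the remote one.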
Sorry, I forgot to mention: I used the command with the --foreground flag. It is the same. The terminal just sits at a new line, no logs, no worker in the UI.
Console output of clearml-agent daemon --foreground ?
Can you please attach the console output again?
@<1523701070390366208:profile|CostlyOstrich36>
What is agent-services doing on startup? It seems like something is preventing it from working properly. I already added a command to the entrypoint to configure pip.conf, since we have to use a trusted mirror to download Python packages. I also managed to connect a local agent to the ClearML server by using the 127.0.0.1 host in the credentials. Still no luck with the remote agent.
Another strange thing is that I can see the credentials being used in the web UI: the "Last used" timestamp is constantly updated to the present time. So apparently the daemon is trying to do something but can't get all the way through launching.
@<1523701070390366208:profile|CostlyOstrich36>
Should I leave these as is or fill in the values in docker-compose for agent-services? I set it to localhost since agent-services runs together with the other clearml containers on one machine. I'm not sure why these values have to be filled in.
CLEARML_HOST_IP: "<my_clearml_server_ip>"
CLEARML_WEB_HOST: " None "
CLEARML_API_HOST: " None "
CLEARML_FILES_HOST: " None "
@<1523701087100473344:profile|SuccessfulKoala55>
So, I ran it with debug and got this stack trace error:
type_checker=validator.TYPE_CHECKER.redefine_many({
AttributeError: type object 'Draft4Validator' has no attribute 'TYPE_CHECKER'
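An AttributeError like this is what one would expect if the installed jsonschema package is older than the release that added the `TYPE_CHECKER` attribute to its validator classes (jsonschema 3.0, as far as I know). That this is the cause here is an assumption, not something confirmed in the thread. The sketch below only shows how to check it in the Python environment the agent runs in; the helper names are made up for illustration.

```python
def supports_type_checker(validator_cls):
    """True if the validator class exposes the TYPE_CHECKER attribute."""
    return hasattr(validator_cls, "TYPE_CHECKER")


def installed_version(package):
    """Best-effort lookup of an installed package version, or 'unknown'."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        return "unknown"
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "unknown"


def report():
    """Describe whether the installed jsonschema has TYPE_CHECKER."""
    try:
        import jsonschema
    except ImportError:
        return "jsonschema is not installed in this environment"
    state = "present" if supports_type_checker(jsonschema.Draft4Validator) else "missing"
    return f"jsonschema {installed_version('jsonschema')}: TYPE_CHECKER {state}"


if __name__ == "__main__":
    print(report())
```

If the attribute is reported as missing, upgrading jsonschema in that environment would be the obvious thing to try, though whether that resolves the agent problem is not established here.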