Hi BoredBat47 , use the --foreground tag to see the logs 🙂
Sorry, forgot to mention. I used the command with --foreground tag. It is the same. Terminal just sits at a new line, no logs, no worker in UI
The strange thing also is that I see that the credentials are being used in web UI: last used timestamp is updated constantly to present time. So apparently daemon is trying to do something but can't launch properly all the way
What version of clearml
and clearml-agent
are you using, what OS? Can you add the line you're running for the agent?
clearml 1.9.0
clearml-agent 1.5.1
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
~/.local/bin/clearml-agent daemon --foreground
CostlyOstrich36 Seems like on my server agent-services container is missing. It's not running. Could it be the issue?
CostlyOstrich36 Yep, it seems it was the case. I did not provide credentials for API in docker compose. I did that but now agent-services just keeps restarting. I looked into containers logs and it seems to be a proxy error. Why this container is trying to connect somewhere?
Also services agent is not related to regular agent executions
CostlyOstrich36 Am I right I should also provide this URLS in agent-services section in docker-compose file?
CLEARML_HOST_IP: ${CLEARML_HOST_IP:-}
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST: http://apiserver:8008
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 100k 100 100k 0 0 10236 0 0:00:10 0:00:10 --:--:-- 21354 Warning: Transient problem: HTTP error Will retry in 10 seconds. 10 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 21345 Warning: Transient problem: HTTP error Will retry in 10 seconds. 9 retries Warning: left. 100 100k 100 100k 0 0 10238 0 0:00:10 0:00:10 --:--:-- 21345 Warning: Transient problem: HTTP error Will retry in 10 seconds. 8 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 26965 Warning: Transient problem: HTTP error Will retry in 10 seconds. 7 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 26958 Warning: Transient problem: HTTP error Will retry in 10 seconds. 6 retries Warning: left. 100 100k 100 100k 0 0 10236 0 0:00:10 0:00:10 --:--:-- 26951 Warning: Transient problem: HTTP error Will retry in 10 seconds. 5 retries Warning: left. 100 100k 100 100k 0 0 10236 0 0:00:10 0:00:10 --:--:-- 26958 Warning: Transient problem: HTTP error Will retry in 10 seconds. 4 retries Warning: left. 100 100k 100 100k 0 0 10235 0 0:00:10 0:00:10 --:--:-- 26951 Warning: Transient problem: HTTP error Will retry in 10 seconds. 3 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 26965 Warning: Transient problem: HTTP error Will retry in 10 seconds. 2 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 26965 Warning: Transient problem: HTTP error Will retry in 10 seconds. 1 retries Warning: left. 100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 26965
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead:
BoredBat47 what did you provide in the docker-compose to the services agent?
Also, you said that clearml-init
worked but clearml-agent init
did not - did you run both from the same place?
@<1523701087100473344:profile|SuccessfulKoala55> I provided following env vars:
CLEARML_HOST_IP: "<my_ip>"
CLEARML_WEB_HOST: " http://<my_ip>:8080 "
CLEARML_API_HOST: " http://<my_ip>:8008 "
CLEARML_FILES_HOST: " http://<my_ip>:8081 "
CLEARML_API_ACCESS_KEY: <my_access_key>
CLEARML_API_SECRET_KEY: <my_secret_key>
also I changed IP in entrypoint from apiserver:8008 to <my_ip>:8008
Yes, I run both commands from the same place — dedicated user on my worker machine. Is clearml-init also has to connect to the ClearML server to successfully finish?
Sorry for bothering but I am really lost, I think I exhausted all my options. I really have no clue what is going on.
Is clearml-init also has to connect to the ClearML server to successfully finish?
Yes, it verifies the credentials in the same way, and creates a clearml.conf file when done
do you have this file in your home folder?
@<1523701087100473344:profile|SuccessfulKoala55>
When I run clearml-agent init
I don't have a file prior to this. I tried running agent daemon with clearml.conf
created by clearml-init
but that doesn't work since it has no agent section, right? I know I can add it myself but I think clearml-agent init
should function too
Actually the agent will use the default values for the agent section if you have a clearml.init file - what do you get if you run the agent like that?
It works like I mentioned before: the terminal jumps on a new line and sits there, no output after that, nothing is happening in the console. But if you go to UI you see that "Last used" is updating
Can you please attached the console output again?
Console output of clearml-agent daemon --foreground
?
Console output of clearml-agent init
with no clearml.conf:
...ClearML Hosts configuration:
Web App:
NoneAPI:
NoneFile Store:
None
Verifying credentials ...
Error: could not verify credentials: key=ak secret=sk
...
Console output of clearml-agent daemon --foreground
with clearml.conf created by clearml-init
is missing. No output.
...