@<1523701087100473344:profile|SuccessfulKoala55> I found out that my clearml_agent docker setup was missing some environment variables. What should the value of CLEARML_HOST_IP be?
I think there has been some confusion: I am not trying to run clearml-session or any other session. I am trying to execute a task remotely, for example a clone of the Scalar reporting example. When I clone it and enqueue it for execution, the docker image gets stuck after the pip installations. This works if I connect to app.clear.ml, but it does not work when I connect to my localhost docker deployment.
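A minimal sketch for ruling out one possible cause, offered as an assumption rather than a confirmed diagnosis: inside a docker container, "localhost" refers to the container itself, so a task started by a docker agent may not reach a server that the host sees as localhost. The helper names is_reachable and check_server are hypothetical, and the ports are the ClearML server defaults (8080 web, 8008 API, 8081 files), assumed unchanged in this deployment.

```python
import socket

# Default ClearML server ports; assumed unchanged in this deployment.
DEFAULT_PORTS = {"web": 8080, "api": 8008, "files": 8081}


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_server(host):
    """Check each default ClearML port on the given host."""
    return {name: is_reachable(host, port) for name, port in DEFAULT_PORTS.items()}


if __name__ == "__main__":
    # Run this once on the host and once inside the task container,
    # then compare the results for the same host name.
    for name, ok in check_server("localhost").items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

If the ports are reachable from the host but not from inside the container, the server address in the agent's configuration is a likely place to look.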
However, I did not need to launch it when I was using app.clear.ml. Why is it needed when running things locally?
clearml-session will automatically enqueue a task to the queue you specify
@<1523701087100473344:profile|SuccessfulKoala55> is clearml-session used to initiate a queue?
@<1559711593736966144:profile|SoggyCow20> a ClearML session must be run using the clearml-session command-line tool; you cannot run it manually or enqueue it yourself
One more update: I tried running execute_remotely against app.clear.ml with a runner on my local machine, using docker, and it worked fine.
@<1523701087100473344:profile|SuccessfulKoala55> Update: I also tried running ClearML on my other personal laptop, and I still face the same issue. The docker agent gets stuck after some pip installations, even after being left in that state for 10 minutes. Kindly find the attached log.
this is the command
clearml-agent daemon --cpu-only --docker pytorch/pytorch --queue default
the command to set up the agent is
clearml-agent daemon --docker pytorch_test --queue default --foreground --cpu-only
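The two agent commands quoted above differ in the docker image and in the --foreground flag. A minimal sketch for building the command in one place so that such differences stay visible; build_agent_command is a hypothetical helper, and it uses only the flags that appear in this thread.

```python
import shlex


def build_agent_command(image, queue="default", foreground=True, cpu_only=True):
    """Build the clearml-agent daemon command line as an argv list."""
    cmd = ["clearml-agent", "daemon", "--docker", image, "--queue", queue]
    if foreground:
        # Keep the agent attached to the terminal so its log is visible.
        cmd.append("--foreground")
    if cpu_only:
        cmd.append("--cpu-only")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_agent_command("pytorch_test")))
```

The resulting list can be passed to subprocess.run to start the agent from a script.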
@<1523701087100473344:profile|SuccessfulKoala55> apologies for my late replies. I tried to run it through the web app, and I also tried clearml/examples/advanced/execute_remotely.py
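For context, a minimal sketch of the remote-execution pattern that the execute_remotely example is built around. This is not the example file itself: the function name run_on_queue and the project and task names are placeholders, and it assumes the clearml package is installed and clearml.conf points at the intended server.

```python
def run_on_queue(queue_name="default"):
    """Register a task locally, then hand it over to an agent queue."""
    # Imported here so the sketch can be loaded without clearml installed.
    from clearml import Task

    # Placeholder project and task names.
    task = Task.init(project_name="examples", task_name="remote execution sketch")

    # When run locally, this enqueues the task and exits the local process.
    # When run by an agent, the call does nothing and execution continues.
    task.execute_remotely(queue_name=queue_name, exit_process=True)

    # Reached only when an agent is executing the task.
    print("Running on the agent")


# Usage (needs a reachable ClearML server and an agent on the queue):
# run_on_queue("default")
```

The queue name has to match the one the agent was started with, which is "default" in the commands in this thread.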
@<1559711593736966144:profile|SoggyCow20> how did you run the session task?
@<1523701087100473344:profile|SuccessfulKoala55> I stopped the runner execution with Ctrl+C in the terminal
@<1559711593736966144:profile|SoggyCow20> I see this in the log: "Process terminated by user". How exactly did you run this session task?
Another message it sometimes gets stuck at is:
cp: -r not specified; omitting directory '/tmp/clearml.conf'
Using built-in ClearML default key/secret
and this is the log for it
To provide you with more info, I am setting up ClearML on a Linux server, and the agent is also set up on the same server.
I even tried this on my local macOS machine with an agent on the same machine, and had the same issue.
Hi @<1523701087100473344:profile|SuccessfulKoala55> !
many thanks for your reply!
kindly find the attached log
Hi @<1559711593736966144:profile|SoggyCow20> , can you attach the full log?
The docker image getting stuck has nothing to do with the pip warning; I managed to suppress that through an environment variable.