Another message it sometimes gets stuck at is
cp: -r not specified; omitting directory '/tmp/clearml.conf'
Using built-in ClearML default key/secret
and this is the log for it
@<1559711593736966144:profile|SoggyCow20> clearml-session must be run using the clearml-session
command line tool; you cannot run it manually or enqueue it yourself
@<1559711593736966144:profile|SoggyCow20> how did you run the session task?
clearml-session will automatically enqueue a task to the queue you specify
I guess there must have been some confusion. I am not trying to run clearml-session or any session; I am trying to execute a task remotely, such as a clone of the Scalar reporting example. When I clone it and enqueue it to a queue, the Docker container gets stuck after the pip installations. This works if I connect to app.clear.ml, but does not work when I connect to my localhost Docker deployment.
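For reference, the clone-and-enqueue flow described here can also be driven from the SDK instead of the web app. This is only a minimal sketch: the project name `examples`, the task name `Scalar reporting`, and the queue `default` are assumptions to adjust for your setup, and `clone_and_enqueue` is a hypothetical helper name, not part of ClearML.

```python
def clone_and_enqueue(project_name="examples",
                      task_name="Scalar reporting",
                      queue_name="default"):
    """Clone an existing task and push the clone to a queue.

    The default names are assumptions for illustration only.
    """
    # Imported lazily so the module loads even where clearml is not installed.
    from clearml import Task

    # Look up the source task by project and name.
    source = Task.get_task(project_name=project_name, task_name=task_name)

    # Create an editable copy; the name mirrors what the web app generates.
    cloned = Task.clone(source_task=source, name="Clone of {}".format(source.name))

    # Hand the clone to the queue that the agent is listening on.
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

Calling `clone_and_enqueue()` needs a reachable server configured in `clearml.conf`, and an agent listening on the same queue to pick the task up.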
@<1559711593736966144:profile|SoggyCow20> I see this in the log: "Process terminated by user"
- how exactly did you run this session task?
Hi @<1559711593736966144:profile|SoggyCow20> , can you attach the full log?
the command to set up the agent is
clearml-agent daemon --docker pytorch_test --queue default --foreground --cpu-only
@<1523701087100473344:profile|SuccessfulKoala55> Update: I also tried running ClearML on my other personal laptop, and I still face the same issue. The Docker agent gets stuck after some pip installations, even after being left in that state for 10 minutes. Kindly find the attached log.
this is the command
clearml-agent daemon --cpu-only --docker pytorch/pytorch --queue default
Hi @<1523701087100473344:profile|SuccessfulKoala55> !
many thanks for your reply!
kindly find the attached log
To provide you with more info, I am setting up ClearML on a Linux server, and the agent is also set up on the same server.
I even tried this on a local macOS machine with an agent on the same machine, and had the same issue.
@<1523701087100473344:profile|SuccessfulKoala55> I stopped the runner execution with Ctrl+C in the terminal
@<1523701087100473344:profile|SuccessfulKoala55> I found out I had some issues with the clearml_agent docker setup, lacking some environment variables. What should be the value for CLEARML_HOST_IP?
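As far as I understand, `CLEARML_HOST_IP` in the server's docker-compose setup is meant to be the address of the machine hosting the ClearML server, as reachable from the agent container (so not `localhost` or `127.0.0.1`). That reading is an assumption worth checking against the server documentation. A minimal sketch for finding the host's primary IPv4 address follows; the probe address `8.8.8.8` and the helper name `primary_ipv4` are hypothetical choices for illustration.

```python
import socket


def primary_ipv4(probe_host="8.8.8.8", probe_port=80):
    """Return the local IPv4 address used for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the OS
    which local interface it would route through. The probe address
    is an arbitrary assumption.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_host, probe_port))
            return sock.getsockname()[0]
        except OSError:
            # No route available (e.g. offline machine): fall back to loopback.
            return "127.0.0.1"


if __name__ == "__main__":
    # Prints a line you could paste into the shell before starting the server.
    print("export CLEARML_HOST_IP={}".format(primary_ipv4()))
```

If this prints the loopback address, the machine has no outbound route and the value has to be set by hand.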
One more update: I tried running execute_remotely against app.clear.ml with a runner on my local machine and with Docker, and it worked fine.
The Docker container getting stuck has nothing to do with the pip warning; I managed to suppress that through an environment variable.
I did not need to launch it, however, when I was using app.clear.ml. Why is it needed when running things locally?
@<1523701087100473344:profile|SuccessfulKoala55> is clearml-session used to initiate a queue?
@<1523701087100473344:profile|SuccessfulKoala55> Apologies for my late replies. I tried to run it through the web app frontend, and I also tried clearml/examples/advanced/execute_remotely.py
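For anyone following along, the remote-execution pattern that script demonstrates looks roughly like the sketch below. It is not a copy of the example file. The project name, task name, queue name, and the helper name `report_remotely` are all assumptions for illustration.

```python
def report_remotely(queue_name="default"):
    """Register a task locally, then hand execution over to an agent.

    All names used here are illustrative assumptions.
    """
    # Imported lazily so the module loads even where clearml is not installed.
    from clearml import Task

    task = Task.init(project_name="examples",
                     task_name="execute_remotely sketch")

    # When run locally, this enqueues the task and exits the local process.
    # When the agent runs the same script, the call is skipped and
    # execution continues below.
    task.execute_remotely(queue_name=queue_name, exit_process=True)

    # From here on, the code runs on the agent's side.
    logger = task.get_logger()
    for iteration in range(10):
        logger.report_scalar(title="demo", series="value",
                             value=iteration * 0.5, iteration=iteration)
```

If the agent picks the task up but stalls after the pip installations, the task's console log on the server is the first place to look, since the local process has already exited by then.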