aws {
s3 {
# S3 credentials, used for read/write access by various SDK elements
# The following settings will be used for any bucket not specified below in the "credentials" section
# ---------------------------------------------------------------------------------------------------
region: "ae-ad-1"
# Specify explicit keys
key: "123456"
secret: "123456"
bucket: "s3://inference"
...
Ohh! now it works! thank you so much @<1523701205467926528:profile|AgitatedDove14> !!
In the command, I specify the following:
--storage s3//:inference
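As written, the storage value above is `s3//:inference`, while the bucket in the config is addressed as `s3://inference`. A minimal sketch of a sanity check for such values, using only the Python standard library; the helper name `looks_like_s3_uri` is hypothetical and is not part of ClearML:

```python
from urllib.parse import urlparse


def looks_like_s3_uri(uri: str) -> bool:
    """Return True if `uri` parses as s3://<bucket>[/<prefix>].

    Hypothetical helper for illustration only; it checks the string's
    shape and does not contact any storage service.
    """
    parsed = urlparse(uri)
    # A well-formed S3 URI has the "s3" scheme and a non-empty bucket name,
    # which urlparse exposes as the network location.
    return parsed.scheme == "s3" and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("s3://inference", "s3//:inference"):
        print(candidate, "->", looks_like_s3_uri(candidate))
```

Running a check like this before passing the value on the command line catches a misplaced colon early, assuming the tool expects the usual `s3://bucket` form.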
@<1523701087100473344:profile|SuccessfulKoala55> Update: I also tried running ClearML on my other personal laptop, and I still face the same issue. The docker agent gets stuck after some pip installations, even after being left to run for 10 minutes in that state. Kindly find the attached log.
This is the command:
clearml-agent daemon --cpu-only --docker pytorch/pytorch --queue default
Another message it sometimes gets stuck at is:
cp: -r not specified; omitting directory '/tmp/clearml.conf'
Using built-in ClearML default key/secret
and this is the log for it
@<1523701087100473344:profile|SuccessfulKoala55> I stopped the runner execution with Ctrl+C in the terminal.
I guess there must have been some confusion. I am not trying to run a clearml-session; I am trying to execute a task remotely, such as a clone of the Scalar reporting example. When I clone it and enqueue it, the docker image gets stuck after the pip installations. This works if I connect to app.clear.ml, but not when I connect to the localhost docker deployment.
One more update: I tried running execute_remotely from app.clear.ml with a runner on my local machine and with docker, and it worked fine.
@<1523701087100473344:profile|SuccessfulKoala55> is clearml-session used to initiate a queue?
Many thanks for your quick response @<1523701205467926528:profile|AgitatedDove14> !
Oh, that was a copy-paste mistake inside the clearml.conf!
now it works!
To provide you with more info, I am setting up ClearML on a Linux server, and the agent is also set up on the same server.
I even tried this on a local macOS machine with an agent on the same machine, and had the same issue.
The docker image getting stuck has nothing to do with the pip warning; I managed to suppress that through an environment variable.
The command to set up the agent is:
clearml-agent daemon --docker pytorch_test --queue default --foreground --cpu-only
@<1523701087100473344:profile|SuccessfulKoala55> apologies for my late replies. I tried to run it through the frontend web app, and I also tried using clearml/examples/advanced/execute_remotely.py
Hi @<1523701087100473344:profile|SuccessfulKoala55> !
many thanks for your reply!
kindly find the attached log
However, I did not need to launch it when I was using app.clear.ml. Why is it needed when running things locally?
@<1523701087100473344:profile|SuccessfulKoala55> I found out I had some issues with the clearml_agent docker setup: it was lacking some environment variables. What should be the value for CLEARML_HOST_IP?