Hi, I'm trying out the
clearml-agent on an Azure machine connecting to your managed server. I see the worker on the queue, and the job reaches it - but nothing is run.
I just grabbed the
pytorch mnist train experiment you guys created, and it's stuck in the beginning, it runs:
Executing: ['docker', 'run', '-t', '--gpus', 'all', '-e' ...with all the configs matching my machine, but then on the machine it immediately outputs:
Running Docker: Executing: ('docker', 'run', '-t', '--gpus', 'all', '-e', ... DONE: Running task '3f4525d56bba485791b3af87a3a11684', exit status -1Not sure how to debug this... the command to run the agent I used was:
clearml-agent daemon --queue default --docker --foreground
Can anyone help? Thanks!