Answered
Hi, I'm Trying Out The

Hi, I'm trying out the clearml-agent on an Azure machine connecting to your managed server. I see the worker on the queue, and the job reaches it - but nothing is run.
I just grabbed the PyTorch MNIST train experiment you guys created, and it's stuck at the beginning. It runs:
Executing: ['docker', 'run', '-t', '--gpus', 'all', '-e' ...
with all the configs matching my machine, but then on the machine it immediately outputs:
Running Docker: Executing: ('docker', 'run', '-t', '--gpus', 'all', '-e', ... DONE: Running task '3f4525d56bba485791b3af87a3a11684', exit status -1
Not sure how to debug this... The command I used to run the agent was:
clearml-agent daemon --queue default --docker --foreground
Can anyone help? Thanks!

  
  
Posted 3 years ago

Answers 7


In the agent's clearml.conf file, set agent.docker_force_pull to true.
You can also try pulling the image manually on the machine running the ClearML agent:
docker pull nvidia/cuda:10.1-runtime-ubuntu18.04
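If it helps, the manual pull can be wrapped so the failure is explicit rather than buried in the agent log. This is only a rough sketch: the function name pull_agent_image and the DOCKER_BIN override are made up for illustration and are not part of clearml-agent, and the default image is simply the one mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of clearml-agent): pull the image the agent
# is configured to use and say clearly whether the pull worked.
# DOCKER_BIN is an assumed override for the docker executable; it defaults
# to plain "docker".
pull_agent_image() {
    image="${1:-nvidia/cuda:10.1-runtime-ubuntu18.04}"
    docker_bin="${DOCKER_BIN:-docker}"
    if "$docker_bin" pull "$image"; then
        echo "pulled: $image"
    else
        echo "pull failed: $image" >&2
        return 1
    fi
}
```

Calling pull_agent_image with no arguments tries the image from this thread; pass another image name as the first argument to test a different one.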

  
  
Posted 3 years ago

CleanPigeon16, just making sure: is Docker installed and configured on the host machine (i.e. the Azure machine)?

  
  
Posted 3 years ago

CLEARML_DOCKER_IMAGE=nvidia/cuda:10.1-runtime-ubuntu18.04
How do I pull the image using the agent?

  
  
Posted 3 years ago

Which Docker image do you use? Can you try pulling the image manually?

  
  
Posted 3 years ago

Right... apparently the nvidia-docker wasn't set up. Thanks!
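For anyone landing here with the same exit status -1: a quick way to check whether GPU containers work at all on the host, independent of the agent, might look like the sketch below. It reuses the --gpus all flag from the agent log above. The function name check_gpu_docker and the DOCKER_BIN override are hypothetical, and it assumes nvidia-smi is available inside the container once the NVIDIA container runtime is set up.

```shell
#!/bin/sh
# Hypothetical preflight check (not part of clearml-agent): verify that
# docker exists and that a container started with "--gpus all" can see the GPU.
# DOCKER_BIN is an assumed override for the docker executable.
check_gpu_docker() {
    image="${1:-nvidia/cuda:10.1-runtime-ubuntu18.04}"
    docker_bin="${DOCKER_BIN:-docker}"
    if ! command -v "$docker_bin" >/dev/null 2>&1; then
        echo "docker not found"
        return 1
    fi
    # Same GPU flag the agent passes; nvidia-smi should succeed only when
    # the NVIDIA container runtime is installed and working.
    if "$docker_bin" run --rm --gpus all "$image" nvidia-smi >/dev/null 2>&1; then
        echo "gpu containers ok"
    else
        echo "gpu containers failing"
        return 2
    fi
}
```

If check_gpu_docker reports "gpu containers failing", the problem is most likely in the host's Docker/NVIDIA setup rather than in the agent or the experiment.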

  
  
Posted 3 years ago

Nope, the experiment is stuck in the RUNNING state.

  
  
Posted 3 years ago

Hi CleanPigeon16.

Do you get anything in the UI regarding this failure (in the RESULTS -> CONSOLE section)?

  
  
Posted 3 years ago