Answered

Hi everyone!
I'm having trouble executing tasks remotely from the web UI using ClearML Agent in Docker mode (with a self-hosted server on a Windows computer).
I enqueued a training task from the web UI, and the Docker container is always "running", but the actual task is not progressing at all.
The command I used to run the agent is: clearml-agent daemon --queue default --docker

I'm using a YOLOv5 model, so the default Docker image is ultralytics/yolov5:latest.
Here are the config and log files.
I'd really appreciate your help!

  
  
Posted 3 months ago

Answers 5


Sorry for the late reply.
I believe this is why it's not working (from the console log):

adfba156d16e: Pull complete
Digest: sha256:0ce15c07d55860dfd2eeae535c42d85383a664821da5ff18d10448b5a2993e5a
Status: Downloaded newer image for ultralytics/yolov5:latest
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
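
The last log line says the loader could not open libnvidia-ml.so.1, the NVIDIA management library. A quick way to see whether that library is visible at all is to try loading it from Python in the Linux environment where the Docker daemon runs. This is only a minimal sketch: the helper names are hypothetical, and the thread does not say what the eventual fix was, so treat a False result as a hint and not as a diagnosis.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False
    return True


def nvml_available() -> bool:
    """Check for the library named in the error message above."""
    return can_load_shared_library("libnvidia-ml.so.1")
```

Calling nvml_available() returns False when the loader cannot find the library, which would be consistent with the nvidia-container-cli error in the log.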
  
  
Posted 3 months ago

Hi @DangerousStarfish38, it looks like an issue with Docker on your machine. Are you able to run that container manually?
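
One way to try running the container manually is sketched below. It assumes the agent requests GPU access when it starts the container (the nvidia-container-cli hook in the log suggests so) and uses nvidia-smi as a probe command. The function names and the choice of probe are assumptions for illustration, not what clearml-agent itself executes.

```python
import subprocess
from typing import List, Sequence


def build_manual_run_command(
    image: str = "ultralytics/yolov5:latest",
    gpus: bool = True,
    command: Sequence[str] = ("nvidia-smi",),
) -> List[str]:
    """Build a `docker run` invocation that starts the image once and exits."""
    cmd = ["docker", "run", "--rm"]  # --rm removes the container afterwards
    if gpus:
        cmd += ["--gpus", "all"]  # request all GPUs, triggering the NVIDIA hook
    cmd.append(image)
    cmd.extend(command)
    return cmd


def run_manually(
    image: str = "ultralytics/yolov5:latest", gpus: bool = True
) -> subprocess.CompletedProcess:
    """Run the probe container and capture its output for inspection."""
    return subprocess.run(
        build_manual_run_command(image, gpus),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is the information we want
    )
```

If run_manually() returns a non-zero returncode and its stderr shows the same libnvidia-ml.so.1 message, the problem can be reproduced outside the agent. Building the command with gpus=False and a different probe command shows whether the image starts at all without GPU access.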

  
  
Posted 3 months ago

You mean it simply hangs?

  
  
Posted 3 months ago

I've figured out what's wrong and fixed it! Thanks!

  
  
Posted 3 months ago

@DangerousStarfish38, can you get the logs for the Docker container?

  
  
Posted 3 months ago