Answered
General infrastructure question: my company isn't using AWS for training, we have all our GPUs in-house on our own servers, we have a problem where we want on one hand to give all the GPUs to the clearml-agent (i.e. so that they will be available for tasks)

General infrastructure question:
My company isn't using AWS for training; we have all our GPUs in-house on our own servers. On one hand we want to give all the GPUs to the clearml-agent (i.e. so that they are available for tasks), but on the other hand I want to give my developers the chance to develop on GPUs that aren't being used.
The best-case scenario is that the agent wouldn't give out a GPU that already has a task running on it, but I wasn't able to find out whether that's possible. My workaround is that when my developers start their debugging process, I will make a tool for them that restarts the daemon but changes the --gpus argument to only include the GPUs they aren't working on (most of our servers have 2-8 GPUs). In theory it should work, but I'm not sure what happens if, let's say, there is already a task running on GPU 0 when a developer takes the daemon down. Will the task keep running? Is there a way to change which GPUs are visible without taking down the daemon?
Thanks!

  
  
Posted one year ago

Answers 6


If you run an agent in docker mode (--docker), the agent will issue a docker run command and the task will be executed inside a container. In that scenario, I think that if you kill the daemon, the container will stay up and finish the job (I haven't tested this).
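
For reference, here is a minimal sketch of how a docker-mode daemon command could be assembled from a script. The helper name `docker_daemon_command` and the queue name `default` are placeholders, not something from this thread, and whether the container really outlives the daemon is still untested, as noted above.

```python
import shlex


def docker_daemon_command(queue: str, gpus: list[int]) -> list[str]:
    """Build the argv for a clearml-agent daemon in docker mode.

    `queue` and `gpus` are illustrative inputs; adjust them to your setup.
    """
    gpu_arg = ",".join(str(g) for g in gpus)  # e.g. [0, 1] -> "0,1"
    return [
        "clearml-agent", "daemon",
        "--queue", queue,
        "--gpus", gpu_arg,
        "--docker",      # run each task inside a container
        "--detached",    # run the daemon in the background
    ]


# Printable form of the command, handy for logging or copy-pasting.
command_line = shlex.join(docker_daemon_command("default", [0, 1]))
```

The list form can be passed directly to `subprocess.run` if you want the tool to launch the daemon itself.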

  
  
Posted one year ago

I'm sorry, I'm quite new.
Do you mean if the daemon is running inside a docker container, or if the task itself is in a container?
The way I understood it, when I configure the task I set the base docker image and let it run with that.

  
  
Posted one year ago

Unless you're running in docker mode; in that case I think the task will continue running inside the container. Might need to check it.

  
  
Posted one year ago

It will stop running

  
  
Posted one year ago

Hi @ZealousFlamingo93, for remote development on your GPUs you can use clearml-session. Otherwise you would need to spin the daemons up and down.
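
If you go the route of spinning daemons up and down, a tool would first need to know which GPUs are idle. Below is a rough sketch of that step, assuming the two `nvidia-smi` CSV queries shown in the comments; the function names `idle_gpu_indices` and `query_nvidia_smi` are made up for illustration. Keep in mind that, per the replies in this thread, stopping a daemon outside docker mode also stops its running task.

```python
import subprocess


def idle_gpu_indices(gpu_csv: str, apps_csv: str) -> list[int]:
    """Return indices of GPUs with no compute process attached.

    gpu_csv:  output of `nvidia-smi --query-gpu=index,uuid --format=csv,noheader`
    apps_csv: output of `nvidia-smi --query-compute-apps=gpu_uuid,pid --format=csv,noheader`
    """
    # Collect the UUIDs of every GPU that has at least one compute process.
    busy_uuids = set()
    for line in apps_csv.splitlines():
        if line.strip():
            busy_uuids.add(line.split(",")[0].strip())

    # Keep the GPUs whose UUID never showed up in the process list.
    idle = []
    for line in gpu_csv.splitlines():
        if not line.strip():
            continue
        index, uuid = (field.strip() for field in line.split(",", 1))
        if uuid not in busy_uuids:
            idle.append(int(index))
    return sorted(idle)


def query_nvidia_smi(query: str) -> str:
    """Run one nvidia-smi CSV query and return its raw text output."""
    result = subprocess.run(
        ["nvidia-smi", query, "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The resulting indices, joined with commas, are what you would pass to the daemon's `--gpus` argument when restarting it.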

  
  
Posted one year ago

Thanks for the reply!
I'll look into clearml-session,
but let's say for now I stick with spinning the daemons up and down: if I take a daemon down while a task is already running, will it stop the task, or will the task continue to run?

  
  
Posted one year ago