Answered
General infrastructure question: my company isn't using AWS for training, we have all our GPUs in-house on our own servers, and we want on one hand to give all the GPUs to the clearml-agent (i.e. so that they will be available for tasks)

General infrastructure question:
My company isn't using AWS for training; we have all our GPUs in-house on our own servers. On one hand we want to give all the GPUs to the clearml-agent (i.e. so that they will be available for tasks), but on the other hand I want to give my developers the chance to develop on GPUs that aren't being used.
The best-case scenario is that the agent wouldn't give out a GPU that has a task running on it, but I wasn't able to find out whether that's possible. My workaround is to make a tool for my developers that, when they start their debugging process, restarts the daemon with the --gpus argument changed to include only the GPUs they aren't working on (most of our servers have 2-8 GPUs). In theory it should work, but I'm not sure what happens if, let's say, there is already a task running on GPU0 when a developer takes the daemon down. Will the task keep running? Is there a way to change which GPUs are visible without taking down the daemon?
thanks!

  
  
Posted 8 months ago

Answers 6


Hi @<1612982606469533696:profile|ZealousFlamingo93>, for remote development on your GPUs you can use clearml-session. Otherwise you would need to spin the daemons up and down.
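For the "spin the daemons up and down" route, the tool described in the question could look roughly like the sketch below. It is only an illustration under assumptions: the queue name `default`, the helper names, and the idea of passing in the GPUs a developer has reserved are all hypothetical. The only `clearml-agent daemon` flags used are `--queue`, `--gpus`, `--detached` and `--stop`; whether a running task survives the stop is exactly the open question in this thread, so test that before relying on it.

```python
import subprocess


def gpus_for_agent(all_gpus, reserved):
    """Return the GPU indices the agent may use: everything not reserved by a developer."""
    return sorted(set(all_gpus) - set(reserved))


def daemon_command(gpus, queue="default", stop=False):
    """Build a `clearml-agent daemon` command line for the given GPUs.

    `queue` is a hypothetical queue name; replace it with your own.
    With stop=True, `--stop` is appended to the same command line.
    """
    if not gpus:
        raise ValueError("no GPUs left for the agent")
    cmd = [
        "clearml-agent", "daemon",
        "--queue", queue,
        "--gpus", ",".join(str(g) for g in gpus),
        "--detached",
    ]
    if stop:
        cmd.append("--stop")
    return cmd


def restart_daemon(current_gpus, all_gpus, reserved, queue="default"):
    """Stop the daemon started with `current_gpus`, then start one without the reserved GPUs."""
    subprocess.run(daemon_command(current_gpus, queue, stop=True), check=False)
    subprocess.run(daemon_command(gpus_for_agent(all_gpus, reserved), queue), check=True)


if __name__ == "__main__":
    # Example: a 4-GPU server where a developer wants GPU 1 for debugging.
    print(" ".join(daemon_command(gpus_for_agent([0, 1, 2, 3], [1]))))
```

Keeping the command construction separate from the `subprocess` calls makes it easy to print and inspect the command line before letting the tool actually restart anything.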

  
  
Posted 8 months ago

I'm sorry, I'm quite new.
Do you mean if the daemon is running inside a docker container? Or if the task itself is in a container?
The way I understood it, when I configure the task I set the base docker image and let it run with that.

  
  
Posted 8 months ago

If you run an agent in docker mode ( --docker ), the agent will run a docker run command and the task will be executed inside a container. In that scenario, I think that if you kill the daemon, the container will stay up and finish the job (haven't tested).
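Since this behaviour is untested, one way to check it is to list the running containers before and after stopping the daemon and see whether the task's container is still up. The sketch below is a minimal, assumed approach: it only wraps `docker ps --format` with a tab-separated Go template, and the helper names are hypothetical.

```python
import subprocess

# Tab-separated fields so image names and statuses containing spaces parse cleanly.
DOCKER_PS = ["docker", "ps", "--format", "{{.ID}}\t{{.Image}}\t{{.Status}}"]


def parse_docker_ps(output):
    """Turn the tab-separated `docker ps` output into a list of dicts."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        container_id, image, status = line.split("\t", 2)
        rows.append({"id": container_id, "image": image, "status": status})
    return rows


def running_containers():
    """Return the containers currently running on this host."""
    result = subprocess.run(DOCKER_PS, capture_output=True, text=True, check=True)
    return parse_docker_ps(result.stdout)


if __name__ == "__main__":
    for row in running_containers():
        print(row["id"], row["image"], row["status"])
```

Run it once while the task is executing, stop the daemon, then run it again; if the same container ID is still listed, the task outlived the daemon.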

  
  
Posted 8 months ago

Unless you're running in docker mode; in that case I think the task will continue running inside the container. Might need to check it.

  
  
Posted 8 months ago

Thanks for the reply!
I'll look into clearml-session,
but let's say for now I stick with spinning the daemons up and down: if I take a daemon down while a task is already running, will it stop the task, or will the task continue to run?

  
  
Posted 8 months ago

It will stop running

  
  
Posted 8 months ago