Answered
Quick Question On

Quick question on trains-agent and HPO. Say I have 10 experiments enqueued to a trains-agent. I understand the agent runs the experiments one by one. But can that be parallelized? Or do I need to have multiple trains-agents, each with their own queue, which will be running in parallel one experiment at a time?

  
  
Posted 4 years ago

Answers 8


Great, thanks!

  
  
Posted 4 years ago

(Do notice that even though you can spin two agents on the same GPU, the nvidia drivers cannot share allocated GPU memory, so if one Task consumes too much memory the other will not have enough free GPU memory to run)

Basically the same restriction as manually launching two processes using the same GPU

That makes sense. Currently, I use python multiprocessing to launch multiple experiments on the same GPU device. I'm guessing using trains-agent will be similar.

  
  
Posted 4 years ago

Correct 🙂
You can spin it in two modes, either venv or docker (notice that even in docker mode it will still clone the code into the docker and install the packages inside it, but it also inherits the preinstalled system packages from the docker, so the installation process is a lot faster, and you still have the ability to change packages without having to build an entire new docker image)
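Roughly, the two modes would be launched something like this (just a sketch, assuming the standard trains-agent daemon CLI; the queue name and base image below are placeholders):

    # venv mode: the agent builds/caches a virtual environment for each Task
    trains-agent daemon --queue default --gpus 0

    # docker mode: the Task runs inside the given base image; the code is cloned
    # and missing packages are installed on top of the image's system packages
    trains-agent daemon --queue default --gpus 0 --docker nvidia/cuda:10.1-runtime-ubuntu18.04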

  
  
Posted 4 years ago

Hi SarcasticSparrow10
You will need to have multiple trains-agents, but they will be sharing the same queue (i.e. pulling jobs from the same queue the HPO process is pushing to)
Makes sense?
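As a rough sketch (assuming the standard trains-agent daemon CLI; the queue name and GPU numbers are placeholders), each agent is just a separate daemon pointed at the same queue:

    # one agent per GPU, all pulling from the queue the HPO process pushes to
    trains-agent daemon --queue hpo_queue --gpus 0
    trains-agent daemon --queue hpo_queue --gpus 1

Each daemon pulls the next Task as soon as it finishes its current one, so the queue is drained in parallel.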

  
  
Posted 4 years ago

Got it. I haven't tried setting up trains-agent yet, so I don't know much about the overhead of launching the agent. I'd imagine if it has to create the full environment (installing requirements, etc.), the overhead might not be that low. But from what I'm reading, it looks like I can use a docker image with the full env. Is my understanding correct?
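If that turns out to be the case, one possible way to point the agent at a prebuilt image is a default docker setting in trains.conf. This is only an assumption about the config layout, and the image name is a placeholder:

    # trains.conf (agent section) -- assumed key names, placeholder image
    agent {
        default_docker {
            image: "my-registry/my-full-env:latest"
        }
    }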

  
  
Posted 4 years ago

Basically it solves the remote-execution problem, so you can scale to multiple machines relatively easily :)

  
  
Posted 4 years ago

You will need to have multiple trains-agents, but they will be sharing the same queue (i.e. pulling jobs from the same queue the HPO process is pushing to)
Makes sense?

Hmm. So say I have a parameter NUM_PARALLEL_EXECUTIONS, I can programmatically launch that many trains-agents for every optimization run?!
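Something along these lines could work as a rough sketch. This is not an official trains API; it just shells out to the trains-agent daemon CLI, and the queue name, GPU mapping and NUM_PARALLEL_EXECUTIONS value are illustrative:

    import subprocess

    NUM_PARALLEL_EXECUTIONS = 4   # illustrative value
    QUEUE_NAME = "hpo_queue"      # placeholder queue name

    # spawn one agent daemon per slot, one GPU each, all sharing the same queue
    agents = [
        subprocess.Popen(
            ["trains-agent", "daemon", "--queue", QUEUE_NAME, "--gpus", str(gpu)]
        )
        for gpu in range(NUM_PARALLEL_EXECUTIONS)
    ]

    # ... enqueue the HPO experiments to QUEUE_NAME, then wait for the agents
    for agent in agents:
        agent.wait()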

  
  
Posted 4 years ago

(Do notice that even though you can spin two agents on the same GPU, the nvidia drivers cannot share allocated GPU memory, so if one Task consumes too much memory the other will not have enough free GPU memory to run)

Basically the same restriction as manually launching two processes using the same GPU
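For illustration (again assuming the standard daemon CLI; the queue name is a placeholder), two agents sharing one GPU would simply be two daemons bound to the same device:

    # both daemons allocate from the same pool of GPU 0 memory
    trains-agent daemon --queue default --gpus 0
    trains-agent daemon --queue default --gpus 0

Exactly as if you had launched the two training processes yourself.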

  
  
Posted 4 years ago