Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Unanswered
Hi Everyone, Looking For Ml Management Tools I Stumbled Upon Trains, I Must Say It Has Been Awesome So Far. I Just Have A (Probably Stupid) Question: I'M Trying To Setup A Multi-Node Training Environment And I Thought I Could Solve This With Agents, But A


SmilingFrog76

there is no internal scheduler in Trains

So obviously there is a scheduler built into Trains, this is the queues (order / priority)
What is missing from it is multi node connection, e.g. I need two agents running the exact same job working together.
(as opposed to, I have two jobs, execute them separately when a resource is available)

Actually my suggestion was to add a SLURM integration, like we did with k8s (I'm not suggesting Kubernetes as a solution for you, the opposite, k8s does not have the kind of scheduler you are looking for, only SLURM has it)

There is also a middle-ground between 2 & 3.
Let's assume we clone an experiment in the UI, and configured it, this Task has a unique ID (press on the ID button next to the name to see it).
We could schedule a slurm job that basically runs the following command on two nodes:

trains-agent execute --full-monitoring --id <my_task_id_here>This way we get the benefit of having the ability to change arguments, and have trains-agent setup the environement / code for us, and use SLURM to schedule the actual job.

What do you think?

BTW:(Option (3) is basically automating this exact procedure.

  
  
Posted 4 years ago
148 Views
0 Answers
4 years ago
2 years ago