Answered
Hi, In My Setup I Run Multiple Experiments In Parallel From The Same Script. I Understand That There Can Only Be One Execution

Hi,
In my setup I run multiple experiments in parallel from the same script. I understand that there can only be one execution Task in a script. I would like trains to log each of those experiments separately. How can I do that when I can only initialize Task just once?
Thanks,

  
  
Posted 3 years ago

Answers 15


For HPO (hyper-param opt), are all experiments which are part of the optimization process logged? I understand the HPO process takes a base experiment and runs subsequent experiments with the new HPs. Are these experiments logged too (with the train-valid curves, etc)?

  
  
Posted 3 years ago

Mostly they are a set of user defined hyper-parameters. I've been reading about hyper-param optimization since posting this. It seems like I would have to use hyper-param opt to achieve that.

  
  
Posted 3 years ago

Hi SourSwallow36
What do you mean by "log each experiment separately"? How would you differentiate between them?

  
  
Posted 3 years ago

Yes, every run is logged as a new experiment (with its own set of HPs). Note that the execution itself is done by the "trains-agent": the HP process creates experiments with new sets of HPs and puts them into the execution queue, then trains-agent pulls them from the queue and starts executing them. You can run multiple trains-agent instances on as many machines as you like, with specific GPUs etc. Each one pulls a single experiment and executes it; once it is done, it pulls the next one, and so on.

Oh ok! So if I have the base experiment say 'mnist1' and I run HPO which executes 10 experiments. Now will these 10 experiments be of different names? How will I know these are part of the 'mnist1' HPO case?

  
  
Posted 3 years ago

how are you thinking of running those HP tests?

I'm not sure I completely understand the question. Here is what I do at present. This may be achieved more efficiently in trains (that's why I'm trying to move to trains).

Example:
I have a set of 10 user defined HPs. I have a scheduler that runs them independently in parallel. Once the training is complete, I run inference on the test set for these experiments. The data for both training and inference is logged under the respective experiment (which are 10 in this case).

So I'm trying to emulate this process in trains.

  
  
Posted 3 years ago

SourSwallow36 okay, let's assume we have the base experiment (the original one before the HP process).
What we do is clone that experiment (either in the UI, with code, or with code automation, a.k.a. an HP optimizer). Then each clone of the original gets a new set of HPs, and we enqueue the 10 experiments into the execution queue. In parallel, we run trains-agent on a machine and connect it to the queue. It will pull the experiments one after the other, run them, and log their results. We will end up with 10 "completed" experiments.
Make sense?

  
  
Posted 3 years ago

Now will these 10 experiments be of different names? How will I know these are part of the 'mnist1' HPO case?

Yes (they will have the specific HP name/value combination).
FYI names are not unique so in theory you could have multiple experiments with the same name.

If you look under the Configuration tab, you will find all the configuration arguments for the experiment. You can also add specific arguments as columns in the experiment table (click the cogwheel at the top right corner and select +hyper-parameters).

  
  
Posted 3 years ago

Great, yes that makes sense.

  
  
Posted 3 years ago

Ok, cool. Thanks. This clears up things. I need to read more about the trains agent then. I have another question, I'll post it as a separate thread.

  
  
Posted 3 years ago

This is an example of how one can clone an experiment and change it from code:
https://github.com/allegroai/trains/blob/master/examples/automation/task_piping_example.py
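In short, the clone / change / enqueue flow could look roughly like the sketch below. This is only an illustration, not a copy of the linked example: the project name, the queue name "default", the helper names (variant_name, clone_and_enqueue) and the parameter keys are all assumptions, and the exact parameter key format depends on your trains version and on how the base experiment stored its hyper-parameters. Passing parent=base.id is one way to keep track of which base experiment a clone came from.

```python
def variant_name(base_name, overrides):
    """Build a readable name for a clone, listing the overridden HPs (hypothetical helper)."""
    parts = ", ".join("{}={}".format(key, overrides[key]) for key in sorted(overrides))
    return "{} [{}]".format(base_name, parts)


def clone_and_enqueue(project, base_name, overrides_list, queue_name="default"):
    """Clone the base experiment once per HP set and push every clone to a queue."""
    # Imported here so the naming helper above works without trains installed.
    from trains import Task

    # Look up the already-executed base experiment (names are assumptions).
    base = Task.get_task(project_name=project, task_name=base_name)

    cloned_ids = []
    for overrides in overrides_list:
        # The clone is created in draft state, so its parameters can still be edited.
        clone = Task.clone(
            source_task=base,
            name=variant_name(base_name, overrides),
            parent=base.id,
        )
        params = clone.get_parameters()
        params.update(overrides)  # keys must match the names stored on the base task
        clone.set_parameters(params)

        # A trains-agent listening on this queue will pick the clone up and run it.
        Task.enqueue(clone.id, queue_name=queue_name)
        cloned_ids.append(clone.id)
    return cloned_ids
```

For example, clone_and_enqueue("examples", "mnist1", [{"lr": 0.01}, {"lr": 0.001}]) would create two clones of 'mnist1' and queue them, assuming a project called "examples" and a parameter named "lr" exist on your server.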

A full HPO optimization process (basically the same idea only with optimization algorithms deciding on the next set of parameters) is also available:
https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py

  
  
Posted 3 years ago

Obviously if you click on them you will be able to compare based on specific metric / parameters (either as table or in parallel coordinates)

  
  
Posted 3 years ago

Well, that depends on how you think about the automation. If you are running your experiments manually (i.e. you specifically call/execute them), then at the beginning of each experiment (or function) call Task.init, and when you are done call Task.close. This can be done in parallel if you are running them from separate processes.
If you want to automate the process, you can start using the trains-agent which could help you spin those experiments on as many machines as you like 🙂
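A minimal sketch of the manual route, assuming each experiment runs in its own process so that every process gets its own Task. The project name, the HP names (lr, batch_size), the helper names and the fake "loss" are placeholders for illustration only; running it for real needs trains installed and configured against a server.

```python
from itertools import product
from multiprocessing import Process


def make_grid(learning_rates, batch_sizes):
    """Expand user-defined HP values into one dict per experiment (hypothetical helper)."""
    return [{"lr": lr, "batch_size": bs} for lr, bs in product(learning_rates, batch_sizes)]


def run_experiment(hparams, project="parallel-demo"):
    """Run one experiment in the current process, logged as its own Task."""
    # Imported inside the worker so each process creates its own Task.
    from trains import Task

    name = "run lr={lr} bs={batch_size}".format(**hparams)
    task = Task.init(project_name=project, task_name=name)
    task.connect(hparams)  # record the HPs on the experiment

    logger = task.get_logger()
    for step in range(10):
        loss = 1.0 / (1.0 + step * hparams["lr"])  # placeholder instead of real training
        logger.report_scalar(title="loss", series="train", value=loss, iteration=step)

    task.close()  # mark this experiment as done


def launch_all(hparam_sets):
    """Start one process per HP set, wait for all of them, and return their exit codes."""
    procs = [Process(target=run_experiment, args=(hp,)) for hp in hparam_sets]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return [proc.exitcode for proc in procs]
```

To try it, call launch_all(make_grid([0.1, 0.01], [32, 64])) from under an if __name__ == "__main__": guard (needed on platforms where multiprocessing spawns new interpreters). Each of the four processes should then show up as a separate experiment.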

  
  
Posted 3 years ago

Are these experiments logged too (with the train-valid curves, etc)?

Yes, every run is logged as a new experiment (with its own set of HPs). Note that the execution itself is done by the "trains-agent": the HP process creates experiments with new sets of HPs and puts them into the execution queue, then trains-agent pulls them from the queue and starts executing them. You can run multiple trains-agent instances on as many machines as you like, with specific GPUs etc. Each one pulls a single experiment and executes it; once it is done, it pulls the next one, and so on.

SourSwallow36 how are you thinking of running those HP tests?

  
  
Posted 3 years ago
615 Views