
Hi!
In my project I need to run a lot of experiments on different subsets of my trainset, collect scores and perform some calculations based on them. I have main.py, from which I call train_on_subset(subset) many times. I want to 1) gather statistics for every call of train_on_subset, and 2) use the trains agent to queue those calls.

I came across a number of difficulties trying to connect my code to the trains framework. First, I perform many experiments in one process, so I can't create a new task in every call of train_on_subset using Task.init, but I still need to track the progress of each experiment separately. Second, I cannot move train_on_subset to a separate .py file and run it as a console script, because I need to push a lot of parameters into it, including the model, and I also need to get my score back to process it later in main.py.

Please let me know what the best practice / architecture is for connecting your framework to my project. Any advice is highly appreciated. Thanks in advance!

  
  
Posted 3 years ago

Answers 5


Hi UpsetCrocodile10

First, I perform many experiments in one process, ...

How about this one:
https://github.com/allegroai/trains/issues/230#issuecomment-723503146
Basically you could utilize create_function_task
This means you have Task.init() on the main "controller", and each train_on_subset call becomes a "function task". Then the controller can wait on them and collect the data (like the HPO does).

Basically:
```
from time import sleep
from trains import Task

controller_task = Task.init(...)
children = []
for i, s in enumerate(my_subset):
    # create a draft Task that will run my_train_func on this subset
    child = controller_task.create_function_task(
        my_train_func, arguments=s, func_name='subset_{}'.format(i))
    children.append(child)

for child in children:
    # refresh the child's state and read back its reported metrics
    child.reload()
    print(child.get_last_scalar_metrics())
    sleep(5.0)
```

What do you think?
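For reference, a minimal sketch of what my_train_func could look like so that get_last_scalar_metrics() has something to return. The scalar reporting uses the Trains Logger; the idea that train_on_subset returns a numeric score and is importable from main.py is an assumption here:

```
from trains import Logger
from main import train_on_subset  # the existing training routine (assumed importable)

def my_train_func(arguments):
    # train on the given subset and get its score back
    score = train_on_subset(arguments)
    # report the score as a scalar so the controller can read it later
    # via child.get_last_scalar_metrics()
    Logger.current_logger().report_scalar(
        title='score', series='subset', value=score, iteration=0)
    return score
```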

  
  
Posted 3 years ago

UpsetCrocodile10

Does this method expect my_train_func to be in the same file as Task.init() ?

As long as you import it and you can pass it, it should work.
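For example, the function can live in its own module and be imported where the controller is created (the module and project names below are just placeholders):

```
from trains import Task
from subset_training import my_train_func  # hypothetical module holding the function

controller_task = Task.init(project_name='subset experiments', task_name='controller')
child = controller_task.create_function_task(
    my_train_func, arguments=[0, 1, 2], func_name='subset_0')
```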

Child exp gets aborted immediately ...

It seems it cannot find the file main.py. It assumes all the code is part of a single repository, is that the case? What do you have under the "Execution" tab for the experiment?

  
  
Posted 3 years ago

Hi UpsetCrocodile10

execute them and return scalars.

This should be a good start (I hope 🙂 )
```
for child in children:
    # put the Task into an execution queue
    Task.enqueue(child, queue_name='my_queue_here')
    # wait for the task to finish
    child.wait_for_status(status=['completed'])
    # reload all the metrics
    child.reload()
    # get the metrics
    print(child.get_last_scalar_metrics())
```
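Note that for the enqueued child tasks to actually execute, a trains-agent has to be listening on that queue, for example one started with `trains-agent daemon --queue my_queue_here` (the queue name is just a placeholder).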

  
  
Posted 3 years ago

AgitatedDove14
Thanks, this works great!
Does this method expect my_train_func to be in the same file as Task.init() ? The child exp gets aborted immediately after starting, with some strange exception in my case.

  
  
Posted 3 years ago

Hi AgitatedDove14
This is exactly what I needed, thank you a lot!

One problem I have with this function is that it creates drafts, but I need it to execute them and return scalars. Is this possible?

Thanks again.

  
  
Posted 3 years ago