AgitatedDove14
Thanks, this works great!
Does this method expect my_train_func to be in the same file as Task.init() ? The child experiment gets aborted immediately after starting, with some strange exception in my case.
Hi AgitatedDove14
This is exactly what I needed, thanks a lot!
One problem I have with this function is that it creates drafts, but I need it to execute them and return scalars. Is this possible?
thanks again
Hi UpsetCrocodile10
First, I perform many experiments in one process, ...
How about this one:
https://github.com/allegroai/trains/issues/230#issuecomment-723503146
Basically you could utilize create_function_task
This means you have Task.init() in the main "controller", and each "train_in_subset" as a "function_task". Then the controller can wait on them and collect the data (like the HPO does).
Basically:
` from time import sleep
from trains import Task

controller_task = Task.init(...)
children = []
for i, s in enumerate(my_subset):
    # create a draft Task that will execute my_train_func on this subset
    child = controller_task.create_function_task(
        my_train_func, arguments=s, func_name='subset_{}'.format(i))
    children.append(child)
for child in children:
    # refresh the Task object and read back the reported scalars
    child.reload()
    print(child.get_last_scalar_metrics())
    sleep(5.0) `
What do you think?
UpsetCrocodile10
Does this method expect my_train_func to be in the same file as ...
As long as you import it and you can pass it, it should work.
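For example (just a sketch; "train_utils" here is a made-up module name standing in for wherever my_train_func is actually defined):
` # sketch: my_train_func lives in a separate module, imported into the controller script
from trains import Task
from train_utils import my_train_func  # "train_utils" is a hypothetical module name

controller_task = Task.init(project_name='examples', task_name='controller')
child = controller_task.create_function_task(my_train_func, func_name='subset_0') `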
Child experiment gets aborted immediately ...
It seems it cannot find the file "main.py"; it assumes all the code is part of a single repository. Is that the case? What do you have under the "Execution" tab of the experiment?
Hi UpsetCrocodile10
execute them and return scalars.
This should be a good start (I hope 🙂):
` for child in children:
    # put the Task into an execution queue
    Task.enqueue(child, queue_name='my_queue_here')
    # wait for the Task to finish
    child.wait_for_status(status=['completed'])
    # reload all the metrics
    child.reload()
    # get the metrics
    print(child.get_last_scalar_metrics()) `
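Two small notes: Task.enqueue() only puts the draft in the queue, so a trains-agent has to be listening on 'my_queue_here' for it to actually run. And get_last_scalar_metrics() returns a nested dict keyed by title and series, so pulling a single value out looks roughly like this ('loss' / 'total' are placeholder names for whatever your function actually reports):
` metrics = child.get_last_scalar_metrics()
# structure: {title: {series: {'last': ..., 'min': ..., 'max': ...}}}
last_loss = metrics['loss']['total']['last'] `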