Now, after the first iteration completes, my script runs again automatically 5 minutes later and logs into the trains server again.
It should log into the same task and the same project,
but what is happening is that it creates a new task under the same project with the same task name.
You can do: task = Task.get_task(task_id='uuid_of_experiment')
task.get_logger().report_scalar(...)
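A minimal sketch of the two calls above wrapped in one helper, in case it is useful. The helper name `report_to_existing_task`, its arguments, and the optional `task_cls` parameter are hypothetical and only there for illustration; `Task.get_task(task_id=...)` and `report_scalar(title, series, value, iteration)` are the calls it relies on.

```python
def report_to_existing_task(task_id, title, series, value, iteration, task_cls=None):
    """Report one scalar to an already existing Task instead of creating a new one.

    task_cls is a hypothetical hook for swapping in a stand-in class;
    by default the real trains Task class is used.
    """
    if task_cls is None:
        # Imported lazily so this module loads even where trains is not installed
        from trains import Task as task_cls

    # Look up the existing experiment by its id rather than creating a new Task
    task = task_cls.get_task(task_id=task_id)

    # Report the scalar to that experiment
    task.get_logger().report_scalar(
        title=title, series=series, value=value, iteration=iteration
    )
    return task
```

Each scheduled run could then call something like `report_to_existing_task('uuid_of_experiment', 'loss', 'train', 0.5, 3)`, assuming the id of the first run's Task was stored somewhere the later runs can read it.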
Now the only question is who will create the initial Task, so that the others can report to it. Do you have something like a "master" process?
I will share my script so you can see what I am doing.
Just call the Task.init before you create the subprocess, that's it 🙂 they will all automatically log to the same Task. You can also call the Task.init again from within the subprocess task, it will not create a new experiment but use the main process experiment.
See, on line 212 I am calling a function "combined" with some arguments,
and that function creates a Task and logs to it.
So it will create a task when I run it the first time.
Then if there are 100 experiments, how will it create 100 tasks?
It will not create another 100 tasks, they will all use the main Task. Think of it as they "inherit" it from the main process. If the main process never created a task (i.e. no call to Task.init) then they will create their own tasks (i.e. each one will create its own task and you will end up with 100 tasks)
So, if I call Task.init() before that line, there is no need to call Task.init() on line 92.
I have to create a main task, for example named main,
and then if there are 10 experiments I have to call Task.create() for those 10 experiments?
No. Since you are using Pool, there is no need to call Task.init again. Just call it once before you create the Pool, then when you want to use it, just do task = Task.current_task()
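A rough sketch of that pattern, not a drop-in replacement for the script: `Task.init` is called once before the Pool is created, and each worker picks up the main Task with `Task.current_task()`. The helper `_get_task_cls`, the `task_cls` parameter, the `rows` argument, the task name "main", and the placeholder metric are all assumptions made for illustration.

```python
from multiprocessing import Pool


def _get_task_cls():
    # Imported lazily so this module loads even where trains is not installed
    from trains import Task
    return Task


def combined(path, exp_name, project_name, task_cls=None):
    """Worker: report to the Task created by the main process."""
    task_cls = task_cls or _get_task_cls()

    # No Task.init / Task.create here: reuse the Task from the main process
    task = task_cls.current_task()
    logger = task.get_logger()

    # Placeholder metric (assumption): the real script would compute its own value
    value = float(len(path))

    # One series per experiment, all inside the same main Task
    logger.report_scalar(title=project_name, series=exp_name, value=value, iteration=0)
    return exp_name


def main(rows, processes=4):
    """rows: iterable of dicts with 'Path', 'exp_name' and 'project_name' keys."""
    Task = _get_task_cls()

    # Create the main Task once, before the Pool exists
    Task.init(project_name="test", task_name="main")

    with Pool(processes) as pool:
        pending = [
            pool.apply_async(
                combined, args=(row["Path"], row["exp_name"], row["project_name"])
            )
            for row in rows
        ]
        # Wait for every worker to finish before leaving the pool
        return [job.get() for job in pending]
```

With a DataFrame like the one in the script, `main(temp_df.to_dict("records"))` would be one way to pass the rows in.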
def combined(path, exp_name, project_name):
    temp = Task.create(task_name="exp_name")
    logger = temp.current_logger()
    logger.report_scalar()

def main():
    task = Task.init(project_name="test")
    [pool.apply_async(combined, args=(row['Path'], row['exp_name'], row['project_name'])) for index, row in temp_df.iterrows()]

scheduler = BlockingScheduler()
scheduler.add_job(main, 'interval', seconds=60, max_instances=3)
scheduler.start()
My scheduler runs every 60 seconds and calls the main function.
main initializes the parent task, and then my multiprocessing starts, which calls the combined function with project_name and exp_name as parameters.
Then my combined function creates a sub-task using Task.create(task_name=exp_name)
Just so I understand,
scheduler executes main every 60sec
main spins X sub-processes
Each subprocess needs to report scalars?
Each subprocess logs one experiment as a task