Hi TimelyPenguin76 and SuccessfulKoala55,
My tasks are created by first spawning many sub-processes. Each sub-process initializes a task, connects the task to some parameters, clones the task, enqueues the cloned task, and is then killed. When I do this with just a single sub-process, everything seems to work fine. When there are many sub-processes, I occasionally get the error message.
Yes, I use a locally hosted server (SAIPS team).
Thanks for your help and quick replies.
To create each subprocess, I use the following:
import os
import subprocess
from copy import copy

new_env = copy(os.environ)
new_env.pop('TRAINS_PROC_MASTER_ID', None)
new_env.pop('TRAINS_TASK_ID', None)
new_env.pop('CLEARML_PROC_MASTER_ID', None)
new_env.pop('CLEARML_TASK_ID', None)
subprocess.Popen(cmd, env=new_env, shell=True)
Here cmd is something like "python file.py <parameters>".
Perhaps this somehow disrupts ClearML's operation in the sub-processes?
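For reference, the launch logic above can be sketched as a small self-contained helper. The function names `clean_env` and `launch` and the constant `TASK_ENV_VARS` are hypothetical, introduced only for illustration; the four variable names are the ones from my snippet. This is a minimal sketch of what I do, not a claim about how ClearML expects sub-processes to be started.

```python
import os
import subprocess

# Variables removed so the child process does not inherit the parent's
# task identity (same four names as in the snippet above).
TASK_ENV_VARS = (
    "TRAINS_PROC_MASTER_ID",
    "TRAINS_TASK_ID",
    "CLEARML_PROC_MASTER_ID",
    "CLEARML_TASK_ID",
)


def clean_env(env=None):
    """Return a copy of env (default: os.environ) without the task variables."""
    source = os.environ if env is None else env
    return {key: value for key, value in source.items() if key not in TASK_ENV_VARS}


def launch(cmd, env=None):
    """Start the shell command string cmd with the cleaned environment."""
    return subprocess.Popen(cmd, env=clean_env(env), shell=True)
```

Building a new dictionary instead of popping keys leaves the parent's environment untouched, so repeated launches all start from the same state.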
TimelyPenguin76 SuccessfulKoala55
Do you have any idea what may cause this?
Is it possible that different tasks created together somehow have the same identifier?
Or am I missing something obvious?
I believe there is a single agent, single queue, for all tasks.
Hi TimelyPenguin76,
Making such a toy example will take a lot of effort.
For now I intend to debug it or circumvent the error with various tricks.
If it is possible to explain the cause of the error message above, or some details regarding it, I would very much appreciate it.
TimelyPenguin76 Thanks for the reply.
I believe the way I start tasks is completely independent of this problem. Even assuming my approach is legitimate in principle, that does not explain why I get the following error message. Note that the error only happens when I start multiple tasks. What is the cause of this error?
clearml_agent: ERROR: Instance with the same WORKER_ID [algo-lambda:gpu0] is already running
Hi AgitatedDove14,
Continuing from the previous question: Is it possible to detect remote Task execution before the remote Task.init(...) function call?
For example, when I run this:
` print("Doing some computations that MUST be local") # I want to prevent this from running remotely
task = Task.init("OMD", task_name="bla")
task.set_base_docker("/home/rdekel/anaconda3/envs/P1")
cloned_task = Task.clone(source_task=task, name="Clone")
Task.enqueue(cloned_task.id, queue_name="ron_lambda_cp...
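One workaround I am considering is to check the environment before calling Task.init(...). This is only a sketch under an assumption I have not confirmed: that the agent sets CLEARML_TASK_ID (or the older TRAINS_TASK_ID) in the process it launches. The names `launched_by_agent` and `local_only_setup` are hypothetical. If the installed clearml version provides `Task.running_locally()`, that may be the cleaner check.

```python
import os

# Assumption (not confirmed): the agent exports one of these variables
# for the task it executes, so their presence suggests a remote run.
REMOTE_TASK_ENV_VARS = ("CLEARML_TASK_ID", "TRAINS_TASK_ID")


def launched_by_agent(env=None):
    """Return True if env (default: os.environ) carries a non-empty task id."""
    source = os.environ if env is None else env
    return any(source.get(name) for name in REMOTE_TASK_ENV_VARS)


def local_only_setup():
    """Placeholder for the work that must never run remotely."""
    print("Doing some computations that MUST be local")


if __name__ == "__main__":
    # Guard the local-only step before any Task.init(...) call.
    if not launched_by_agent():
        local_only_setup()
```

A caveat: a local sub-process that inherits these variables from a parent task would also look remote, which is why I remove them from the environment when spawning sub-processes.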
SuccessfulKoala55 re-attached the log.