From what we have seen, this is what happens when we submit a pipeline to ClearML:

ClearML processes the pipeline script locally and submits it to the queue. Locally, it creates a controller task (the task that orchestrates the pipeline, I assume) and records the arguments of the script that needs to execute as part of this task, i.e. the pipeline script.
Once it is submitted to the queue, a new worker pod is spun up that continues this controller task (the one created in ClearML when the script was processed locally on my machine) and processes the entire script again with the same captured arguments.
Is there a particular reason for it to process the script twice, once locally on my machine and again remotely in the worker pod? When it processes the script locally, it can already identify the pipeline and its steps, so why does it need to do so remotely? Is this purely for orchestration reasons?
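For reference, our pipeline script is along these lines (a simplified sketch; the step, project, and queue names are illustrative, not our actual setup):

```python
from clearml import PipelineDecorator


@PipelineDecorator.component(return_values=['data'], cache=True)
def load_data(source_url):
    # Runs as its own ClearML task on a worker when executed remotely
    print(f'loading {source_url}')
    return [1, 2, 3]


@PipelineDecorator.component(return_values=['model'])
def train(data):
    print(f'training on {len(data)} samples')
    return 'model-artifact'


# The controller task is enqueued (by default on the services queue)
# and re-executed by a worker, which is the double processing we see
@PipelineDecorator.pipeline(
    name='example pipeline',
    project='examples',
    version='1.0',
    pipeline_execution_queue='services',
)
def pipeline_logic(source_url='https://example.com/data.csv'):
    data = load_data(source_url)
    return train(data)


if __name__ == '__main__':
    # Queue on which the individual steps will run
    PipelineDecorator.set_default_execution_queue('default')
    pipeline_logic()
```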
Hi EmbarrassedBlackbird33, I'm not sure I understand. You don't need to run the script twice yourself. You simply create the pipeline via code once, and then you can run it as many times as you want remotely.
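For example, something along these lines (a minimal sketch; the project, task, and queue names are placeholders you would replace with your own):

```python
from clearml import PipelineController

# Define the pipeline once, in code
pipe = PipelineController(
    name='example pipeline',
    project='examples',
    version='1.0.0',
)

# Each step clones an existing ClearML task and runs it on a worker
pipe.add_step(
    name='stage_data',
    base_task_project='examples',
    base_task_name='data loading task',
)
pipe.add_step(
    name='train',
    parents=['stage_data'],
    base_task_project='examples',
    base_task_name='training task',
)

# Queue on which the pipeline steps will execute
pipe.set_default_execution_queue('default')

# Launch the controller on the services queue; after this first run,
# the pipeline can be re-launched without running the script again
pipe.start(queue='services')
```

After that first run, you can clone the pipeline run from the ClearML UI (or trigger it programmatically) and it executes entirely on your workers.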