Hi @<1523701205467926528:profile|AgitatedDove14> , thanks for getting back to me!
What do you mean by "the pipeline page doesn't show anything at all"? Are you running the pipeline? How?
This (see attached screenshot below) is the pipeline page for "big_pipe" specified in the snippet above. I think I understand the issue though: without PipelineDecorator.component being top level, the SDK is unable to see each of the nodes?
Basically a Pipeline is a Task (of a specific type), so you can have the pipeline A function clone/enqueue the pipeline B Task and wait until it is done. wdyt?
That's exactly what I'm trying to do but perhaps in the wrong way. In the above snippet for example, I was trying to initialise both pipelines from decorators in the same script and use the output of pipeline B within pipeline A as if it were your standard python function.
I've just had a quick go at defining pipeline B from the pipeline A task (PipelineController.add_step) and the result is exactly what we're after! For ease of use, it would be nicer if we were able to treat pipeline A as a function in pipeline B's code with return values, but I suppose it can only be used as a task with PipelineController.add_step?
the SDK is unable to see each of the nodes?
Exactly! I mean, I love the idea of "nested" components, but implementation-wise this is not trivial, and it would also hurt the ability to cache individual components. The workaround is to keep all the "business logic" in the pipeline function itself; routing data between components is basically "free". The data does not actually go through the pipeline logic, only a reference is passed (unless the pipeline logic actually tries to access the data object, in which case it will be downloaded). Makes sense?
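To illustrate the "only a reference is passed" point, here is a toy model in plain Python. It is not ClearML's implementation: `LazyResult`, `fetch`, `step_two` and `pipeline_logic` are hypothetical names made up for this sketch. It only shows the general idea that forwarding a result from one step to the next costs nothing until something actually reads the value.

```python
class LazyResult:
    """Toy stand-in for a component's return value: holds a reference, fetches on first access."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that "downloads" the real object
        self._loaded = False
        self._value = None
        self.downloads = 0       # how many times the data was actually fetched

    def get(self):
        # Fetch the underlying object only the first time it is requested
        if not self._loaded:
            self._value = self._fetch()
            self._loaded = True
            self.downloads += 1
        return self._value


def step_two(data_ref):
    # A downstream "component": this is where the data is really needed
    return data_ref.get() + "_processed"


def pipeline_logic():
    # Result of a first step, represented only by a reference
    ref = LazyResult(lambda: "big_dataset")
    # Routing: the pipeline logic forwards the reference without reading it
    downloads_before_routing = ref.downloads
    out = step_two(ref)
    return downloads_before_routing, ref.downloads, out
```

In this toy, nothing is fetched while the pipeline logic merely hands `ref` to the next step; the single fetch happens inside `step_two`, when the value is first read.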
That's exactly what I'm trying to do but perhaps in the wrong way. In the above snippet for example, I was trying to initialise both....
So in order to do that, pipeline B has to exist on its own, i.e. as an actual standalone pipeline.
Then you can use the pipeline's Task ID and clone/enqueue it like any other Task, which means the pipeline logic will do something like:
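from clearml import Task

# clone the standalone pipeline B Task ("pipeline_b_task_id" is a placeholder ID)
pipeline_task = Task.clone(source_task="pipeline_b_task_id")
# enqueue the clone for execution
Task.enqueue(task=pipeline_task, queue_name="services")
# wait until completed
pipeline_task.wait_for_status()
# make sure we have all the latest data
pipeline_task.reload()
# do something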
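The clone/enqueue steps above could be wrapped in a small helper that pipeline A's logic calls. This is a minimal sketch rather than an official ClearML recipe: `run_child_pipeline` and its `task_api` parameter are hypothetical names introduced here (the parameter exists only so the helper can be exercised with a stand-in class), and the task ID passed in is still a placeholder for a real one. It uses only the `Task` calls already shown above.

```python
def run_child_pipeline(template_task_id, queue_name="services", task_api=None):
    """Clone a standalone pipeline Task, enqueue the clone, and wait for it.

    `task_api` is a hypothetical testing hook; when omitted, the real
    clearml.Task class is imported and used.
    """
    if task_api is None:
        from clearml import Task as task_api  # imported lazily, only when needed

    # Clone the existing pipeline Task so the original stays untouched
    pipeline_task = task_api.clone(source_task=template_task_id)
    # Put the clone in a queue for execution
    task_api.enqueue(task=pipeline_task, queue_name=queue_name)
    # Wait until completed
    pipeline_task.wait_for_status()
    # Make sure the local object has the latest data
    pipeline_task.reload()
    return pipeline_task
```

Pipeline A's function could then call something like `child = run_child_pipeline("pipeline_b_task_id")` and work with the returned Task object once it has finished.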
Hi @<1534706830800850944:profile|ZealousCoyote89>
We'd like to have pipeline A trigger pipeline B
Basically a Pipeline is a Task (of a specific type), so you can have the pipeline A function clone/enqueue the pipeline B Task and wait until it is done. wdyt?
Excellent, that makes complete sense. I thought we'd be restricted to creating pipeline A via the PipelineController, but I guess we could still use PipelineDecorator with something along the lines of your example. Probably not much need for nested components this way! Still learning the ClearML way of doing things, but this is a massive help, thank you so much!
This snippet works as expected in terms of computing the results and using caching where specified but the pipeline page doesn't show anything at all.
Moving append_string anywhere else in the script results in the following error:
File "/home/user/miniconda3/envs/clearml/lib/python3.10/site-packages/clearml/automation/controller.py", line 3544, in wrapper
_node = cls._singleton._nodes[_node_name].copy()
KeyError: 'append_string'
Using None & clearml==1.9.1
using caching where specified but the pipeline page doesn't show anything at all.
What do you mean by "the pipeline page doesn't show anything at all"? Are you running the pipeline? How?
Notice that PipelineDecorator.component needs to be at the top level, not nested inside the pipeline logic like in the original example:
@PipelineDecorator.component(
    cache=True,
    name=f'append_string_{x}',
)