Since I am still on time, I would like to report another minor bug related to the 'add_pipeline_tags' parameter of PipelineDecorator.pipeline
. It turns out when the pipeline consists of components that in turn use other components (via 'helper_functions'), these nested components are not tagged with 'pipe: <pipeline_task_id>'. I assume this should not be like that, right?
Well, instead of plain functions or files I use components because I need some of those steps to run on one machine and some on another. And it works perfectly fine (ignoring some minor bugs like this one). So I'm actually inserting component-decorated functions into 'helper_functions' parameter
BTW I would really appreciate it if you let me know when you get it fixed 🙏
Nested pipelines do not depend on each other. You can think of it as several models being trained or doing inference at the same time, but each one delivering results for a different client. So you don't use the output from one nested pipeline to feed another one running concurrently, if that's what you mean.
Hmm I think the approach in general would be to create two pipeline tasks, then launch them from a third pipeline or trigger externally? If on the other hand it makes sense to see both pipelines on the same execution graph, then the nested components makes a lot of sense. Wdyt?
GiganticTurtle0 your timing is great, the plan is to wrap-up efforts and release early next week (I'm assuming GitHub fixes will be pushed tomorrow I'll post here once they are there)
The thing is I don't know in advance how many models there will be in the inference stage. My approach is to read from a database the configurations of the operational models through a for loop, and in that loop all the inference tasks would be enqueued (one task for each deployed model). For this I need the system to be able to run several pipelines at the same time. As you told me for now this is not possible, as pipelines are based on singletons, my alternative is to use components
To sum up, we agree that it will be nice to enable the nested components tags. I will continue playing with the capabilities of nested components and keep reporting bugs as I come across them!
Can you think of any other way to launch multiple pipelines concurrently? Since we have already seen it is only possible to run a single Pipelinecontroller in a single Python process
They share the same code (i.e. the same decorated functions), but using a different configuration.
... these nested components are not tagged with 'pipe: <pipeline_task_id>'. I assume this should not be like that, right?
Helper functions are not "component", they are actually files that will be accessible when running the component itself.
am I missing something ?
Thanks GiganticTurtle0
So the bug is "mock_step" is storing "NUMBER_2" argument value in the second instance?
Oh right, I missed the fact the helper functions are also decorated, yes it makes sense we add the tags as well.
Regarding nested pipelines, I think my main question is , are they independent or are we generating everything from the same code base?
Building the pipeline in runtime from external configuration is very cool!!
I think nested components is exactly the correct solution, and it is a great use case.