one can containerise the whole pipeline and run it pretty much anywhere.
Does that mean the entire pipeline will be running on the instance spinning up the container?
This is what I understand from here:
https://kedro.readthedocs.io/en/stable/10_deployment/06_kubeflow.html
My thinking was that I could use one command to run all steps locally while still registering all nodes/functions/inputs/outputs etc. with ClearML, such that I could later go into the interface, clone any of the individual steps, and run them again.
That is absolutely correct 🙂
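To make that concrete, a minimal sketch of the flow (the project and step names here are illustrative placeholders, not from your pipeline): a PipelineController whose steps reference already-registered Tasks, run entirely in the local process while every step still ends up as a clonable Task in the ClearML UI:

```python
# Hedged sketch: assumes steps "preprocess" and "train" were already
# registered as Tasks in an "examples" project (names are illustrative)
from clearml import PipelineController

pipe = PipelineController(name="kedro-pipeline", project="examples", version="0.0.1")
pipe.add_step(name="preprocess",
              base_task_project="examples", base_task_name="preprocess")
pipe.add_step(name="train", parents=["preprocess"],
              base_task_project="examples", base_task_name="train")

# Run the controller and every step in the current process;
# each step is still logged and can later be cloned from the UI
pipe.start_locally(run_pipeline_steps_locally=True)
```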
With another command I could also just pseudo-run the pipeline with Kedro locally to register everything in ClearML, and then run it on a ClearML agent.
Sure, this will work
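For a single step, that "register locally, execute remotely" pattern looks roughly like this (the queue name and the `train_model` stub are assumptions for illustration; use whatever queue your agent listens on):

```python
# Hedged sketch: Task.init captures the repo, requirements and parameters;
# execute_remotely() then stops the local run and enqueues the Task for an agent
from clearml import Task

def train_model():
    # placeholder for the actual step logic
    print("training...")

task = Task.init(project_name="examples", task_name="train")

# Everything after this call only runs on the clearml-agent machine;
# "default" is an assumed queue name
task.execute_remotely(queue_name="default", exit_process=True)

train_model()
```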
I thought that in both cases I would need to create a PipelineController Task at the end with the full pipeline included; then I could even just clone that one.
This is exactly how the pipeline is designed, and cloning and running the pipeline controller should work and launch the entire pipeline (usually the controller is executed on the services queue, and the pipeline Tasks are launched on a GPU or a CPU queue)
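Cloning and enqueuing the controller can also be done programmatically, roughly like this (the project and task names are placeholders):

```python
# Hedged sketch: clone an existing pipeline controller Task and
# enqueue the clone on the services queue (names are placeholders)
from clearml import Task

controller = Task.get_task(project_name="examples", task_name="kedro-pipeline")
cloned = Task.clone(source_task=controller)
Task.enqueue(task=cloned, queue_name="services")
```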
If I want to use a hook system (e.g. Kedro provides hooks for running callbacks before and after nodes/tasks)
Yes, I'm with you, I think this is the main challenge here.
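To make the idea concrete, a hedged sketch of what such a hook could look like on the Kedro side, opening a Task before each node and closing it after (this is an illustration, not an official integration, and the project name is an assumption):

```python
# Hedged sketch of a Kedro hook that pairs each node with a ClearML Task
from kedro.framework.hooks import hook_impl
from clearml import Task

class ClearMLHooks:
    @hook_impl
    def before_node_run(self, node):
        # one Task per Kedro node, so each node shows up (and is clonable)
        # in the ClearML UI; "kedro" is an assumed project name
        self.task = Task.create(project_name="kedro", task_name=node.name)

    @hook_impl
    def after_node_run(self, node):
        self.task.close()
```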
Is there a way to use Task as a decorator at the function level?
Yep, this is exactly what's coming in the next release of Pipelines (RC should be out in a week or so)
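For reference, a sketch of roughly what the function-level decorator looks like, based on the PipelineDecorator interface ClearML later released (the project/pipeline names and dummy data are illustrative):

```python
# Hedged sketch of the function-level decorator (clearml's PipelineDecorator)
from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["data"], cache=True)
def load_data(source: str):
    # executed as its own Task when the pipeline runs
    return [1, 2, 3]

@PipelineDecorator.pipeline(name="demo-pipeline", project="examples", version="0.1")
def run_pipeline(source: str = "local"):
    print(load_data(source))

if __name__ == "__main__":
    # debug everything in the local process instead of launching on agents
    PipelineDecorator.run_locally()
    run_pipeline()
```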