I saw some talk of ClearML + Kedro on Reddit. Is that a good approach?


AgitatedDove14

That's definitely very easy. I'm still not sure how Kedro scales on clusters, though. From what I saw (and I might have missed it), it seems more like a single instance with sub-processes, with no real ability to set up a different environment for the different steps in the pipeline. Is this correct?

Sub-processes are one option, but it supports much more: https://kedro.readthedocs.io/en/stable/10_deployment/01_deployment_guide.html. One can containerise the whole pipeline and run it pretty much anywhere, so I don't think the single-instance view is up to date.

This actually ties in well with the next version of pipelines we are working on. Basically, like Kubeflow, you add a decorator to a function, making that function a step in the pipeline (and a Task in ClearML); a sketch of the idea follows below.
My thinking was to somehow separate short/simple steps (i.e., plain functions) from complicated steps (e.g., training with specific requirements).
Maybe Kedro can launch the "simple steps"? What do you think?
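
For reference, a minimal sketch of what such a decorator-based pipeline could look like, modeled on the PipelineDecorator API that later shipped in ClearML. The project, queue names, and URL here are hypothetical:

```python
from clearml.automation.controller import PipelineDecorator

# A "simple step": the decorated function becomes a pipeline component
# (and a Task in ClearML) that can also be cloned and run on its own.
@PipelineDecorator.component(return_values=["data"])
def load_data(source_url):
    # imports inside the component travel with it when it runs remotely
    import pandas as pd
    return pd.read_csv(source_url)

# A "complicated step" can request its own execution queue,
# e.g. a GPU queue backed by agents with a different environment.
@PipelineDecorator.component(return_values=["score"], execution_queue="gpu")
def train(data):
    # placeholder for real training logic with specific requirements
    return float(len(data))

@PipelineDecorator.pipeline(name="kedro-style demo", project="examples", version="0.1")
def pipeline_logic(source_url):
    data = load_data(source_url)
    score = train(data)
    print(f"score: {score}")

if __name__ == "__main__":
    # run everything in the local process for debugging; without this
    # call, steps are dispatched to clearml-agent queues instead
    PipelineDecorator.run_locally()
    pipeline_logic(source_url="https://example.com/data.csv")
```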

I might be misunderstanding things. My thinking was that I could use one command to run all the steps locally while still registering all the nodes/functions/inputs/outputs, etc. with ClearML, so that I could later go into the interface, clone any of the individual steps, and run them again, completely independent of whether they are simple or hard steps. With another command I could also just pseudo-run the pipeline with Kedro locally to register everything in ClearML, and then run it on a ClearML agent. I thought that in both cases I would need to create a PipelineController Task at the end with the full pipeline included; then I could even just clone that one. The latter is not working yet, while the former (individual tasks) is already working, apart from some Python environment issues.
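
A minimal sketch of that last idea, assuming the individual node Tasks have already been registered in ClearML (the project and task names here are hypothetical):

```python
from clearml import PipelineController

# The controller is itself a Task, so once the pipeline has been
# registered it can also be cloned and re-run from the UI.
pipe = PipelineController(name="kedro pipeline", project="kedro-demo", version="0.1")
pipe.set_default_execution_queue("default")

# Reference the previously registered node Tasks by project/name;
# each add_step clones the base Task when the pipeline runs.
pipe.add_step(
    name="preprocess",
    base_task_project="kedro-demo",
    base_task_name="preprocess node",
)
pipe.add_step(
    name="train",
    parents=["preprocess"],
    base_task_project="kedro-demo",
    base_task_name="train node",
)

# start_locally() runs the controller logic in this process;
# start() would enqueue the controller on a clearml-agent queue instead.
pipe.start_locally()
```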

The other challenge I have come across is that Task.init really only works if it is run in the script file itself, right? If I want to use a hook system (Kedro, for example, provides hooks for running callbacks before and after nodes/tasks), I can create new tasks, but since Task.init() is not technically run in the script that contains the source code, the tracking is really challenging. Is there a way to use Task as a decorator at the function level?
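
One possible workaround, sketched under the assumption that Kedro's before_node_run/after_node_run hooks are used: have the hook create and close one Task per node explicitly, rather than relying on Task.init's automatic script detection. The Kedro hook signatures are trimmed here to the arguments actually used (pluggy allows declaring a subset), and the project name is hypothetical:

```python
from clearml import Task
from kedro.framework.hooks import hook_impl


class ClearMLHooks:
    """Create one ClearML Task per Kedro node via hooks."""

    def __init__(self):
        self._tasks = {}

    @hook_impl
    def before_node_run(self, node, inputs):
        # Task.init is called from the hook, not from the node's source
        # file, so automatic script capture points at the runner instead.
        task = Task.init(
            project_name="kedro-demo",  # hypothetical project name
            task_name=node.name,
            reuse_last_task_id=False,
        )
        # record the input dataset names (not the data itself) as parameters
        task.connect({"input_datasets": sorted(inputs)})
        self._tasks[node.name] = task

    @hook_impl
    def after_node_run(self, node, outputs):
        # close the node's Task so the next node can open its own
        task = self._tasks.pop(node.name, None)
        if task:
            task.close()
```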

All that said, I might be going too deep into how I want to integrate the two frameworks, in ways that are beyond the scope...

  
  