You mean to design the entire pipeline from YAML?
(this assumes your Tasks know how to process links to artifacts)
Is this what you are after?
(BTW: any reason for working with YAML files instead of coding it?)
Regarding the YAML, how would you pass data? Like in the pipeline from tasks example?
Yes, I mean this. Like, you have a YAML manifest and a command like this in the YAML: python -m clearml_pipeline/

    @PipelineDecorator.component(return_values=['data_frame'], cache=True, task_type=TaskTypes.data_processing)
    def step_one(pickle_data_url: str, extra: int = 43):
        print('step_one')
        # make sure we have scikit-learn for this step, we need it to unpickle the object
        import sklearn  # noqa
        import pickle
        import pandas as pd
        from clearml import StorageManager
        local_iris_pkl = StorageManager.get_local_copy(remote_url=pickle_data_url)
        with open(local_iris_pkl, 'rb') as f:
            iris = pickle.load(f)
        data_frame = pd.DataFrame(iris['data'], columns=iris['feature_names'])
        data_frame.columns += ['target']
        data_frame['target'] = iris['target']
        return data_frame
I don't like imports inside functions
The imports are inside the functions because each function itself becomes a stand-alone job running on a remote machine, rather than the entire pipeline code. This also means the packages to be installed on the remote machine are picked up automatically. Make sense?
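To make that concrete, here is a rough sketch using only the standard library. It is not how ClearML actually packages a component; it just imitates the idea by running a step's source text in an empty namespace, and by scanning that source for imports. The names STEP_LOCAL_IMPORT, STEP_GLOBAL_IMPORT, run_standalone and detect_imports are made up for this illustration.

```python
import ast

# A step whose imports live inside the function body: they travel with it.
STEP_LOCAL_IMPORT = '''
def step(values):
    import statistics
    return statistics.mean(values)
'''

# The same step, relying on an import that lived at the top of the pipeline
# file. That line is not part of the function, so it is missing here.
STEP_GLOBAL_IMPORT = '''
def step(values):
    return statistics.mean(values)
'''


def run_standalone(source, *args):
    """Run only the function's own source in a fresh namespace.

    A simplified stand-in for shipping a single step to a remote worker
    (assumption: the real mechanism is more involved).
    """
    namespace = {}
    exec(source, namespace)
    return namespace["step"](*args)


def detect_imports(source):
    """Return the top-level module names imported anywhere in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return sorted(found)


if __name__ == "__main__":
    print(run_standalone(STEP_LOCAL_IMPORT, [1, 2, 3]))
    print(detect_imports(STEP_LOCAL_IMPORT))
    try:
        run_standalone(STEP_GLOBAL_IMPORT, [1, 2, 3])
    except NameError as err:
        print("stand-alone run failed:", err)
```

The step with the import inside its body runs on its own and its dependency is visible from the function source alone. The other one fails with a NameError once it is separated from the file it was written in, and a scan of its source finds nothing to install.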