Answered

Hi! Can Someone Show Me An Example Of How

Hi!
Can someone show me an example of how PipelineController.create_draft works? I'm trying to create a template of a pipeline to run it later but I can't get it to work. This is my attempt:
Script for pipeline draft creation:
```python
from clearml import PipelineController

# Create and configure the pipeline controller.
pipe = PipelineController(
    name="Iris-LogisticRegression Pipeline",
    project="Mocks",
    target_project="Mocks/Components",
    version="0.0.1",
    add_pipeline_tags=True,
    abort_on_failure=False,
)

# Attach some parameters to the pipeline.
pipe.add_parameter("mode", default="Train")

# Populate the pipeline with ML steps.
pipe.add_step(
    name="stage_collect_dataset",
    base_task_project="Mocks",
    base_task_name="Pipeline step 1: Collect/Fetch the data",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
)
pipe.add_step(
    name="stage_preprocess_dataset",
    parents=[
        "stage_collect_dataset",
    ],
    base_task_project="Mocks",
    base_task_name="Pipeline step 2: Preprocess the data",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
    parameter_override={
        "General/dataset_url": "${stage_collect_dataset.artifacts.dataset.url}",
        "General/test_size": 0.25,
    },
)
pipe.add_step(
    name="stage_train_model",
    parents=[
        "stage_preprocess_dataset",
    ],
    base_task_project="Mocks",
    base_task_name="Pipeline step 3: Train the model",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
    parameter_override={
        "General/dataset_task_id": "${stage_preprocess_dataset.id}",
    },
)

pipe.create_draft()

print("Pipeline's draft successfully created!")
```
Script to clone, enqueue and run the pipeline:
```python
import datetime

from clearml import Task

EXECUTION_QUEUE = "controllers"

# Create the experiment/task and connect ClearML with the current process.
master_task = Task.init(
    project_name="Mocks",
    task_name="Pipeline runner",
    task_type=Task.TaskTypes.application,
)

# Get a reference to the pipeline task.
pipeline_task = Task.get_task(
    project_name=master_task.get_project_name(),
    task_name="Iris-LogisticRegression Pipeline",
)

# Clone the pipeline task.
# This creates a task with status Draft whose parameters can be modified.
run_date = datetime.datetime.now()
cloned_task = Task.clone(
    source_task=pipeline_task, name=f"{pipeline_task.name} - {run_date:%Y%m%d:%H%M}"
)

# Enqueue the configured pipeline task for execution.
Task.enqueue(cloned_task.id, queue_name=EXECUTION_QUEUE)

print(f"Pipeline {cloned_task.name!r} has been sent to queue {EXECUTION_QUEUE!r}")
```
The base tasks I use are the same as in this https://github.com/allegroai/clearml/tree/master/examples/pipeline .

The result is that the pipeline task starts running but then returns to the draft state, without actually executing any of the steps.

I should clarify this is not a problem with the pipeline itself, since when I run it with 'start' instead of 'create_draft', it works just fine. However, my intention is ONLY to create it to be executed later on.

I'm using clearml v1.1.6 and clearml-agent v1.1.2.

Has anyone ever encountered this problem? 😐

  				
Posted 2 years ago by GiganticTurtle0

Answers 14

Well, PipelineDecorator actually allows you to do the same thing, with the same ability to clone / modify / enqueue.
(I mean, a pipeline built from tasks is also great, I just want to clarify that they have the same capabilities in this respect.)

  				
Posted 2 years ago by AgitatedDove14

"I was wondering what is the use of PipelineController.create_draft if you can't use it to clone and run tasks, as we have seen"

I think the initial thought was to allow creating a pipeline from a pipeline programmatically. Then once you have the "pipeline" you can manually enqueue it and modify it. Think of a pipeline constructing other pipelines in flight based on some logic, then launching them in parallel.
Make sense?

  				
Posted 2 years ago by AgitatedDove14

Hi AgitatedDove14 , so isn't it ClearML best practice to create a draft pipeline to have the task on the server so that it can be cloned, modified and executed at any time?

  				
Posted 2 years ago by GiganticTurtle0

I see the point. The reason I'm using PipelineController now is that I've realised that in the code I only send IDs from one step of the pipeline to another, and not artefacts as such. So I think it makes more sense in this case to work with the former.

  				
Posted 2 years ago by GiganticTurtle0

Sure! Thank you 🙂

  				
Posted 2 years ago by GiganticTurtle0

BTW: I think an easy fix could be:
if running_remotely():
    pipeline.start()
else:
    pipeline.create_draft()
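
A minimal sketch of that switch as a reusable function, assuming `running_remotely` is the helper exposed by `clearml.config` and that `pipeline` is an already configured `PipelineController`. The name `finalize_pipeline` and the injectable `is_remote` argument are hypothetical, added only so the branching can be tried without a ClearML server.

```python
from typing import Callable, Optional


def finalize_pipeline(pipeline, is_remote: Optional[Callable[[], bool]] = None) -> str:
    """Run the pipeline when an agent executes it, otherwise only store a draft.

    `finalize_pipeline` and `is_remote` are illustrative names, not ClearML API.
    """
    if is_remote is None:
        # Assumption: clearml exposes this helper in clearml.config.
        from clearml.config import running_remotely as is_remote

    if is_remote():
        # Executed by an agent: actually launch the pipeline logic.
        pipeline.start()
        return "started"

    # Executed locally: only register the pipeline as a draft task.
    pipeline.create_draft()
    return "draft"
```

Called as `finalize_pipeline(pipe)` at the end of the creation script, the same file would then serve both for registering the draft locally and for running the cloned task on an agent.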

  				
Posted 2 years ago by AgitatedDove14

Having the ability to clone and modify the same task over and over again, in principle I would no longer need the multi_instance support feature from PipelineDecorator.pipeline. Is this correct, or are they different things?

  				
Posted 2 years ago by GiganticTurtle0

"So I think it makes more sense in this case to work with the former."

Totally!

  				
Posted 2 years ago by AgitatedDove14

"ClearML best practice to create a draft pipeline to have the task on the server so that it can be cloned, modified and executed at any time?"

Well it is, we just assume that you executed the pipeline somewhere (i.e. made sure it works) 🙂

Correction:
What you are actually looking for (and I will make sure we have it in the docs) is:
pipeline.start(queue=None)
It will just leave it as is, so you can manually enqueue / clone it 🙂
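
As a rough sketch of that suggestion applied to the script from the question: the pipeline is built the same way, but the final call is `start(queue=None)` instead of `create_draft()`. The function name `register_pipeline` and the `controller_cls` argument are hypothetical, used only so the flow can be exercised with a stand-in class; the constructor arguments and the step definition are copied from the question. Depending on the ClearML version, `start` may end the local process, so code placed after that call might not run.

```python
def register_pipeline(controller_cls=None):
    """Build the pipeline and leave it un-enqueued so it can be cloned later.

    `register_pipeline` and `controller_cls` are illustrative, not ClearML API.
    """
    if controller_cls is None:
        from clearml import PipelineController as controller_cls

    pipe = controller_cls(
        name="Iris-LogisticRegression Pipeline",
        project="Mocks",
        version="0.0.1",
    )
    pipe.add_parameter("mode", default="Train")
    pipe.add_step(
        name="stage_collect_dataset",
        base_task_project="Mocks",
        base_task_name="Pipeline step 1: Collect/Fetch the data",
        execution_queue="default",
    )

    # queue=None: per the answer above, the pipeline task is not enqueued,
    # so it can be cloned or enqueued manually afterwards.
    pipe.start(queue=None)
    return pipe
```

The clone-and-enqueue script from the question can then be pointed at the resulting pipeline task unchanged.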

  				
Posted 2 years ago by AgitatedDove14

Hi AgitatedDove14 , just one last thing before closing the thread. I was wondering what is the use of PipelineController.create_draft if you can't use it to clone and run tasks, as we have seen

  				
Posted 2 years ago by GiganticTurtle0

I don't know if you remember the need I had some time ago to launch the same pipeline through configuration. I've been thinking about it and I think PipelineController fits my needs better than PipelineDecorator in that respect.

  				
Posted 2 years ago by GiganticTurtle0

"can someone show me an example of how PipelineController.create_draft"

I think the idea is to store a draft version of the pipeline (not the decorator type, I think, but the one launching pre-executed Tasks).
GiganticTurtle0 I'm not sure I fully understand how / why you are using it, can you expand?

EDIT:

"However, my intention is ONLY to create it to be executed later on."

Hmm, so maybe like enqueue it?

  				
Posted 2 years ago by AgitatedDove14

Exactly!! That's what I was looking for: creating the pipeline without launching it. Thanks again AgitatedDove14

  				
Posted 2 years ago by GiganticTurtle0

Sure thing, I'll fix the "create_draft" docstring to suggest it

  				
Posted 2 years ago by AgitatedDove14
