Answered
Hi! Can Someone Show Me An Example Of How

Hi!
Can someone show me an example of how PipelineController.create_draft works? I'm trying to create a template of a pipeline to run it later but I can't get it to work. This is my attempt:
Script for pipeline draft creation:

```python
from clearml import PipelineController

# Create and configure the pipeline controller.
pipe = PipelineController(
    name="Iris-LogisticRegression Pipeline",
    project="Mocks",
    target_project="Mocks/Components",
    version="0.0.1",
    add_pipeline_tags=True,
    abort_on_failure=False,
)

# Attach some parameters to the pipeline.
pipe.add_parameter("mode", default="Train")

# Populate the pipeline with ML steps.
pipe.add_step(
    name="stage_collect_dataset",
    base_task_project="Mocks",
    base_task_name="Pipeline step 1: Collect/Fetch the data",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
)
pipe.add_step(
    name="stage_preprocess_dataset",
    parents=[
        "stage_collect_dataset",
    ],
    base_task_project="Mocks",
    base_task_name="Pipeline step 2: Preprocess the data",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
    parameter_override={
        "General/dataset_url": "${stage_collect_dataset.artifacts.dataset.url}",
        "General/test_size": 0.25,
    },
)
pipe.add_step(
    name="stage_train_model",
    parents=[
        "stage_preprocess_dataset",
    ],
    base_task_project="Mocks",
    base_task_name="Pipeline step 3: Train the model",
    execution_queue="default",
    continue_on_fail=False,
    cache_executed_step=False,
    time_limit=None,
    parameter_override={
        "General/dataset_task_id": "${stage_preprocess_dataset.id}",
    },
)

pipe.create_draft()

print("Pipeline's draft successfully created!")
```

Script to clone, enqueue and run the pipeline:

```python
import datetime

from clearml import Task

EXECUTION_QUEUE = "controllers"

# Create the experiment/task and connect ClearML with the current process.
master_task = Task.init(
    project_name="Mocks",
    task_name="Pipeline runner",
    task_type=Task.TaskTypes.application,
)

# Get a reference to the pipeline task.
pipeline_task = Task.get_task(
    project_name=master_task.get_project_name(),
    task_name="Iris-LogisticRegression Pipeline",
)

# Clone the pipeline task.
# This creates a task with status Draft whose parameters can be modified.
run_date = datetime.datetime.now()
cloned_task = Task.clone(
    source_task=pipeline_task, name=f"{pipeline_task.name} - {run_date:%Y%m%d:%H%M}"
)

# Enqueue the configured pipeline task for execution.
Task.enqueue(cloned_task.id, queue_name=EXECUTION_QUEUE)

print(f"Pipeline {cloned_task.name!r} has been sent to queue {EXECUTION_QUEUE!r}")
```

The base tasks I use are the same as in https://github.com/allegroai/clearml/tree/master/examples/pipeline.

The result is the pipeline task starts running, but again returns to the draft state, without actually executing any of the steps.

I should clarify this is not a problem with the pipeline itself, since when I run it with 'start' instead of 'create_draft', it works just fine. However, my intention is ONLY to create it to be executed later on.

I'm using clearml v1.1.6 and clearml-agent v1.1.2.

Has anyone ever encountered this problem?

  
  
Posted 2 years ago

Answers 14


I don't know if you remember the need I had some time ago to launch the same pipeline through configuration. I've been thinking about it and I think PipelineController fits my needs better than PipelineDecorator in that respect.

  
  
Posted 2 years ago

Sure! Thank you 🙂

  
  
Posted 2 years ago

ClearML best practice to create a draft pipeline to have the task on the server so that it can be cloned, modified and executed at any time?

Well it is, we just assume that you executed the pipeline somewhere (i.e. made sure it works) 🙂

Correction:
What you are actually looking for (and I will make sure we have it in the doc) is:
pipeline.start(queue=None)
It will just leave it as is, so you can manually enqueue / clone it 🙂
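
A minimal sketch of that suggestion, assuming the same project and base tasks as in the question. `build_pipeline` and `register_without_running` are hypothetical helper names, only the first step is shown, and the clearml import is kept inside the function so the helpers can be read and exercised without a configured server. It has not been checked against a live ClearML server.

```python
def build_pipeline():
    """Assemble the controller from the question (only the first step shown)."""
    # Imported lazily: needs the clearml package and a configured server.
    from clearml import PipelineController

    pipe = PipelineController(
        name="Iris-LogisticRegression Pipeline",
        project="Mocks",
        version="0.0.1",
    )
    pipe.add_parameter("mode", default="Train")
    pipe.add_step(
        name="stage_collect_dataset",
        base_task_project="Mocks",
        base_task_name="Pipeline step 1: Collect/Fetch the data",
        execution_queue="default",
    )
    return pipe


def register_without_running(pipe):
    """Call start() with no queue, as suggested in this thread.

    The expectation (taken from the reply above, not verified here) is that
    the pipeline task is stored but not enqueued, so it can be cloned or
    enqueued by hand later.
    """
    pipe.start(queue=None)


# Intended usage (requires a ClearML server):
# register_without_running(build_pipeline())
```

The only change compared with the script in the question is the final call: `start(queue=None)` in place of `create_draft()`.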

  
  
Posted 2 years ago

So I think it makes more sense in this case to work with the former.

Totally!

  
  
Posted 2 years ago

Hi AgitatedDove14, just one last thing before closing the thread. I was wondering what the use of PipelineController.create_draft is if you can't use it to clone and run tasks, as we have seen.

  
  
Posted 2 years ago

Well, PipelineDecorator actually allows you to do the same thing, with the same abilities, that is, clone / modify / enqueue.
(I mean, Pipeline with tasks is also great, I just want to clarify that they have the same capabilities in this respect).
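
As a rough illustration of the clone / modify / enqueue cycle, here is a sketch built on the `Task.get_task`, `Task.clone`, `set_parameter` and `Task.enqueue` calls. `run_name` and `clone_modify_enqueue` are hypothetical helpers, the task class is passed in as an argument so the function can be tried with a stand-in, and the `"Args/mode"` key in the usage line is an assumption about where the pipeline parameter is stored.

```python
import datetime


def run_name(base_name, when):
    """Build a name for a cloned run from the template name and a timestamp."""
    return f"{base_name} - {when:%Y%m%d:%H%M}"


def clone_modify_enqueue(task_cls, project, template_name, queue, overrides, when=None):
    """Clone a stored pipeline task, override parameters, and enqueue the clone.

    task_cls is expected to be clearml.Task; it is injected so that the
    helper does not need a server when tried with a stand-in class.
    """
    when = when or datetime.datetime.now()
    template = task_cls.get_task(project_name=project, task_name=template_name)
    cloned = task_cls.clone(source_task=template, name=run_name(template.name, when))
    for key, value in overrides.items():
        # Keys use the "Section/name" form; the section name is an assumption.
        cloned.set_parameter(key, value)
    task_cls.enqueue(cloned, queue_name=queue)
    return cloned


# Intended usage (requires a ClearML server and an existing pipeline task):
# from clearml import Task
# clone_modify_enqueue(Task, "Mocks", "Iris-LogisticRegression Pipeline",
#                      "controllers", {"Args/mode": "Test"})
```

The same helper should apply whether the stored pipeline was built with PipelineController or PipelineDecorator, since it only touches the pipeline's own task.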

  
  
Posted 2 years ago

Sure thing, I'll fix the "create_draft" docstring to suggest it.

  
  
Posted 2 years ago

Having the ability to clone and modify the same task over and over again, in principle I would no longer need the multi_instance support feature from PipelineDecorator.pipeline. Is this correct, or are they different things?

  
  
Posted 2 years ago

Exactly!! That's what I was looking for: creating the pipeline without launching it. Thanks again AgitatedDove14

  
  
Posted 2 years ago

can someone show me an example of how PipelineController.create_draft

I think the idea is to store a draft version of the pipeline (not the decorator type, I think, but the one launching pre-executed Tasks).
GiganticTurtle0 I'm not sure I fully understand how / why you are using it, can you expand?

EDIT:

However, my intention is ONLY to create it to be executed later on.

Hmm, so maybe something like enqueuing it?

  
  
Posted 2 years ago

I see the point. The reason I'm using PipelineController now is that I've realised that in the code I only send IDs from one step of the pipeline to another, and not artefacts as such. So I think it makes more sense in this case to work with the former.

  
  
Posted 2 years ago

I was wondering what is the use of PipelineController.create_draft if you can't use it to clone and run tasks, as we have seen

I think the initial thought was to allow creating a pipeline from a pipeline programmatically. Then, once you have the "pipeline", you can manually enqueue it and modify it. Think of a pipeline constructing other pipelines in flight based on some logic, then launching them in parallel.
Make sense?

  
  
Posted 2 years ago

BTW: I think an easy fix could be:
    if running_remotely():
        pipeline.start()
    else:
        pipeline.create_draft()
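
One way that branch might be wrapped, as a sketch only: `start_or_draft` is a hypothetical helper, and the local/remote flag is passed in rather than read inside the function, so the choice is visible and can be tried with a stand-in object. The usage line relies on `Task.running_locally()` in place of the `running_remotely()` helper named above; that substitution is an assumption of this sketch, and the whole thing is untested against a server. Note that a later reply in this thread recommends `pipeline.start(queue=None)` instead.

```python
def start_or_draft(pipe, running_locally):
    """Create a draft when run by hand, start the pipeline when run by an agent."""
    if running_locally:
        # Local run: only register the pipeline, do not execute any step.
        pipe.create_draft()
        return "draft"
    # Remote run (e.g. a clone picked up by an agent): actually execute.
    pipe.start()
    return "start"


# Intended usage (requires a ClearML server):
# from clearml import Task
# start_or_draft(pipe, Task.running_locally())
```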

  
  
Posted 2 years ago

Hi AgitatedDove14, so isn't it ClearML best practice to create a draft pipeline to have the task on the server so that it can be cloned, modified and executed at any time?

  
  
Posted 2 years ago