Hey, just wanting to know: what is the recommended best practice for writing ClearML pipelines, the controller or the decorators?

  
  
Posted 2 years ago

Answers 9


Hi FierceHamster54
I would take a look at the decorator example here
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
Think of every function as a stand-alone task running on a different machine. The controller itself is the logic that creates the jobs and passes data between them, and the ClearML agent / autoscaler does the actual orchestration.

  
  
Posted 2 years ago

Sure, but the same pattern can be achieved by explicitly using the PipelineController class and defining steps with .add_step() pointing to ClearML's Task objects, right?

The decorators simply abstract away the controller, but both methods (decorators or controller/tasks) allow you to decouple your pipeline into steps, each with an independent compute target, right?

So basically, choosing one method or the other is only a question of best practice or style?

  
  
Posted 2 years ago

Ooooo okay, I see: with the @PipelineDecorator.pipeline decorator you can have a function that orchestrates your components and manipulates their return data

  
  
Posted 2 years ago

As opposed to the Controller/Task approach, where add_step() only allows executing them sequentially

  
  
Posted 2 years ago

Btw AgitatedDove14, is there a way to define parallel tasks and use the pipeline as an acyclic compute graph instead of simply sequential tasks?

  
  
Posted 2 years ago

  1. Yes. The main difference between add task and the decorator is that add task basically creates a DAG and "executes" the tasks in parallel, based on the DAG dependencies.
  2. The decorator also takes care of serializing the data in / out of the function. Imagine the pipeline logic running as Python code, where the logic waits for a function to finish only when the result of that function is actually used. This means that if you need a parallel loop, you can create a thread pool.
    Make sense?
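
The "wait only when the result is used" behaviour and the thread-pool suggestion can be illustrated with a plain-Python sketch. Nothing in it is ClearML API: `slow_square` is a hypothetical stand-in for a pipeline component, and `concurrent.futures` stands in for the pipeline logic launching steps. It only shows the pattern, not how ClearML implements it.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Hypothetical stand-in for a pipeline component (not ClearML API)."""
    time.sleep(0.1)  # pretend this is a long-running step
    return x * x


def run_parallel_loop(inputs, max_workers=4):
    """Launch one call per input and block only when results are read."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, so all calls run concurrently
        futures = [pool.submit(slow_square, x) for x in inputs]
        # result() is the point where the logic actually waits
        return [f.result() for f in futures]


results = run_parallel_loop([1, 2, 3, 4])
print(results)
```

The results come back in input order because the futures are read in the order they were submitted, even though the calls themselves overlap in time.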
  
  
Posted 2 years ago

Nice, that's a great feature! I'm also trying to have a component that executes Giskard QA test suites on the model and data. Is there a planned feature where I can suspend execution of the pipeline and show in the UI that this pipeline "step" requires a human confirmation to continue or stop, while displaying arbitrary text/plot information?

  
  
Posted 2 years ago

So it seems the decorator is simply the superior option? In which case would we use the add_step() option?

  
  
Posted 2 years ago

So it seems decorator is simply the superior option?

Kind of yes 😊

In which case would we use the add_step() option?

When you have existing Tasks and the piping is very straightforward (i.e. input / output in the code basically references other Tasks/artifacts, and there is no real need to do any magic for serializing/deserializing data between steps).
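
A minimal sketch of that case, assuming four Tasks already exist in one project. The project name, task names, step names and queue name are all hypothetical placeholders, and the ClearML import is deferred so the step layout can be inspected without a server. The calls used are `PipelineController`, `add_step` (with `parents`, `base_task_project`, `base_task_name`) and `start`; check them against your installed ClearML version.

```python
from graphlib import TopologicalSorter

PROJECT = "examples"  # hypothetical project holding the existing Tasks

# Each step clones an existing Task; "parents" defines the DAG edges.
STEPS = [
    {"name": "prepare", "task": "prepare data", "parents": []},
    {"name": "train_a", "task": "train model a", "parents": ["prepare"]},
    {"name": "train_b", "task": "train model b", "parents": ["prepare"]},
    {"name": "report", "task": "build report", "parents": ["train_a", "train_b"]},
]


def parallel_batches(steps):
    """Group step names into batches whose dependencies are already met."""
    sorter = TopologicalSorter({s["name"]: s["parents"] for s in steps})
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def build_controller():
    """Create the controller; needs clearml installed and a configured server."""
    from clearml import PipelineController  # deferred import

    pipe = PipelineController(
        name="existing-tasks demo", project=PROJECT, version="1.0.0"
    )
    for step in STEPS:
        pipe.add_step(
            name=step["name"],
            parents=step["parents"],
            base_task_project=PROJECT,
            base_task_name=step["task"],
        )
    return pipe


batches = parallel_batches(STEPS)
print(batches)
# To actually launch it: build_controller().start(queue="services")
```

`parallel_batches` is only an illustration of the dependency structure: steps that share the same satisfied parents (here `train_a` and `train_b`) are the ones that can be scheduled at the same time.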

  
  
Posted 2 years ago