Answered
Recommended best practice for writing ClearML Pipelines: controller or decorators?

Hey, just wanting to know: what is the recommended best practice for writing ClearML Pipelines, the controller or the decorators?

  
  
Posted one year ago

Answers 9


Hi FierceHamster54,
I would take a look at the decorator example here:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
Think of every function as a stand-alone task running on a different machine. The controller itself is the logic that creates the jobs and passes data between them, and the ClearML Agent / autoscaler does the actual orchestration.

  
  
Posted one year ago

Sure, but the same pattern can be achieved by explicitly using the PipelineController class and defining steps with .add_step() pointing to ClearML Task objects, right?

The decorators simply abstract away the controller, but both methods (decorators or controller/tasks) allow you to decouple your pipeline into steps that each have an independent compute target, right?

So basically, choosing one method or the other is only a question of best practice or style?

  
  
Posted one year ago

Ooooo okay, I see: with the @PipelineDecorator.pipeline decorator you can have a function that orchestrates your components and manipulates their return data.

  
  
Posted one year ago

  1. Yes. The main difference between add_step() and the decorator is that add_step() basically creates a DAG and "executes" the tasks in parallel, based on the DAG dependencies.
  2. The decorator will also take care of serializing the data in and out of the function. Imagine the pipeline logic running as Python code, where the logic waits for a function to finish only when the result of that function is actually used. This means that if you need a parallel loop, you can create a thread pool.
    Makes sense?
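To make the thread pool remark concrete, here is a minimal sketch using only the Python standard library. It is not ClearML code: `process_chunk` is a hypothetical stand-in for a function you would decorate as a pipeline component, and the sleep only mimics work happening elsewhere. The point is the shape of the loop: all calls are submitted first, and the code blocks only where a result is actually read.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk_id: int) -> int:
    """Hypothetical stand-in for a pipeline component; sleeps to mimic remote work."""
    time.sleep(0.1)
    return chunk_id * chunk_id


def run_parallel(chunk_ids, max_workers=4):
    """Start one call per chunk from a thread pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, so every call is in flight before we wait
        futures = [pool.submit(process_chunk, chunk_id) for chunk_id in chunk_ids]
        # result() blocks only here, at the point where the value is needed
        return [future.result() for future in futures]


results = run_parallel(range(4))
print(results)
```

A plain `for` loop that uses each result right after the call would instead wait for every step before starting the next one, which is why the pool is suggested for parallel loops.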
  
  
Posted one year ago

Btw AgitatedDove14, is there a way to define parallel tasks and use the pipeline as an acyclic compute graph instead of simply sequential tasks?

  
  
Posted one year ago

As opposed to the Controller/Task approach, where add_step() only allows executing them sequentially.
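For reference, `add_step()` takes a `parents` argument, so steps do not have to form a single chain. The sketch below is an assumption-laden illustration: the step, project and task names are hypothetical placeholders for Tasks that already exist, and `parallel_waves` is not part of ClearML. It only uses the standard-library `graphlib` module to show which steps the `parents` lists leave free to run side by side. `build_pipeline` is defined but not called, since it needs the `clearml` package and a configured server.

```python
from graphlib import TopologicalSorter

# Keyword arguments for PipelineController.add_step(); all names below are
# hypothetical placeholders, not Tasks from this thread.
STEPS = [
    dict(name="prepare", base_task_project="demo", base_task_name="prepare data"),
    dict(name="train_a", parents=["prepare"],
         base_task_project="demo", base_task_name="train model A"),
    dict(name="train_b", parents=["prepare"],
         base_task_project="demo", base_task_name="train model B"),
    dict(name="compare", parents=["train_a", "train_b"],
         base_task_project="demo", base_task_name="compare models"),
]


def parallel_waves(steps):
    """Group step names into waves; steps in one wave do not depend on each other."""
    sorter = TopologicalSorter({s["name"]: s.get("parents", []) for s in steps})
    sorter.prepare()  # raises graphlib.CycleError if the parents form a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def build_pipeline():
    """Register STEPS on a controller (requires clearml and a configured server)."""
    from clearml import PipelineController  # deferred so the rest runs without ClearML

    pipe = PipelineController(name="demo pipeline", project="demo", version="0.0.1")
    for step in STEPS:
        pipe.add_step(**step)
    return pipe


print(parallel_waves(STEPS))
```

With these definitions, `train_a` and `train_b` share the parent `prepare` and have no dependency on each other, so they land in the same wave, while `compare` waits for both.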

  
  
Posted one year ago

So it seems the decorator is simply the superior option?

Kind of, yes 😊

In which case would we use the add_step() option?

When you have existing Tasks and the piping is very straightforward (i.e. the input / output in the code basically references other Tasks/artifacts, and there is no real need to do any magic for serializing/deserializing data between steps).

  
  
Posted one year ago

Nice, that's a great feature! I'm also trying to have a component execute Giskard QA test suites on the model and data. Is there a planned feature that would let me suspend execution of the pipeline and show in the UI that a pipeline step requires human confirmation to continue or stop, while displaying arbitrary text/plot information?

  
  
Posted one year ago

So it seems the decorator is simply the superior option? In which case would we use the add_step() option?

  
  
Posted one year ago
590 Views