Does anyone have any examples or advice on how to implement a DAG like this in ClearML pipelines?

does anyone have any examples or advice on how to implement a DAG like this in clearml pipelines? say I want to do cross-validation (or in my case backtesting on different horizons for a forecasting model), where I have some common pieces and also a map/reduce-like set of steps to run the same model in parallel on different datasets
[image: DAG diagram]

Posted 9 months ago

Answers 6


those look like linear DAGs to me, but maybe I'm missing something. I'm thinking of something like the map operator in Prefect, where I can provide an array like ["A", "B", "C"] and run the steps outlined with dotted lines independently, with each of those as arguments
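
for reference, the Prefect version of what I mean looks roughly like this (a sketch against Prefect 2's task.map; the backtest/backtest_all names are just illustrative):

from prefect import flow, task

@task
def backtest(horizon):
    # placeholder for the real per-horizon work
    return f"results for {horizon}"

@flow
def backtest_all(horizons):
    # .map() submits one task run per element, executed concurrently
    futures = backtest.map(horizons)
    # asking for the results is what blocks until they are all done
    return [f.result() for f in futures]

backtest_all(["A", "B", "C"])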

Posted 9 months ago

Hi @LittleReindeer37
You mean something like logic in the DAG? Or just a DAG?
[link]
[link]

Posted 9 months ago

cool, thanks! the first one was what I had thought of but seemed unpythonic, so I'll give the second a shot

Posted 9 months ago

I could just loop through and create separate pipelines with different parameters, but that seems sort of inefficient. the hyperparameter optimization might actually work in this case using grid search, but it seems like kind of a hack
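
as a sketch of that hack, something like this with ClearML's optimizer is what I have in mind (the base task id and the General/horizon parameter name are placeholders, not from a real project):

from clearml.automation import (
    DiscreteParameterRange, GridSearch, HyperParameterOptimizer,
)

# abusing grid search as a "map": one discrete value per dataset/horizon
optimizer = HyperParameterOptimizer(
    base_task_id="<training task to clone>",
    hyper_parameters=[
        DiscreteParameterRange("General/horizon", values=["A", "B", "C"]),
    ],
    objective_metric_title="validation",
    objective_metric_series="error",
    objective_metric_sign="min",
    optimizer_class=GridSearch,
    max_number_of_concurrent_tasks=3,
)
optimizer.start()
optimizer.wait()
optimizer.stop()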

Posted 9 months ago

I see, you can manually do that with add_step, i.e.

# one step per element (other arguments elided)
for elem in elems:
    pipeline.add_step(..., elem)

or you can do that with full logic:


from clearml import PipelineDecorator

@PipelineDecorator.component(...)
def square_num(num):
    return num**2

@PipelineDecorator.pipeline(...)
def map_flow(nums):
    res = []
    # these component calls are launched in parallel
    for num in nums:
        res.append(square_num(num))
    # this is where we actually wait for the results
    for r in res:
        print(r)

map_flow([1, 2, 3, 5, 8, 13])
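
and for completeness, the add_step version spelled out could look something like this (a sketch; the project and task names are placeholders for an existing template task):

from clearml import PipelineController

pipe = PipelineController(name="backtest-map", project="examples", version="1.0.0")

horizons = ["A", "B", "C"]

# fan out: clone the same base task once per horizon, overriding one parameter
for horizon in horizons:
    pipe.add_step(
        name=f"backtest_{horizon}",
        base_task_project="examples",
        base_task_name="backtest template",
        parameter_override={"General/horizon": horizon},
    )

# fan in: a reduce step that waits for all the backtest steps
pipe.add_step(
    name="aggregate",
    base_task_project="examples",
    base_task_name="aggregate template",
    parents=[f"backtest_{h}" for h in horizons],
)

pipe.start()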
Posted 9 months ago

🤞

Posted 9 months ago