Hey All. Question Regarding Scheduling And Orchestration. Does ClearML Provide Any Tooling To Schedule Entire Training Pipelines And To Trigger Training Pipelines In Response To Events, E.G. Degraded Model Performance Alerting?

Answered

Hey all.

Question regarding scheduling and orchestration. Does ClearML provide any tooling to schedule entire training pipelines and to trigger training pipelines in response to events, e.g. degraded model performance alerting?

  				
Posted 3 years ago by TenseOstrich47


Answers 15

To report the metric to ClearML, would that just be a batch update at some interval t?

  				
Posted 3 years ago by TenseOstrich47

Hi TenseOstrich47, yup 🙂 You can check our scheduler module:
https://github.com/allegroai/clearml/tree/master/examples/scheduler
It supports time-based events as well as triggers on external events.
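
For the time-based half, a minimal sketch of what a daily schedule might look like. It assumes the pipeline already exists as a ClearML task whose ID you know, and that `hour`/`minute`/`day=1` mean "every day at hour:minute", which is my reading of the `TaskScheduler.add_task` docstring and worth verifying against your installed version. The helper name `daily_schedule`, the schedule name, the queue name and the task ID placeholder are all hypothetical.

```python
from typing import Any, Dict


def daily_schedule(task_id: str, queue: str, hour: int, minute: int) -> Dict[str, Any]:
    """Build keyword arguments for TaskScheduler.add_task (one run per day)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    return {
        "name": "daily-retrain",      # hypothetical schedule name
        "schedule_task_id": task_id,  # ID of the task (or pipeline controller) to re-run
        "queue": queue,               # execution queue the run is sent to
        "hour": hour,
        "minute": minute,
        "day": 1,                     # assumption: day=1 means "every day"
        "recurring": True,
    }


def main() -> None:
    # Imported lazily so the helper above works without clearml installed.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(**daily_schedule("<pipeline-task-id>", "default", hour=22, minute=30))
    # Runs the scheduling loop in this process; start_remotely() is the
    # alternative when the scheduler itself should run on an agent.
    scheduler.start()
```

`main()` is not called automatically; run it from a machine that has `clearml` installed and configured.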

  				
Posted 3 years ago by AnxiousSeal95

I can't figure out from the examples how the external trigger works. All of our model performance stats are in the DWH, and we want to build triggers based on that. Is it possible to integrate that with ClearML triggers and schedulers?

  				
Posted 3 years ago by TenseOstrich47

"task that reads a message from a queue"

Can you give a specific example?

  				
Posted 3 years ago by CostlyOstrich36

In particular, what does the external trigger poll? Is it a queue somewhere on ClearML, or is any arbitrary queue, like SQS, supported?

  				
Posted 3 years ago by TenseOstrich47

Yeah, that could be one approach.

I mean, is it possible to create a trigger task that reads a message from a queue, where that message contains information about whether a pipeline needs to be triggered or not?

  				
Posted 3 years ago by TenseOstrich47

Yep, just make sure you show some activity in the task at least once every 2 hours so it won't be detected as inactive 🙂

  				
Posted 3 years ago by CostlyOstrich36

Can I use the task scheduler to schedule an update task every, say, 10 minutes? Would that keep it from being deleted?

  				
Posted 3 years ago by TenseOstrich47

Sounds good

  				
Posted 3 years ago by CostlyOstrich36

Say we have a DAG running on Airflow every 30 minutes. The purpose of this DAG is to aggregate model performance results. If model performance is poor, it sends a message to a queue with some config specifying which model to retrain.

I would like to use a TaskScheduler to poll this queue at some interval X, to check whether a training pipeline needs to be kickstarted or not.
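
A rough sketch of the polling idea described here, not a confirmed ClearML recipe. The message format (a JSON object with `retrain`, `model` and `params` keys), the `receive` callable that drains your queue (SQS or otherwise), the `General/model` parameter name and the task ID placeholder are all assumptions made for illustration. It also assumes `TaskScheduler.add_task` accepts a `schedule_function` that is called with no arguments, and that `minute=10` means "every 10 minutes".

```python
import json
from typing import Callable, Iterable, List, Optional


def parse_retrain_request(body: str) -> Optional[dict]:
    """Return the retrain config carried by a message, or None if nothing to do."""
    try:
        msg = json.loads(body)
    except json.JSONDecodeError:
        return None  # ignore malformed messages
    if not isinstance(msg, dict) or not msg.get("retrain"):
        return None
    return {"model": msg.get("model"), "params": msg.get("params", {})}


def poll_once(receive: Callable[[], Iterable[str]],
              launch: Callable[[dict], None]) -> List[dict]:
    """Read all pending messages and launch one run per retrain request."""
    launched = []
    for body in receive():
        request = parse_retrain_request(body)
        if request is not None:
            launch(request)
            launched.append(request)
    return launched


def launch_pipeline(request: dict) -> None:
    """Clone a template pipeline task and enqueue the copy (IDs are placeholders)."""
    from clearml import Task  # lazy import: only needed when actually launching

    template = Task.get_task(task_id="<pipeline-task-id>")
    run = Task.clone(source_task=template, name=f"retrain {request['model']}")
    run.set_parameter("General/model", request["model"])  # hypothetical parameter name
    Task.enqueue(run, queue_name="default")


def main(receive: Callable[[], Iterable[str]]) -> None:
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(
        name="poll-retrain-queue",
        schedule_function=lambda: poll_once(receive, launch_pipeline),
        minute=10,  # assumption: interpreted as "every 10 minutes"
    )
    scheduler.start()
```

Keeping `receive` and `launch` as plain callables means the decision logic can be exercised without a queue or a ClearML server, and the queue client can be swapped freely.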

  				
Posted 3 years ago by TenseOstrich47

Our model store consists of metadata stored in the DWH, and model artifacts stored in S3. We technically use ClearML for managing the hardware resource for running experiments, but have our own custom logging of metrics etc. Just wondering how tricky integrating a trigger would be for that

  				
Posted 3 years ago by TenseOstrich47

Hi TenseOstrich47 ,

Yes 🙂
Please take a look here:
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
https://github.com/allegroai/clearml/blob/master/examples/scheduler/cron_example.py

  				
Posted 3 years ago by CostlyOstrich36

"but have our own custom logging of metrics etc."

Are those custom metrics reported to the ClearML server or stored somewhere else?

"Just wondering how tricky integrating a trigger would be for that"

I guess it really depends on your current implementation.

  				
Posted 3 years ago by CostlyOstrich36

TenseOstrich47, you could create a monitor task that reads model performance from your database and reports it as a scalar. Based on that scalar you can create triggers 🙂
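
A minimal sketch of such a monitor task, under a few assumptions: `fetch_rows` is a hypothetical function you would write to query your DWH and return `(model_name, metric_name, value)` tuples, and the project and task names are placeholders. Reporting on every loop iteration also gives the task regular activity.

```python
import time
from typing import Any, Callable, Iterable, Tuple

Row = Tuple[str, str, float]  # (model_name, metric_name, value)


def report_performance(logger: Any, rows: Iterable[Row], iteration: int) -> int:
    """Report each row as a scalar: metric name as title, model name as series."""
    count = 0
    for model_name, metric_name, value in rows:
        logger.report_scalar(
            title=metric_name,
            series=model_name,
            value=float(value),
            iteration=iteration,
        )
        count += 1
    return count


def main(fetch_rows: Callable[[], Iterable[Row]], interval_seconds: int = 600) -> None:
    from clearml import Task  # lazy import so the helper above needs no server

    task = Task.init(
        project_name="monitoring",             # placeholder project name
        task_name="model performance monitor",  # placeholder task name
        task_type=Task.TaskTypes.monitor,
    )
    logger = task.get_logger()
    iteration = 0
    while True:
        report_performance(logger, fetch_rows(), iteration)
        iteration += 1
        time.sleep(interval_seconds)
```

Using the metric name as the scalar title and the model name as the series keeps one plot per metric with one line per model; a trigger could then be pointed at a specific title/series pair.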

What do you think?

"external trigger"

What do you mean? Do you have a reference?

  				
Posted 3 years ago by CostlyOstrich36

Hi TenseOstrich47, what you can do is report the metric to ClearML, then use the TaskScheduler to listen on a specific project. If a task in this project reports a metric below/above a certain threshold (or, I think, if it's the highest/lowest as well), you can trigger an event (a task or a function). That's how you do it with the TaskScheduler object.
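
A hedged sketch of a threshold trigger along these lines. As far as I can tell, the linked trigger_example.py does this with `TriggerScheduler.add_task_trigger` from `clearml.automation` rather than `TaskScheduler`, so that is what the sketch assumes. The `trigger_on_*` argument names and the meaning of `trigger_on_sign="min"` (fire when the metric falls below the threshold) are my understanding of that API and should be checked against your installed version; the project, metric and task ID values are placeholders.

```python
from typing import Any, Dict


def degraded_metric_trigger(pipeline_task_id: str, project: str, metric: str,
                            variant: str, threshold: float,
                            queue: str = "default") -> Dict[str, Any]:
    """Build keyword arguments for TriggerScheduler.add_task_trigger."""
    return {
        "name": f"retrain when {metric}/{variant} drops below {threshold}",
        "schedule_task_id": pipeline_task_id,  # task to launch when the trigger fires
        "schedule_queue": queue,
        "trigger_project": project,            # project whose tasks are watched
        "trigger_on_metric": metric,           # scalar title
        "trigger_on_variant": variant,         # scalar series
        "trigger_on_threshold": threshold,
        "trigger_on_sign": "min",              # assumption: fire when value < threshold
    }


def main() -> None:
    from clearml.automation import TriggerScheduler  # lazy import

    trigger = TriggerScheduler()
    trigger.add_task_trigger(**degraded_metric_trigger(
        pipeline_task_id="<pipeline-task-id>",
        project="monitoring",
        metric="auc",
        variant="churn-v2",
        threshold=0.75,
    ))
    trigger.start()
```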

  				
Posted 3 years ago by AnxiousSeal95
