In particular, what does the external trigger poll? Is it a queue somewhere on ClearML, or is any arbitrary queue, such as SQS, supported?
Hi TenseOstrich47,
Yes 🙂
Please take a look here:
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
https://github.com/allegroai/clearml/blob/master/examples/scheduler/cron_example.py
I can't figure out from the examples how the external trigger works. All of our model performance stats are in the DWH, and we want to build triggers based on them. Is it possible to integrate that with ClearML triggers and schedulers?
Yeah that could be one approach.
I mean, is it possible to create a trigger task that reads a message from a queue, where that message contains information about whether a pipeline needs to be triggered or not?
task that reads a message from a queue
Can you give a specific example?
Our model store consists of metadata stored in the DWH and model artifacts stored in S3. We technically use ClearML for managing the hardware resources for running experiments, but have our own custom logging of metrics etc. Just wondering how tricky integrating a trigger would be for that.
Can I use the task scheduler to schedule an update task every, say, 10 mins? Would that keep it from being deleted?
TenseOstrich47, you could create a monitor task that reads model performance from your database and reports it as a scalar. You can then create triggers based on that scalar 🙂
What do you think?
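A rough sketch of the monitor-task idea, in case it helps. Everything specific here is an assumption for illustration: `sqlite3` stands in for the DWH, the `model_performance` table and its columns are made up, and so are the project and task names. The only ClearML calls used are `Task.init`, `task.get_logger()` and `logger.report_scalar(...)`.

```python
import sqlite3


def fetch_latest_metric(conn, model_name):
    """Return the most recent (step, accuracy) row for a model, or None."""
    return conn.execute(
        "SELECT step, accuracy FROM model_performance "
        "WHERE model_name = ? ORDER BY step DESC LIMIT 1",
        (model_name,),
    ).fetchone()


def report_latest(logger, conn, model_name):
    """Report the newest DWH value as a scalar; return it (None if no rows)."""
    row = fetch_latest_metric(conn, model_name)
    if row is None:
        return None
    step, value = row
    # Hypothetical title/series naming; a trigger would later watch this scalar.
    logger.report_scalar(
        title="dwh_performance", series=model_name, value=value, iteration=step
    )
    return value


def run_monitor(conn, model_name, interval_sec=600):
    """Long-running loop; needs `clearml` installed and configured."""
    import time
    from clearml import Task

    # Hypothetical project and task names.
    task = Task.init(project_name="monitoring", task_name="dwh performance monitor")
    logger = task.get_logger()
    while True:
        report_latest(logger, conn, model_name)
        time.sleep(interval_sec)
```

Reporting on a fixed interval like this would also give the task regular activity, which matters for the inactivity point raised elsewhere in this thread.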
external trigger
What do you mean? Do you have a reference?
Hi TenseOstrich47, yup 🙂 You can check our scheduler module:
https://github.com/allegroai/clearml/tree/master/examples/scheduler
It supports time-based events as well as triggers on external events.
Say we have a DAG running on Airflow every 30 mins. The purpose of this DAG is to aggregate model performance results. If model performance is poor, it sends a message to a queue with some config specifying which model to re-train.
I would like to use a TaskScheduler to poll this queue every X interval, to check whether a training pipeline needs to be kickstarted or not
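To make the polling idea concrete, here is a minimal sketch of the "check the queue, maybe start a pipeline" step. It is not a ClearML API: `queue.Queue` stands in for the external queue (SQS or otherwise), the message fields (`model`, `retrain`, `config`) are a hypothetical schema, and `launch_pipeline` is a hypothetical callback that would do the actual kick-off.

```python
import json
import queue


def poll_once(message_queue, launch_pipeline):
    """Drain pending messages; launch a pipeline for each retrain request."""
    launched = []
    while True:
        try:
            raw = message_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to process in this polling round
        message = json.loads(raw)
        if message.get("retrain"):
            launch_pipeline(message["model"], message.get("config", {}))
            launched.append(message["model"])
    return launched


# Usage: two messages, only one asks for a retrain.
q = queue.Queue()
q.put(json.dumps({"model": "churn", "retrain": True, "config": {"lr": 0.01}}))
q.put(json.dumps({"model": "fraud", "retrain": False}))
started = []
result = poll_once(q, lambda model, config: started.append((model, config)))
```

A function like `poll_once` is the piece that would run every X minutes; how it gets scheduled is left open here.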
Hi TenseOstrich47, what you can do is report the metric to ClearML, then use the TaskScheduler to listen on a specific project. If a task in this project reports a metric below or above a certain threshold (or, I think, if it's the highest or lowest as well), you can trigger an event (a Task or a function). That's how you do it with the TaskScheduler object.
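For reference, a hedged sketch of what a threshold trigger might look like. As far as I can tell, the linked trigger_example.py uses a `TriggerScheduler` class from `clearml.automation` with an `add_task_trigger` method, and the parameter names below are as I recall them from that example, so please check them against your installed version. The project, metric, task ID and queue values are placeholders.

```python
def metric_trigger_kwargs(project, metric, variant, threshold, task_id, queue_name):
    """Collect the arguments for a 'metric fell below threshold' trigger."""
    return dict(
        name="retrain on low {}".format(metric),
        schedule_task_id=task_id,        # task to launch when the trigger fires
        schedule_queue=queue_name,       # queue that task is sent to
        trigger_project=project,         # project whose tasks are watched
        trigger_on_metric=metric,        # scalar title
        trigger_on_variant=variant,      # scalar series
        trigger_on_threshold=threshold,
        trigger_on_sign="min",           # assumed: fire when value is below threshold
    )


def start_trigger(**kwargs):
    """Needs `clearml` installed and configured; blocks while polling."""
    from clearml.automation import TriggerScheduler

    # Parameter name is spelled this way in the example, as I recall it.
    trigger = TriggerScheduler(pooling_frequency_minutes=3)
    trigger.add_task_trigger(**kwargs)
    trigger.start()


# Placeholder values; "<task-id>" must be replaced with a real task ID.
kwargs = metric_trigger_kwargs(
    "monitoring", "dwh_performance", "my_model", 0.8, "<task-id>", "default"
)
```

Calling `start_trigger(**kwargs)` would then start polling the project for tasks whose scalar crosses the threshold.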
Yep, just make sure you show some activity in a task once every 2 hours so it won't be detected as inactive 🙂
To report the metric to ClearML, would that just be a batch update every t interval?
but have our own custom logging of metrics etc.
Are those custom metrics reported to the ClearML server or stored somewhere else?
Just wondering how tricky integrating a trigger would be for that
I guess it really depends on your current implementation.