Basically the idea is that you create the pipeline once (say in debug mode), then once you see it running you have a Task of your pipeline in the system (with any custom logic you added). With a Task in the system you can always clone/modify it and launch it externally (i.e. from code or the UI). Make sense?
PipelineController creates another Task in the system, that you can later clone and enqueue to start a process (usually queuing it on the "services" queue)
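For example, a minimal sketch of doing that clone/enqueue from code (the project, task and queue names below are placeholders, not taken from the thread):
```
from clearml import Task

# grab the pipeline controller Task created by the first (debug) run
template = Task.get_task(project_name="my_project", task_name="my_pipeline")

# clone it (optionally edit parameters/configuration on the clone)
cloned = Task.clone(source_task=template, name="my_pipeline (triggered)")

# enqueue the clone, usually on the "services" queue, so an agent picks it up
Task.enqueue(task=cloned, queue_name="services")
```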
AgitatedDove14 - where does automation.controller.PipelineController fit in?
Ohh, then yes, you can use the https://github.com/allegroai/clearml/blob/bd110aed5e902efbc03fd4f0e576e40c860e0fb2/clearml/automation/monitor.py#L10 class to monitor changes in the dataset/project
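A rough sketch of what that could look like for the dataset1 -> dataset2 case; the hook/method names (get_query_parameters / process_task / set_projects / monitor) follow my reading of the linked monitor.py and the slack_alerts example, so verify them against the source, and all project/task names are placeholders:
```
from clearml import Task
from clearml.automation.monitor import Monitor


class DatasetMonitor(Monitor):
    # hook names assumed from the Monitor base class / slack_alerts example
    def get_query_parameters(self):
        # a ClearML Dataset is stored as a Task, so filter on completed Tasks
        return dict(status=["completed"], order_by=["-last_update"])

    def process_task(self, task):
        # called for every newly matched Task: trigger the downstream transformation
        # "dataset2 transformation" is a placeholder for your own transformation Task
        template = Task.get_task(project_name="my_project", task_name="dataset2 transformation")
        cloned = Task.clone(source_task=template, name="dataset2 transformation (auto)")
        Task.enqueue(task=cloned, queue_name="default")


monitor = DatasetMonitor()
monitor.set_projects(project_names=["dataset1_project"])  # watch the upstream dataset project
monitor.monitor(pool_period=60.0)  # poll every minute, runs forever
```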
Now if dataset1 is updated, I want a process to update dataset2
How can I make it such that any update to the upstream database
What do you mean "upstream database"?
Not able to understand what's really happening in the links
Trying to understand these, maybe playing around will help
My question is - I have this in a notebook now. How can I make it such that any update to the upstream database triggers this data transformation step
Essentially, if I have a dataset on which I am performing transformations and then creating other downstream datasets
It's a good abstraction for monitoring the state of the platform and for callbacks, if this is what you are after.
If you just need "simple" cron, then you can always just loop/sleep 🙂
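If the loop/sleep route is enough, a minimal sketch using only the Dataset/Task APIs (dataset, project and task names are placeholders) could look like:
```
import time
from clearml import Dataset, Task

# remember the latest dataset1 version we have already seen
last_seen_id = Dataset.get(dataset_project="my_project", dataset_name="dataset1").id

while True:
    # Dataset.get() without a version returns the latest dataset with that project/name
    latest = Dataset.get(dataset_project="my_project", dataset_name="dataset1")
    if latest.id != last_seen_id:
        last_seen_id = latest.id
        # a new dataset1 version appeared - launch the dataset2 transformation
        # (placeholder task name, cloned from the Task created by the notebook/script)
        template = Task.get_task(project_name="my_project", task_name="dataset2 transformation")
        Task.enqueue(Task.clone(source_task=template), queue_name="default")
    time.sleep(15 * 60)  # check every 15 minutes
```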
AgitatedDove14 - thanks for the quick reply. automation.Monitor is the abstraction I could use?
Hi TrickySheep9
So basically the idea is you can quickly code a scheduler with your own logic, then launch it on the "services" queue to run basically forever 🙂
This could be a good example:
https://github.com/allegroai/clearml/blob/master/examples/services/monitoring/slack_alerts.py
https://github.com/allegroai/clearml/blob/master/examples/automation/task_piping_example.py
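A common pattern (just a sketch, not lifted from those examples; the project/task names are placeholders) is to develop the scheduler locally and then push it onto the services queue with execute_remotely():
```
from clearml import Task

# the scheduler itself is just a Task; once it works locally,
# execute_remotely() re-launches it on the "services" queue where it runs forever
task = Task.init(project_name="DevOps", task_name="my scheduler")
task.execute_remotely(queue_name="services", exit_process=True)

# ... scheduler loop / Monitor.monitor() call goes here ...
```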