How can I make it such that any update to the upstream database...
What do you mean "upstream database"?
Hi TrickySheep9
So basically the idea is you can quickly code a scheduler with your own logic, then launch it on the "services" queue to run basically forever 🙂
This could be a good example:
https://github.com/allegroai/clearml/blob/master/examples/services/monitoring/slack_alerts.py
https://github.com/allegroai/clearml/blob/master/examples/automation/task_piping_example.py
Essentially, if I have a dataset on which I am performing transformations and then creating other downstream datasets
PipelineController creates another Task in the system, that you can later clone and enqueue to start a process (usually queuing it on the "services" queue)
AgitatedDove14 - thanks for the quick reply. Is automation.Monitor the abstraction I could use?
Now if dataset1 is updated, I want a process to update dataset2
It's a good abstraction for monitoring the state of the platform and call backs, if this is what you are after.
If you just need a "simple" cron, then you can always just loop/sleep 🙂
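The loop/sleep "simple cron" mentioned above can be sketched in a few lines of plain Python (no ClearML dependency; `job` and the interval are placeholders you would fill in with your own check/transform logic):

```python
import time


def run_cron(job, interval_seconds, max_iterations=None):
    """Call `job()` every `interval_seconds`.

    Runs forever when `max_iterations` is None (the typical
    "services queue" pattern); the cap is just for demos/tests.
    """
    results = []
    count = 0
    while max_iterations is None or count < max_iterations:
        results.append(job())
        count += 1
        if max_iterations is None or count < max_iterations:
            time.sleep(interval_seconds)
    return results


# Example: check the upstream data every hour (capped here for demo)
# run_cron(lambda: print("checking dataset..."), interval_seconds=3600, max_iterations=3)
```

Launched as a Task on the "services" queue, a loop like this just keeps polling until you abort it.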
Not able to understand what's really happening in the links
Trying to understand these, maybe playing around will help
Ohh, then yes, you can use the https://github.com/allegroai/clearml/blob/bd110aed5e902efbc03fd4f0e576e40c860e0fb2/clearml/automation/monitor.py#L10 class to monitor changes in the dataset/project
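The idea behind that Monitor class is a poll-and-callback loop. Here is a generic polling-monitor sketch of the same pattern in plain Python; the class name, `get_state`, and `on_change` are hypothetical placeholders (this is not the ClearML `automation.Monitor` API itself, where you would instead subclass and override its callbacks):

```python
import time


class DatasetMonitor:
    """Fire a callback when a watched value changes between polls.

    `get_state` might return, e.g., the latest dataset version/ID;
    `on_change` would trigger the downstream transformation step.
    """

    def __init__(self, get_state, on_change):
        self._get_state = get_state
        self._on_change = on_change
        self._last = get_state()  # remember the starting state

    def poll_once(self):
        current = self._get_state()
        if current != self._last:
            self._on_change(current)
            self._last = current
            return True
        return False

    def run(self, interval_seconds, max_polls=None):
        # Typical usage: max_polls=None, running forever on the services queue
        count = 0
        while max_polls is None or count < max_polls:
            self.poll_once()
            count += 1
            time.sleep(interval_seconds)
```

With ClearML you would plug a `Dataset.get(...)` lookup into `get_state` and a clone/enqueue of the transformation Task into `on_change`.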
My question is - I have this in a notebook now. How can I make it such that any update to the upstream database triggers this data transformation step?
AgitatedDove14 - where does automation.controller.PipelineController
fit in?
Basically the idea is that you create the pipeline once (say, in debug), then once you see it is running, you have a Task of your pipeline in the system (with any custom logic you added). With a Task in the system you can always clone/modify and launch it externally (i.e. from code/UI). Make sense?