I want to have a CI/CD pipeline that, upon Engineer A’s commit, ensures that the pipeline is re-deployed such that when Engineer B uses it as a template, it’s definitely the latest version of the code and process
Hi there,
This is exactly what I want to do.
RoughTiger69
Have you been able to do it?
AgitatedDove14
How do you recommend performing this task?
I mean, have a CI/CD (e.g. GitHub Actions) that updates my “production” pipeline on the ClearML UI, so a Data Scientist can start to experiment and create jobs from the UI.
The training pipeline that is considered “best of breed” is committed to Git and deployed by CI/CD; tagged in ClearML clearly.
Users of this pipeline know it’s the “official” training flow that they can now play with using configuration.
Goal is to ensure that “official” pipelines are source controlled.
makes sense?
So “The” pipeline Engineer A creates, once updated with the latest code, and perhaps run once as a test by CI/CD, should be “tainted” as “The production” version of that pipeline, so that Engineer B’s code always uses the latest released pipeline code
However I see I should really have made my question clearer.
My workflow is as follows:
Engineer A develops a pipeline with a number of steps. She experiments with this pipeline until she is happy with the flow and her code
Engineer B is in charge of running Engineer A’s pipeline with different parameters and investigating the results
nifty trick ! replacing the git metadata inside the task and the rest happens automatically!
There are many ways to do so, this is an example for github action: https://github.com/allegroai/trains-actions-train-model
AgitatedDove14 , thanks for the quick answer.
I think this is the easiest way, basically the CI/CD launches a pipeline (which under the hood is another type of Task), by querying the latest “Published” pipeline that is also Not archived, then cloning+pushing it to execution queue
Do you have an example?
In the UI when you want to “upgrade” the production pipeline you just right click “Publish” on the pipeline
I haven’t seen this “publish” option for pipelines, just for models. Is this a new feature?
Kind of hidden in the UI (not sure if on purpose), but if you click on the pipeline then go to details, in the new tab (of the pipeline Task) you can publish the Task (aka the pipeline)
In this example:
https://github.com/allegroai/clearml-actions-train-model/blob/7f47f16b438a4b05b91537f88e8813182f39f1fe/train_model.py#L14
replace with something like:
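` from clearml import Task
# get_tasks returns a list; with order_by "-created" the newest published pipeline comes first
tasks = Task.get_tasks(project_name="pipeline/project/.pipelines", task_filter={'status': ['published'], 'order_by': ["-created"], 'type': ['controller']})
new_pipeline = Task.clone(source_task=tasks[0])
Task.enqueue(new_pipeline, queue_name="services")
# should we wait for the pipeline?
new_pipeline.wait_for_status() `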
have a CI/CD (e.g. GitHub Actions) that updates my “production” pipeline on the ClearML UI,
I think this is the easiest way, basically the CI/CD launches a pipeline (which under the hood is another type of Task), by querying the latest "Published" pipeline that is also Not archived, then cloning+pushing it to execution queue.
In the UI when you want to "upgrade" the production pipeline you just right click "Publish" on the pipeline you want to launch. Another way is to do the same with Tags instead of "Published" state.
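A rough sketch of the Tags variant, in case it helps. It assumes a tag named "production" (hypothetical, pick your own) and reuses the project path and "services" queue from the snippet above. `build_pipeline_query` and `launch_latest_tagged` are made-up helper names, not part of ClearML. I have not run this against a live server, so treat it as a starting point:

```python
from typing import Optional


def build_pipeline_query(project_name: str, tag: str) -> dict:
    """Build keyword arguments for Task.get_tasks that select tagged pipeline controllers."""
    return {
        "project_name": project_name,
        "tags": [tag],
        # newest first, pipelines only (a pipeline is a Task of type "controller")
        "task_filter": {"type": ["controller"], "order_by": ["-created"]},
    }


def launch_latest_tagged(project_name: str, tag: str = "production",
                         queue_name: str = "services") -> Optional[str]:
    """Clone and enqueue the newest pipeline carrying `tag`; return the clone's id."""
    from clearml import Task  # imported lazily so the query helper works without clearml

    candidates = Task.get_tasks(**build_pipeline_query(project_name, tag))
    if not candidates:
        return None  # nothing has been tagged yet
    new_pipeline = Task.clone(source_task=candidates[0])
    Task.enqueue(new_pipeline, queue_name=queue_name)
    return new_pipeline.id
```

The CI job would then call something like `launch_latest_tagged("pipeline/project/.pipelines")` after each merge.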
IrritableGiraffe81 wdyt?
Not sure I’m getting the whole system but for:
I want to have a CI/CD pipeline that, upon Engineer A’s commit, ensures that the pipeline is re-deployed such that when Engineer B uses it as a template, it’s definitely the latest version of the code and process
You can configure your task to take the latest from a branch, so on each commit you are updated.
IrritableGiraffe81 AgitatedDove14 there are multiple levels of what the CI/CD should automate/validate.
This one is the minimal option.
Another option is:
1. CI deploys (executes) the pipeline fresh, from the committed code
2. CI waits and extracts the results (various artifacts, metrics etc.)
3. CI compares them to the latest (published) pipeline or to absolute numbers
4. CI decides whether to publish it or not (or at least tag it as RC).
Steps 2-4 can themselves be encapsulated in a ClearML task that accepts two pipeline runs and does the comparison/tagging/publishing
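The compare-and-decide part could look roughly like this. It is only a sketch: the metric name "accuracy", the "higher is better" rule, and the helper names `decide_release` / `apply_decision` are all assumptions for illustration. The metrics are passed in as plain dicts, so how you extract them from the two runs is up to you. `apply_decision` only relies on the task object having `publish()` and `add_tags()` methods:

```python
from typing import Mapping


def decide_release(candidate: Mapping[str, float], baseline: Mapping[str, float],
                   metric: str = "accuracy", min_gain: float = 0.0) -> str:
    """Return "publish" if the candidate is at least `min_gain` better, else "rc"."""
    if metric not in candidate:
        raise KeyError(f"candidate run did not report {metric!r}")
    # With no published baseline yet, the first successful run is publishable.
    if metric not in baseline:
        return "publish"
    # Assumes higher is better; flip the comparison for loss-like metrics.
    return "publish" if candidate[metric] - baseline[metric] >= min_gain else "rc"


def apply_decision(task, decision: str, rc_tag: str = "RC") -> None:
    """Publish the pipeline task, or tag it as a release candidate."""
    if decision == "publish":
        task.publish()
    else:
        task.add_tags([rc_tag])
```

Comparing against absolute numbers works the same way: pass the thresholds as the `baseline` dict.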
The training pipeline that is considered “best of breed” is committed to Git and deployed by CI/CD; tagged in ClearML clearly.
tagged in ClearML clearly -> this means you have a task in the UI ready for use after this step?
I suppose so, yes; and I want this task to be labeled such that it’s clear it’s the “production” task.
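One possible way to keep that label unambiguous is to move the tag on every release, so only one task carries it at a time. A minimal sketch, assuming a tag called "production" (hypothetical) and task objects that expose `get_tags()`, `set_tags()` and `add_tags()`. `promote_to_production` is a made-up helper name, not a ClearML function:

```python
from typing import Iterable


def promote_to_production(new_task, previous_tasks: Iterable, tag: str = "production") -> None:
    """Move `tag` from previously promoted tasks to `new_task`."""
    for old in previous_tasks:
        # keep every other tag on the old task, drop only the production marker
        remaining = [t for t in old.get_tags() if t != tag]
        old.set_tags(remaining)
    new_task.add_tags([tag])
```

CI would call this after the test run passes, with `previous_tasks` being whatever tasks currently carry the tag.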