Hi @<1654294828365647872:profile|GorgeousShrimp11> , long story short - you can.
To go into a bit more detail: you can trigger entire pipeline runs via the API.
I can think of two options off the top of my head. The first is some sort of "service" task that runs constantly, listens for something, and then triggers pipeline runs.
The second is some external source sending a POST request via the API to trigger a pipeline.
What do you think?
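For the second option, here is a rough sketch of what such a POST request could look like, using only the Python standard library. It assumes the server exposes a `tasks.enqueue` endpoint that takes a task ID and a queue ID and accepts a Bearer token; the server URL, token, and IDs are placeholders, and the helper names are made up for illustration. Check the REST API reference of your ClearML server before relying on it.

```python
import json
import urllib.request


def build_enqueue_request(api_server, token, task_id, queue_id):
    """Build (but do not send) a POST request that enqueues a task.

    Assumption: the API server exposes `tasks.enqueue` and accepts a
    Bearer token in the Authorization header.
    """
    url = api_server.rstrip("/") + "/tasks.enqueue"
    body = json.dumps({"task": task_id, "queue": queue_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

An external system would call `send(build_enqueue_request(...))` with the ID of a draft (for example, cloned) pipeline task and the ID of the target queue.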
why is pushing into the services queue required ...
The services queue is usually connected to an agent running in "services mode", which means the agent executes multiple Tasks in parallel. A regular agent launches only one Task at a time; the assumption is that "service" Tasks are usually not heavy on CPU/RAM, so running multiple instances makes sense.
Thanks @<1523701070390366208:profile|CostlyOstrich36> , that does sound like an option. Can you point me to the documentation for this API call as I haven't been able to find anything?
Hi @<1523701205467926528:profile|AgitatedDove14> , thanks for your reply. We want to use the Scheduler but to run a pipeline. Looking at this example here, it looks like it only works with tasks: None
So, in my code example above, where I have executing_pipeline as the pipeline function created with the decorator, can this be scheduled to run with the TaskScheduler, i.e. used as the function in this line? None At the moment, we can't get this to work or figure out how to use the Scheduler with pipelines.
Looking at this example here, it looks like it only works with tasks:
Aha! A Pipeline is a Task 🙂 (a specific type of Task, but a Task nonetheless)
Just use the pipeline ID, and make sure you push it into the services queue, voila
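As a minimal sketch of that advice, the pipeline's task ID can be passed to `TaskScheduler.add_task` as `schedule_task_id`, with the services queue as the target. The helper names, the placeholder ID, and the timing values are assumptions for illustration; as far as I read the `TaskScheduler` docstring, `minute=30, hour=22, day=1` means "every day at 22:30", but verify that against your installed version.

```python
def pipeline_schedule_kwargs(pipeline_task_id, queue="services"):
    """Keyword arguments for TaskScheduler.add_task (illustrative values)."""
    return {
        "schedule_task_id": pipeline_task_id,  # ID of an existing pipeline run
        "queue": queue,                        # queue served by a services-mode agent
        "minute": 30,
        "hour": 22,
        "day": 1,
        "recurring": True,
    }


def start_scheduler(pipeline_task_id):
    """Register the pipeline with a scheduler and start it (blocks)."""
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(**pipeline_schedule_kwargs(pipeline_task_id))
    scheduler.start()
```

Calling `start_scheduler("<pipeline-task-id>")` keeps the scheduler running in the current process; the scheduler itself can also be run as a service.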
@<1523701205467926528:profile|AgitatedDove14> Hi! Could you give any feedback on the above? We are trying to figure out if/how we can run pipelines on a schedule and also trigger them with an external event.
Following up, I found this in the code on GitHub: None
A taskid is required though. To get this ID, would we run the pipeline manually so that it shows up in the Web UI, and then use the ID of the task, which we can get from the UI by clicking on the pipeline run info?
Hi @<1654294828365647872:profile|GorgeousShrimp11>
can you run a pipeline on a schedule or are schedules only for Tasks?
I think one tiny detail got lost here: Pipelines (the logic driving them) are a type of Task, which means you can clone and enqueue them like other Tasks (Task.enqueue / Task.clone).
Other than that, it looks good to me. Did I miss anything?
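A small sketch of the clone-and-enqueue approach, assuming you already have the task ID of a previous pipeline run (for example, copied from the web UI). `trigger_pipeline` and its `task_cls` parameter are hypothetical names; `task_cls` exists only so the function can be exercised without a ClearML server and defaults to `clearml.Task`.

```python
def trigger_pipeline(pipeline_task_id, queue_name="services", task_cls=None):
    """Clone an existing pipeline run and enqueue the clone.

    Returns the ID of the newly created (cloned) pipeline task.
    """
    if task_cls is None:
        # Imported lazily; by default use the real ClearML Task class.
        from clearml import Task as task_cls

    template = task_cls.get_task(task_id=pipeline_task_id)
    run = task_cls.clone(source_task=template, name="triggered " + template.name)
    task_cls.enqueue(run, queue_name=queue_name)
    return run.id
```

A "service" task or any external script could call `trigger_pipeline("<pipeline-task-id>")` whenever the event it is listening for occurs.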
Hi @<1523701070390366208:profile|CostlyOstrich36> , another quick question: can you run a pipeline on a schedule or are schedules only for Tasks? We are battling to figure out how to automate the pipelines.
@<1523701070390366208:profile|CostlyOstrich36> , a quick follow-up: I've been looking at the ClearML API documentation to see how to trigger a pipeline via the API. Do you use queues and add_task, as specified here: None ?
Here is an example of the pipeline code, simplified:
"""Forecasting Pipeline"""
from clearml.automation.controller import PipelineDecorator
from clearml import TaskTypes


@PipelineDecorator.component(cache=True, task_type=TaskTypes.data_processing)
def project_pipeline(config_path: str):
    """
    Pipeline steps

    Args:
        config_path (str): Path to config file
    """
    from clearml_pipeline.modeling_utils import generate_predictions
    from loguru import logger

    try:
        results = generate_predictions(config_path)
    except Exception as e:
        logger.error(f"{e}")


@PipelineDecorator.pipeline(
    name="pipeline", project="project_name", version="0.0.1"
)
def executing_pipeline(config_path: str):
    """Decorator for executing the pipeline"""
    project_pipeline(config_path)


if __name__ == "__main__":
    PipelineDecorator.run_locally()
    executing_pipeline("clearml_pipeline/config/ml_config.yaml")
Hi @<1523701205467926528:profile|AgitatedDove14> , I'm still having issues with this setup. See my latest comment here: None
I created a new queue megan-testing and have an agent running on my machine that I assigned to it. It works when I just use a simple task and schedule it, but when I try to run the pipeline, it says it can't find the queue.
I am using the pipeline ID from the last time I ran the pipeline, which I got through the ClearML UI.
Oh, the pipeline logic itself holds one "job" on the worker, which is why you do not have any spare workers to run the components of the pipeline.
Run your worker with --services-mode; it will launch multiple Tasks at the same time, which should solve the issue.
Just use the pipeline ID, and make sure you push it into the services queue, voila
@<1523701205467926528:profile|AgitatedDove14> A somewhat related question: why is pushing into the services queue required, as opposed to just pushing it into other queues? I have had cases where a triggered pipeline would not show up under the Pipelines tab in the web UI; it only showed up in Projects. Wondering if the queue matters for this.