Answered
Hi, We Are Using Clearml For Our Experiment Tracking But Now Investigating Using The Pipeline Functionality As Well For Scheduling. We Also Want To Be Able To Trigger A Pipeline Run When There Is New Data In An External Database. Is This Possible? From Wh

Hi, we are using ClearML for our experiment tracking but are now investigating using the pipeline functionality as well for scheduling. We also want to be able to trigger a pipeline run when there is new data in an external database. Is this possible? From what I can see, the trigger works for changes in ClearML datasets, for example: [link]. Is there a way to trigger a pipeline based on some external change that we can monitor?

  
  
Posted 10 months ago

Answers 14


Hi @AgitatedDove14, thanks for your reply. We want to use the Scheduler, but to run a pipeline. Looking at this example here, it looks like it only works with tasks: [link]

So, in my code example above, where I have executing_pipeline as the pipeline function created with the decorator, can this be scheduled to run with the TaskScheduler, i.e. used as the function in this line: [link]? At the moment, we can't get this to work or figure out how to use the Scheduler with pipelines.

  
  
Posted 10 months ago

Hi @GorgeousShrimp11, long story short: you can.

Now to delve into it a bit - You can trigger entire pipeline runs via the API.

I can think of two options off the top of my head. The first is some sort of "service" task that runs constantly, listens for something, and then triggers pipeline runs.
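A minimal sketch of that first option, showing only the polling pattern. It assumes a hypothetical `fetch_marker` callable that returns something comparable from the external database (for example a max timestamp or a row count) and a hypothetical `on_new_data` callback that launches the pipeline, for instance by cloning and enqueueing the pipeline Task. None of these names come from ClearML.

```python
import time
from typing import Callable, Optional


def poll_once(
    fetch_marker: Callable[[], Optional[object]],
    last_seen: Optional[object],
    on_new_data: Callable[[], None],
) -> Optional[object]:
    """Check the external source once; fire the callback if the marker changed.

    Returns the marker to remember for the next poll.
    """
    marker = fetch_marker()
    if marker is not None and marker != last_seen:
        on_new_data()
        return marker
    return last_seen


def run_service(
    fetch_marker: Callable[[], Optional[object]],
    on_new_data: Callable[[], None],
    interval_sec: float = 300.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll forever (or max_polls times) and trigger on every change."""
    # Baseline: do not trigger for data that already exists at start-up.
    last_seen = fetch_marker()
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval_sec)
        last_seen = poll_once(fetch_marker, last_seen, on_new_data)
        polls += 1
```

Such a loop could itself run as a ClearML Task on the services queue, with `on_new_data` doing the actual pipeline launch.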

The second is some external source sending a POST request via the API to trigger a pipeline.
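A rough sketch of the second option using only the standard library. The endpoint names (`auth.login`, `tasks.clone`, `tasks.enqueue`), the payload fields, and the shape of the responses are my understanding of the ClearML REST API and should be checked against the API reference for your server version. `api_server`, the credentials, and `queue_id` are placeholders you would supply.

```python
import base64
import json
import urllib.request


def basic_auth_header(access_key: str, secret_key: str) -> str:
    """Build an HTTP Basic Authorization header value from API credentials."""
    raw = f"{access_key}:{secret_key}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def _post(api_server: str, endpoint: str, payload: dict, auth_header: str) -> dict:
    """POST a JSON payload and return the 'data' part of the JSON response."""
    request = urllib.request.Request(
        url=f"{api_server.rstrip('/')}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": auth_header},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["data"]


def trigger_pipeline_via_rest(
    api_server: str,
    access_key: str,
    secret_key: str,
    pipeline_task_id: str,
    queue_id: str,
) -> str:
    """Clone an existing pipeline Task and enqueue the copy; returns the new ID."""
    # Assumption: auth.login exchanges the key pair for a short-lived token.
    token = _post(api_server, "auth.login", {}, basic_auth_header(access_key, secret_key))["token"]
    bearer = f"Bearer {token}"
    # Assumption: tasks.clone returns the new task's ID under "id".
    new_id = _post(api_server, "tasks.clone", {"task": pipeline_task_id}, bearer)["id"]
    _post(api_server, "tasks.enqueue", {"task": new_id, "queue": queue_id}, bearer)
    return new_id
```

Any external system that can make HTTP calls could do the equivalent when it detects new data.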

What do you think?

  
  
Posted 10 months ago

Following up, I found this in the code on GitHub: [link]

A taskid is required though - to get this id would we run the pipeline manually as it then shows up in the Web UI and then just use the id of the task that we can get from the UI by clicking on the pipeline run info?

  
  
Posted 10 months ago

Hi @GorgeousShrimp11

can you run a pipeline on a schedule or are schedules only for Tasks?

I think one tiny detail got lost here: Pipelines (the logic driving them) are a type of Task, which means you can clone and enqueue them like other Tasks
(Task.enqueue / Task.clone)
Other than that, it looks good to me. Did I miss anything?
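A minimal sketch of cloning and enqueueing a pipeline by its Task ID with `Task.clone` and `Task.enqueue`. The `args_overrides` helper is hypothetical, and it assumes the pipeline function's arguments are stored under an `Args/` section on the controller Task; check the configuration tab of your pipeline run in the UI to confirm the actual parameter names.

```python
from typing import Dict, Optional


def args_overrides(**pipeline_args: object) -> Dict[str, object]:
    """Prefix pipeline arguments with the (assumed) 'Args/' section name."""
    return {f"Args/{name}": value for name, value in pipeline_args.items()}


def rerun_pipeline(
    pipeline_task_id: str,
    queue_name: str = "services",
    overrides: Optional[Dict[str, object]] = None,
) -> str:
    """Clone a previous pipeline run, optionally change arguments, and enqueue it."""
    # Imported here so the helper above can be used without a ClearML setup.
    from clearml import Task

    cloned = Task.clone(source_task=pipeline_task_id)
    for name, value in (overrides or {}).items():
        cloned.set_parameter(name, value)
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

For the pipeline in this thread, a call might look like `rerun_pipeline("<pipeline-task-id>", overrides=args_overrides(config_path="clearml_pipeline/config/ml_config.yaml"))`.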

  
  
Posted 10 months ago

@AgitatedDove14 Hi! Could you give any feedback on the above? We are trying to figure out if/how we can run pipelines on a schedule and also trigger them with an external event.

  
  
Posted 10 months ago

I am using the pipeline id of when I last ran the pipeline and got this through the UI in ClearML.

  
  
Posted 10 months ago

@CostlyOstrich36, a quick follow-up: I've been looking at the ClearML API documentation to see how to trigger a pipeline via the API. Do you use queues and add_task, as specified here: [link]?

Here is an example of the pipeline code, simplified:

"""Forecasting Pipeline"""

from clearml.automation.controller import PipelineDecorator
from clearml import TaskTypes

@PipelineDecorator.component(cache=True, task_type=TaskTypes.data_processing)
def project_pipeline(config_path: str):
    """
    Pipeline steps

    Args:
        config_path (str): Path to config file
    """

    from clearml_pipeline.modeling_utils import generate_predictions
    from loguru import logger

    try:
        results = generate_predictions(config_path)

    except Exception as e:
        logger.error(f"{e}")


@PipelineDecorator.pipeline(
    name="pipeline", project="project_name", version="0.0.1"
)
def executing_pipeline(config_path: str):
    """Decorator for executing the pipeline"""

    project_pipeline(config_path)


if __name__ == "__main__":

    PipelineDecorator.run_locally()

    executing_pipeline("clearml_pipeline/config/ml_config.yaml")
  
  
Posted 10 months ago

Just use the pipeline ID, and make sure you push it into the services queue, voila

@AgitatedDove14 A somewhat related question: why is pushing into the services queue required, as opposed to just pushing it into other queues? I have had cases where a triggered pipeline would not show up under the Pipelines tab in the web UI; it only showed up in Projects. Wondering if the queue matters for this.

  
  
Posted 10 months ago

Hi @AgitatedDove14, I'm still having issues with this setup. See my latest comment here: [link]

I created a new queue megan-testing and have an agent running on my machine that I assigned to it. It works when I just use a simple task and schedule it, but when I try to run the pipeline, it says it can't find the queue.

  
  
Posted 10 months ago

Oh, the pipeline logic itself holds one "job" on the worker, and this is why you do not have any spare workers to run the components of the pipeline.
Run your worker with --services-mode so it launches multiple Tasks at the same time; that should solve the issue.

  
  
Posted 10 months ago

Hi @CostlyOstrich36, another quick question: can you run a pipeline on a schedule or are schedules only for Tasks? We are battling to figure out how to automate the pipelines.

  
  
Posted 10 months ago

Thanks @CostlyOstrich36, that does sound like an option. Can you point me to the documentation for this API call, as I haven't been able to find anything?

  
  
Posted 10 months ago

Looking at this example here, it looks like it only works with tasks:

Aha! Pipeline is a Task 🙂 (a specific type of Task, nonetheless a Task)
Just use the pipeline ID, and make sure you push it into the services queue, voila
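A minimal sketch of what that could look like with `TaskScheduler.add_task`, passing the pipeline's Task ID as `schedule_task_id`. The `daily_at` helper is hypothetical. As I read the scheduler's docstring, `hour` on its own means "every N hours", so a fixed daily time needs `day=1` as well; please verify this against the TaskScheduler reference before relying on it.

```python
from typing import Dict


def daily_at(hour: int, minute: int = 0) -> Dict[str, int]:
    """Keyword arguments for add_task meaning 'every day at hour:minute'."""
    if not 0 <= hour <= 23 or not 0 <= minute <= 59:
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    return {"minute": minute, "hour": hour, "day": 1}


def build_scheduler(pipeline_task_id: str, queue_name: str = "services"):
    """Create a scheduler that clones and enqueues the pipeline once a day."""
    # Imported here so daily_at can be used without a ClearML setup.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(
        name="daily pipeline run",
        schedule_task_id=pipeline_task_id,  # the pipeline ID from the UI
        queue=queue_name,
        **daily_at(hour=6, minute=0),
    )
    return scheduler
```

After building it, `scheduler.start()` runs the scheduling loop locally, and `scheduler.start_remotely(queue="services")` hands it to an agent instead.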

  
  
Posted 10 months ago

why is pushing into the services queue required ...

The services queue is usually connected to an agent running in "services mode", which means this agent executes multiple Tasks in parallel, as opposed to a regular agent that only launches one Task at a time. The assumption is that "service" Tasks are usually not heavy on CPU/RAM, so running multiple instances makes sense.

  
  
Posted 10 months ago