Answered
Does ClearML Have A Good Story For Offline/Batch Inference In Production? I Worked In The Airflow World For 2 Years And These Are The General Features We Used To Accomplish This. Are These Possible With ClearML?

Does ClearML have a good story for offline/batch inference in production? I worked in the Airflow world for 2 years and these are the general features we used to accomplish this. Are these possible with ClearML?

Triggering: We'd want to be able to trigger a batch inference:

  • (rarely) on a schedule
  • (often) via a trigger in an event-based system, like maybe from an AWS Lambda function

Parameters: We'd want to be able to pass parameters when we trigger the job, such as a start_date and end_date that the batch job can use to query the feature store to get the data to run inference on.

Retries/Alerts: Retry a failed job a few times. Alert if it fails every time.

Metrics/Alerts:

  • track how long each task and pipeline runs.
  • alert if a pipeline was supposed to run, but never did

Backfilling: Every day, we might run inference on a day's worth of data. But for new pipelines, we'll need to run inference jobs on all of the data that existed before we created this pipeline.
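The backfill described above amounts to enumerating one (start_date, end_date) window per historical day and triggering one job per window. A minimal sketch of that enumeration, assuming half-open daily windows (end_date is the following day) and ISO-formatted date strings; the function name daily_windows is hypothetical:

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def daily_windows(first_day: date, last_day: date) -> Iterator[Tuple[str, str]]:
    """Yield one (start_date, end_date) pair per day, from first_day to last_day inclusive.

    Assumption: each window is half-open, so end_date is the day after start_date.
    """
    day = first_day
    while day <= last_day:
        next_day = day + timedelta(days=1)
        yield day.isoformat(), next_day.isoformat()
        day = next_day


# Example: windows for a three-day backfill.
for start_date, end_date in daily_windows(date(2024, 1, 30), date(2024, 2, 1)):
    print(start_date, end_date)
```

Each pair could then be passed as the start_date and end_date parameters of a separately triggered batch job.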
  
  
Posted one year ago

Answers 2


Hi @BattyCrocodile47

Does ClearML have a good story for offline/batch inference in production?

Not sure I follow, you mean like a case study?

Triggering:

We'd want to be able to trigger a batch inference:

  • (rarely) on a schedule
  • (often) via a trigger in an event-based system, like maybe from an AWS Lambda function

(2) Yes, there is a great API for that. Check out the GitHub Actions; it is essentially the same idea (a REST API is also available).
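As a rough illustration of event-based triggering, here is a minimal sketch that clones an existing "template" task, overrides its parameters, and enqueues the clone for an agent to run. It uses the ClearML SDK calls Task.get_task, Task.clone, set_parameter and Task.enqueue. The template task ID, the queue name, the "General" parameter section, the event keys, and the Lambda handler itself are all assumptions for illustration, and ClearML credentials are assumed to be configured in the environment.

```python
from typing import Any, Dict


def build_overrides(start_date: str, end_date: str, section: str = "General") -> Dict[str, str]:
    """Map the job parameters to 'section/name' keys (the section name is an assumption)."""
    return {f"{section}/start_date": start_date, f"{section}/end_date": end_date}


def trigger_batch_inference(template_task_id: str, start_date: str, end_date: str,
                            queue_name: str = "default") -> str:
    """Clone a template task, override its date parameters, and enqueue the clone."""
    from clearml import Task  # imported lazily so build_overrides has no dependency

    template = Task.get_task(task_id=template_task_id)
    run = Task.clone(source_task=template, name=f"batch-inference {start_date}..{end_date}")
    for name, value in build_overrides(start_date, end_date).items():
        run.set_parameter(name, value)
    Task.enqueue(run, queue_name=queue_name)
    return run.id


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, str]:
    """Hypothetical AWS Lambda entry point; the event keys are assumptions."""
    task_id = trigger_batch_inference(
        event["template_task_id"], event["start_date"], event["end_date"]
    )
    return {"task_id": task_id}
```

The same clone-and-enqueue steps could be driven from any other event source, since nothing here is specific to Lambda beyond the handler signature.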

Parameters:

We'd want to be able to pass parameters when we trigger the job, such as a start_date and end_date that the batch job can use to query the feature store to get the data to run inference on.

This is also available; see the manual remote execution example.
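On the job side, a minimal sketch of how the batch script might expose start_date and end_date as overridable parameters. It relies on Task.init, task.connect (which returns the connected dictionary, with values replaced when the task is launched with overrides) and task.execute_remotely. The project name, task name, queue name, default dates, and the parse_window helper are assumptions for illustration.

```python
from datetime import date
from typing import Dict, Tuple


def parse_window(params: Dict[str, str]) -> Tuple[date, date]:
    """Parse and validate the ISO-formatted start_date and end_date parameters."""
    start = date.fromisoformat(params["start_date"])
    end = date.fromisoformat(params["end_date"])
    if end < start:
        raise ValueError("end_date must not precede start_date")
    return start, end


def main() -> None:
    from clearml import Task  # imported lazily so parse_window has no dependency

    task = Task.init(project_name="batch-inference", task_name="daily scoring")
    # Default values; when the task is cloned and launched with overrides,
    # the connected dictionary carries the overridden values instead.
    params = task.connect({"start_date": "2024-01-01", "end_date": "2024-01-02"})
    # When run locally, this stops the local process and enqueues the task;
    # when already running on an agent, execution simply continues.
    task.execute_remotely(queue_name="default")
    start, end = parse_window(params)
    print(f"Running inference for {start} to {end}")
    # ... query the feature store for [start, end] and run inference here ...
```

The script's entry point would call main(); it is left out here so the helper can be imported without contacting a ClearML server.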

Retries/Alerts:

Retry a failed job a few times. Alert if it fails every time.

Of course 🙂 I would check the Slack alert as a good reference for that.
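A minimal sketch of the retry-then-alert logic, written independently of any particular alerting channel. The run_with_retries function and its launch, wait_for_status, and alert callables are hypothetical names; clearml_wait shows one way the waiting step might be implemented by polling Task.get_status, assuming the terminal status strings listed in the code.

```python
import time
from typing import Callable


def run_with_retries(launch: Callable[[], str],
                     wait_for_status: Callable[[str], str],
                     alert: Callable[[str], None],
                     max_attempts: int = 3) -> bool:
    """Launch a job up to max_attempts times; alert only if every attempt fails."""
    last_status = "unknown"
    for _ in range(max_attempts):
        task_id = launch()                     # e.g. clone and enqueue a template task
        last_status = wait_for_status(task_id)
        if last_status == "completed":
            return True
    alert(f"batch inference failed after {max_attempts} attempts (last status: {last_status})")
    return False


def clearml_wait(task_id: str, poll_seconds: float = 30.0) -> str:
    """Poll a ClearML task until it reaches a terminal status (assumed status names)."""
    from clearml import Task  # imported lazily so run_with_retries has no dependency

    task = Task.get_task(task_id=task_id)
    while True:
        status = task.get_status()
        if status in ("completed", "failed", "stopped", "closed", "published"):
            return status
        time.sleep(poll_seconds)
```

The alert callable could post to Slack, send an email, or write to a log; keeping it as a parameter leaves that choice open.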

Metrics/Alerts:

  • track how long each task and pipeline runs.
  • alert if a pipeline was supposed to run, but never did

Sure thing. I would check the cleanup service, as it queries Tasks and can pull execution time and other metrics. Note that pipelines are ultimately also Tasks (of a certain type), so you would query a pipeline the same way you query a Task.
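A minimal sketch of both metrics: run duration, and a "was supposed to run but never did" check based on the time since the last completed run. The helpers duration_seconds and is_overdue are hypothetical names. The report function queries tasks with Task.get_tasks and assumes that each task's started and completed timestamps are available as datetimes on task.data, and that a 26-hour gap is a sensible threshold for a daily job.

```python
from datetime import datetime, timedelta
from typing import Optional


def duration_seconds(started: Optional[datetime], completed: Optional[datetime]) -> Optional[float]:
    """Return the run time in seconds, or None if either timestamp is missing."""
    if started is None or completed is None:
        return None
    return (completed - started).total_seconds()


def is_overdue(last_completed: Optional[datetime], now: datetime, max_gap: timedelta) -> bool:
    """True if there has never been a completed run, or the last one is older than max_gap."""
    return last_completed is None or now - last_completed > max_gap


def report(project_name: str, max_gap: timedelta = timedelta(hours=26)) -> None:
    from clearml import Task  # imported lazily so the helpers have no dependency

    tasks = Task.get_tasks(project_name=project_name, task_filter={"status": ["completed"]})
    for t in tasks:
        # Assumption: task.data exposes started/completed as datetime objects.
        print(t.name, duration_seconds(t.data.started, t.data.completed))
    finished = [t.data.completed for t in tasks if t.data.completed is not None]
    last = max(finished) if finished else None
    # Match the timezone awareness of the server timestamps when comparing.
    now = datetime.now(tz=last.tzinfo) if last is not None else datetime.now()
    if is_overdue(last, now, max_gap):
        print(f"ALERT: no completed run in project {project_name!r} within {max_gap}")
```

Running report on a schedule (for example from a small monitoring task) would surface a pipeline that silently stopped running.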
  
  
Posted one year ago

This is totally what I was looking for! Yeah, by "good story for offline batch" I meant, "good feature support for ..."

I bookmarked this comment. I think I'll be doing a POC trying to show this functionality within the next month.

  
  
Posted one year ago