Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Unanswered
Could I Get Some Feedback From People With Experience Using Clearml Pipelines On The Best Way To Handle Caching? My Team Is Working On Configuring Clearml Pipelines For Our Data Processing Workflow. We Currently Have An Experimental Pipeline Configured F


It sounds like you understand the limitations correctly.

As far as I know, it'd be up to you to write your own code that computes the delta between old and new and only re-process the new entries.

The API would let you search through prior experimental results.

so you could load up the prior task, check the ids that showed up in output (maybe you save these as a separate artifact for faster load times), and only evaluate the new inputs. perhaps you copy over the old outputs to the new task for completeness.

that's how I'd approach it. use "data-creation" tasks and artifacts to roll your own logic for "caching" (skipping evaluation) within the task itself.

In the open source version, you don't get a whole lot (in my opinion) from using datasets over basic artifacts in tasks (scoped to just create a dataset). The real "power" in the datasets feature I believe come with some of the pro features.

  
  
Posted 3 months ago
53 Views
0 Answers
3 months ago
3 months ago