Hi, I Am Trying To Understand Clearml-Data And Only Found This Piece Of Article Explaining It.


Hi SubstantialElk6

but in terms of data provenance, it's not clear how I can associate the data versions with the processes that created them.

I think DeliciousBluewhale87's approach is what we are aiming for, but with code.
Using clearml-data from the CLI is basically storing/versioning files (with differential-based storage etc., but still).
What you are after (I think) is to use the programmatic Dataset class in your preprocessing code and create the Dataset from code. This gives you both the storage and versioning capabilities, and also couples the data with the preprocessing code for provenance and automation.
The base assumption is that a Dataset is always a Task (with artifacts and a fancy interface), but a Task nonetheless. This gives you all the capabilities of a Task, such as adding metrics/stats on the data, automation with pipelines, etc., but also the ability to later retrieve the data with a simple CLI command or from code.
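
To make that concrete, here is a rough sketch of what creating the Dataset from preprocessing code could look like. It assumes the clearml package is installed and configured against a server; the preprocess step, the directory layout, and the project/dataset names passed in are made up for illustration, so treat it as a starting point rather than the recommended way.

```python
from pathlib import Path


def preprocess(raw_dir: Path, out_dir: Path) -> list[Path]:
    """Hypothetical preprocessing step: strip and lowercase every .txt file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(raw_dir.glob("*.txt")):
        dst = out_dir / src.name
        dst.write_text(src.read_text().strip().lower() + "\n")
        written.append(dst)
    return written


def register_dataset(out_dir: Path, project: str, name: str) -> str:
    """Version the preprocessed files as a ClearML Dataset and return its ID."""
    # Imported here so preprocess() can be run without clearml installed.
    from clearml import Dataset

    # Project and dataset names are placeholders supplied by the caller.
    dataset = Dataset.create(dataset_project=project, dataset_name=name)
    dataset.add_files(path=str(out_dir))
    dataset.upload()    # push the files to the configured storage
    dataset.finalize()  # close this version so it can be consumed later
    return dataset.id
```

Calling preprocess(...) and then register_dataset(...) from the same script keeps the data version next to the code that produced it. Later on, something like Dataset.get(dataset_project=..., dataset_name=...).get_local_copy() should fetch the files back.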
wdyt?

  
  
Posted 3 years ago