We are planning to use a data versioning system, because we currently have a lot of folders with different names that contain essentially the same data, with only small changes. The most prominent candidates are ClearML Data and DVC. Could you tell me what the differences are and why we should use ClearML Data?

Posted 2 years ago

Answers 2

GreasyPenguin14 Hi!
I wish I could help, but I'm afraid I'll need to ask AnxiousSeal95 for some help with that. Please hold tight until he is able to help out 🙂

Posted 2 years ago

Hi GreasyPenguin14

Could you tell me what the differences are and why we should use ClearML data?

The first difference is in the approach itself: DVC ties the data to the code (i.e. the git repo), whereas we (ClearML, but not just us) think data should be abstracted from the codebase and become a standalone argument, allowing users to build/execute against different datasets/versions.

- ClearML Data becomes part of the workflow, as it is visible from the UI, including the ability to create structures in projects/sub-projects, naming conventions, tags, etc. (In the upcoming versions we will be extending the UI visualization capabilities for even better visibility.)
- ClearML Data offers a full programmatic interface, allowing you to easily build automation processes directly from code.
- Triggers now support launching Tasks based on new datasets created/tagged in the system (i.e. automation is built in).
- Users can customize Datasets and add metrics/visualizations from code for increased visibility (e.g. plot the first few lines of a table, upload image samples, etc.).
I probably missed a few points, but this is probably a good start 🙂
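
To make the "data as a standalone argument" idea a bit more concrete, here is a rough sketch rather than an official recipe. It assumes the `clearml` package is installed and configured against a server. The flag names, the default project and dataset names, and the helper functions `parse_args`, `fetch_dataset` and `publish_new_version` are made up for illustration; only the `Dataset` calls come from the ClearML SDK.

```python
import argparse


def parse_args(argv=None):
    """Read the dataset selection from the command line (hypothetical flag names)."""
    parser = argparse.ArgumentParser(description="Run against a chosen dataset version")
    parser.add_argument("--dataset-project", default="examples/data")  # hypothetical project
    parser.add_argument("--dataset-name", default="my_dataset")        # hypothetical name
    parser.add_argument("--dataset-id", default=None)                  # pin an exact version
    return parser.parse_args(argv)


def fetch_dataset(project, name, dataset_id=None):
    """Return the path of a local copy of the requested dataset."""
    # Imported lazily so the argument parsing works even without clearml installed.
    from clearml import Dataset

    if dataset_id:
        dataset = Dataset.get(dataset_id=dataset_id)
    else:
        dataset = Dataset.get(dataset_project=project, dataset_name=name)
    return dataset.get_local_copy()


def publish_new_version(project, name, folder, parent_id=None):
    """Register the contents of `folder` as a new dataset version and return its id."""
    from clearml import Dataset

    dataset = Dataset.create(
        dataset_project=project,
        dataset_name=name,
        # Passing a parent lets the new version build on an existing one.
        parent_datasets=[parent_id] if parent_id else None,
    )
    dataset.add_files(path=folder)
    dataset.upload()
    dataset.finalize()
    return dataset.id
```

With something like this, the same training code can be pointed at a different dataset version just by changing `--dataset-id`, without touching the git repo, and the many near-duplicate folders could each be registered once via `publish_new_version` with the previous version as the parent.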

Posted 2 years ago