Hi, I Have A Few Questions Regards To


Great discussion, I agree with you both. For me, we are not using clearml-data, so I am a bit curious how a "published experiment" locks everything (including inputs? I assume someone could still just go into the S3 bucket and delete the files without ClearML noticing).

From my experience, absolute reproducibility is code + data + parameters + execution sequence. For example, random seeds or parallelism can cause different results and can be tricky to deal with sometimes. We did build an internal system to ensure reproducibility: ClearML is the experiment tracking component, and we integrate it with Kedro for pipelines + parameters + data, so everything is tracked automatically.
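The point about random seeds can be illustrated with a minimal Python sketch (the `run_experiment` function and its seed values are made up for illustration, not part of our internal system):

```python
import random

def run_experiment(seed):
    # Pin the RNG seed so repeated runs produce identical results.
    # In a real setup you would also pin numpy/torch/framework seeds.
    random.seed(seed)
    return [random.random() for _ in range(3)]

# Two runs with the same seed are bit-identical; different seeds diverge.
assert run_experiment(42) == run_experiment(42)
assert run_experiment(42) != run_experiment(7)
```

Without pinning (or with nondeterministic parallelism), re-running the same code + data + parameters can still produce different numbers, which is why we track the execution side too.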

I have been thinking of replacing the data tracking component; our solution works fine, but it is not the most efficient one. With GBs of artifacts generated in every experiment, we have an increasing need to do housekeeping regularly, so I am studying the best way to do so. "Tag" and "publish experiment" are what we are considering.
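The tag + publish housekeeping policy we are considering could be sketched roughly like this (the `deletable` function, the `keep` tag, and the plain-dict task records are all hypothetical stand-ins, not the ClearML API):

```python
def deletable(task):
    """Decide whether a task's artifacts may be cleaned up.

    Hypothetical policy: never touch published experiments or anything
    tagged 'keep'; everything else is fair game for housekeeping.
    `task` is a plain dict standing in for a tracking-server record.
    """
    if task["status"] == "published":
        return False
    if "keep" in task.get("tags", []):
        return False
    return True

tasks = [
    {"id": "a1", "status": "published", "tags": []},
    {"id": "b2", "status": "completed", "tags": ["keep"]},
    {"id": "c3", "status": "completed", "tags": []},
]
stale = [t["id"] for t in tasks if deletable(t)]
# Only c3 qualifies for cleanup under this policy.
```

The real cleanup job would fetch experiments from the tracking server and delete the corresponding artifacts from storage, but the decision logic stays this simple: published or explicitly tagged experiments are protected.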

  
  
Posted 3 years ago