So I Bumped Into This Comparison Shared By DagsHub. It Kinda Placed ClearML In A Rather Bad Position Compared To Everything Else In The Industry.


Hey AgitatedDove14, thanks for the feedback. I appreciate you taking the time to explain your position on all these points. I'll do my best to address each one:
Open Source License: I agree about the license and that others are using it. Note that we didn’t write that you are not open source (as opposed to, say, wandb or comet). The purpose of the table and post is to help people decide which tools to use. An SSPL license may be a significant consideration for some, so we thought it was important to point this out clearly.
Platform & Language Agnostic: I agree with your statement about TB and MLflow, and that’s why we didn’t give them a green checkmark here either. All the tools have a theoretical API, but if a tool doesn’t make it easy to use from any programming language out of the box, like DVC does, it doesn’t meet the criterion we intended.
Experiment Data Storage: I re-read this, and I think we chose the name of this criterion poorly. If we treat storage as “where is the raw experiment data saved” then all cloud solutions are local as well (they all have some folder where raw data is saved, but it’s not really parsable in a meaningful way) and the criterion kind of loses its point. What we meant is where I can view my data in a way that lets me make sense of it. We will probably rename it to something like “Simple accessible data”.
Easy-to-setup: Let me start by saying I agree with the point about KF being significantly harder to install, and we indeed gave it the worst score on this front. What we had in mind when comparing this is, for example, MLflow, where installing it comes down to pip install mlflow and running it is mlflow ui. I agree that it is really important for data scientists to work with Docker (I even wrote about it: https://dagshub.com/blog/setting-up-data-science-workspace-with-docker/ ), but it is still a more advanced process for many people. Again, the thought is “Can I guarantee someone with no DevOps experience can use this tool?”
Scalability: I agree with your entire point, and we also gave you the checkmark here. If I missed something, please let me know.
To summarize, I honestly feel like you don’t come off less favorably than anyone else mentioned in the article. I’m open to continuing the discussion, as it advances our understanding of the field, and we’re also open to being convinced otherwise. I really appreciate the feedback.

  
  
Posted 3 years ago