Unanswered
Can You Help Me Make The Case For ClearML Pipelines/Tasks Vs Metaflow? Context Within...


Oh, this is thought-provoking. Yeah, the idea of using ClearML for R&D is super appealing (to me, speaking as an MLOps engineer 😆). And having the power of Metaflow's scheduler (on Step Functions with EventBridge, since we'd do the AWS-native deployment) also makes sense to me.

I'll keep asking questions about how we could do event-based jobs with alerting built in on ClearML in a different thread later on.


I pasted your points (anonymously) into the Metaflow Slack so they could speak to any updates that have happened in their product. If you care to read it, this is about as accurate a view as you can get of what Metaflow is today, since these replies were written by a Metaflow founder and core contributor 😄


Person 1:

Point by point:

  • not true: you can specify the image you want for each step

  • accurate

  • not sure what that means

  • there are cards and UI and integrations with other tools like Comet. So probably more limited than some and less limited than others 🙂

  • I'll let the OB folks comment on this but yes, I think kube support is probably the most fleshed out (pure AWS is also pretty good since that is where it started 🙂)

  • correct, and it's actually a feature. We did discuss this quite a bit, and it is really hard to guarantee side-effect-free execution in Python

  • I'll let OB comment on this.

Person 2:

  • re: caching - resume does what most systems mean by caching but like Romain mentioned, we don't make it overly magical as a feature

  • re: Kubernetes - @batch and step-functions are still great options which don't require K8s. I'd agree that the deployment is not trivial in the literal sense of the word 🙂 The Terraform templates make it quite easy though

  • re: role-based access control - see the Outerbounds Platform, which provides a layer of security and auth features required by enterprises

  • "R&D to production acceleration" is what Metaflow has been about since the very beginning.

It is true though that there are plenty of tools targeting data scientists which provide a nice GUI that makes it easier to get started with a few clicks - DataRobot is a great example!

While tools like these seem appealing at first sight, they often have a hard time supporting real-world production use cases with constantly changing data, involved business logic, larger scale, and multiple people working together.

Real-world ML systems shouldn't be islands. They must work well with the surrounding infrastructure and policies. Metaflow is serious about providing a solution that balances requirements on both the engineering and the data science side, so data scientists can develop systems that engineers can happily approve, which might contribute to the impression that "Metaflow is designed with more 'devops' in mind".

tl;dr Metaflow is designed with both devops and data scientists in mind!

  
  
Posted 11 months ago
97 Views
0 Answers