Answered
My team uses Metaflow by Outerbounds. Great DAG tool. Super robust. We run our production workloads on it and use it for experimentation, too. I'm considering adding ClearML to our stack as an exp tracker / model registry rather than going with the more

My team uses Metaflow by Outerbounds. Great DAG tool. Super robust. We run our production workloads on it and use it for experimentation, too.

I'm considering adding ClearML to our stack as an exp tracker / model registry rather than going with the more expensive WandB, Comet, Neptune.

Has anyone used ClearML for this use case? Any thoughts on strengths / weaknesses of only using the exp tracker / registry functions of ClearML?

If there's a strong case, then I think it'd make sense for users of

  • ZenML
  • Flyte
  • Kubeflow
  • Metaflow
  • and most other DAG tools

to adopt ClearML as well.
  
  
Posted 2 months ago

Answers 5


Like, if you google "dagster and clearml" or "prefect and clearml" or "airflow and clearml" -- I don't find any blogs written by people talking about how they use both of them together.

Oh yeah, I see your point. I think the main reason is that a lot of the DAG capabilities and the orchestration are already folded into ClearML's capabilities (i.e. pipelines + clearml-agent, etc.).
That said, I'm pretty sure I have seen people just add Task.init to each step of the frameworks above, in order to track each individual execution with a higher degree of visibility (e.g. resource monitoring, artifacts, scalars, etc.).
One thing you could do to also get some "DAG" visibility inside ClearML, even when running DAGs with Metaflow, is to use the "parent" property of the Task and point it back to the parent Task in the DAG. That way, from each step you could trace back to the steps that created it.
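The idea above (one Task per step, linked to its parent) could look roughly like the sketch below. This is only an illustration, not an official integration: `tracked_step` and `_clearml_task_factory` are hypothetical helper names, the project and step names are placeholders, and the `task_factory` argument exists only so the helper can be exercised without a ClearML server. The ClearML calls it relies on are `Task.init`, `Task.set_parent` and `Task.close`.

```python
from contextlib import contextmanager


def _clearml_task_factory(project_name, task_name):
    # Imported lazily so this module loads even where clearml is not installed.
    from clearml import Task
    return Task.init(project_name=project_name, task_name=task_name)


@contextmanager
def tracked_step(project_name, step_name, parent_task_id=None,
                 task_factory=_clearml_task_factory):
    """Open one ClearML Task for a single DAG step and link it to its parent.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    task = task_factory(project_name, step_name)
    if parent_task_id is not None:
        # Task.set_parent accepts a parent Task ID (or a Task object).
        task.set_parent(parent_task_id)
    try:
        yield task
    finally:
        # Close the task so a later step in the same process can open its own.
        task.close()


# Example use inside two consecutive steps (placeholder names):
#
# with tracked_step("my-project", "start") as root:
#     root_id = root.id            # pass this ID on to downstream steps
#
# with tracked_step("my-project", "train", parent_task_id=root_id) as task:
#     ...                          # step body; scalars/artifacts go to `task`
```

In a Metaflow flow, the parent Task ID would presumably be stored on `self` in the first step so that later steps can read it, since attributes assigned to `self` are carried between steps.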

So my point was: if ClearML can work well with Metaflow, it should be able to work well with any of the others, which I think would be great.

Correct. SageMaker, for example, is a very different example of DAG/orchestration.

And it also makes me wonder: why?? Why is it that seemingly nobody is using ClearML together with another DAG tool? Does it not make sense for some reason? Or is it that no one has explored it?

See my point above: pipelines/DAGs are already included, and they support logic, not just a DAG, which allows for great flexibility.

We've got some pressure internally to come up with something. The default is MLflow.

I think it's just missing some of the capabilities of ClearML, but it's definitely a valid solution. If large scale is never a target, then for sure, if it is easier and you do not mind having more solutions to manage.

  
  
Posted 2 months ago

you mean as experiment management / model registry / data? I think this is the bread&butter of clearml

💯. I was wondering if anyone had had experience using ClearML together with one of these others.

I think most of them are alternatives to metaflow

Totally.

Like, if you google "dagster and clearml" or "prefect and clearml" or "airflow and clearml" -- I don't find any blogs written by people talking about how they use both of them together.

That's strange to me, because if you search for "mlflow and ___" you'll nearly always find something.

So my point was: if ClearML can work well with Metaflow, it should be able to work well with any of the others, which I think would be great.

And it also makes me wonder: why?? Why is it that seemingly nobody is using ClearML together with another DAG tool? Does it not make sense for some reason? Or is it that no one has explored it?

Trying to figure that out. It's on my list to try it myself, but may not get to it for a while. We've got some pressure internally to come up with something. The default is MLflow.

  
  
Posted 2 months ago

Hi BattyCrocodile47

Has anyone used ClearML for this use case?

You mean as experiment management / model registry / data? I think this is the bread & butter of ClearML 🙂
Regarding the other options on the list, I think most of them are alternatives to Metaflow and don't cover the parts you mentioned, no?

  
  
Posted 2 months ago

Thanks for this!! I may try it, and if I do and it works, I’ll look into writing a plugin for ZenML and Metaflow that auto-initializes the parent task and registers the steps as child tasks. Super helpful, thank you!

  
  
Posted 2 months ago

That's a very neat solution! Maybe there's a way to inject "Task.init" into the code through a plugin, or worst case push it into some internal base package and only call it when the code is orchestrated automatically (usually there is an environment variable that is set to signal that, like CI_something).
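A minimal sketch of that "only call Task.init when orchestrated" idea, assuming the orchestrator sets some environment variable. The variable name `ORCHESTRATED_RUN` and the helpers `is_orchestrated` and `maybe_init_task` are hypothetical; substitute whatever your orchestrator or CI system actually sets.

```python
import os

# Hypothetical variable name; replace with whatever your orchestrator or CI sets.
ORCHESTRATION_FLAG = "ORCHESTRATED_RUN"


def is_orchestrated(environ=None):
    """Return True when the flag variable holds a non-empty, non-false value."""
    environ = os.environ if environ is None else environ
    value = environ.get(ORCHESTRATION_FLAG, "").strip().lower()
    return value not in ("", "0", "false", "no")


def maybe_init_task(project_name, task_name, environ=None):
    """Call Task.init only for orchestrated runs; return None otherwise."""
    if not is_orchestrated(environ):
        return None
    # Imported lazily so local runs do not need clearml installed.
    from clearml import Task
    return Task.init(project_name=project_name, task_name=task_name)
```

Putting a helper like this in a shared base package would keep local, interactive runs untracked while orchestrated runs get a Task without any change to the step code.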

  
  
Posted 2 months ago
227 Views