Like, if you google "dagster and clearml" or "prefect and clearml" or "airflow and clearml" -- I don't find any blogs written by people talking about how they use both of them together.
Oh yeah, I see your point. I think the main reason is that a lot of the dag capabilities and orchestration are already folded into clearml's capabilities (i.e. pipelines + clearml-agent, etc.)
That said, I'm pretty sure I have seen people simply add Task.init to each step of the frameworks above, in order to track each individual execution with a higher degree of visibility (e.g. resource monitoring, artifacts, scalars, etc.).
One thing you could do to also get some "dag" visibility inside clearml, even when running dags with metaflow, is to use the "parent" property of the Task: point it back to the parent Task in the dag, so that from each step you can trace back to the steps that created it.
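A rough sketch of the "parent" idea, not a tested integration. It assumes each dag step runs in its own process (so one `Task.init` per step is fine) and that the orchestrator can hand the upstream step's Task id to the next step. `step_task_name` and `start_step_task` are hypothetical helper names; `Task.init` and `Task.set_parent` are the ClearML calls I would expect to use here.

```python
from typing import Optional


def step_task_name(flow_name: str, step_name: str) -> str:
    """Build a readable task name such as 'TrainFlow/train' (naming is our own choice)."""
    return f"{flow_name}/{step_name}"


def start_step_task(project: str, flow_name: str, step_name: str,
                    parent_task_id: Optional[str] = None):
    """Create a ClearML Task for one dag step and link it to the upstream step's Task."""
    # Imported lazily so this module still imports where clearml is not installed.
    from clearml import Task

    task = Task.init(
        project_name=project,
        task_name=step_task_name(flow_name, step_name),
        reuse_last_task_id=False,  # always create a fresh Task for each step run
    )
    if parent_task_id:
        # Record the upstream step's Task as this Task's parent.
        task.set_parent(parent_task_id)
    return task
```

Inside a step you would call `start_step_task(...)`, keep `task.id` somewhere the next step can read it (in Metaflow, an attribute on `self` would be the natural place), and call `task.close()` when the step finishes.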
So my point was: if ClearML can work well with Metaflow, it should be able to work well with any of the others, which I think would be great.
Correct, for example Sagemaker, as a very different example of dag/orchestration.
And it also makes me wonder: why?? Why is it that seemingly nobody is using ClearML together with another DAG tool? Does it not make sense for some reason? Or is it that no one has explored it?
See my point above: pipelines/dags are already included, and they support logic, not just a dag, which allows for great flexibility.
We've got some pressure internally to come up with something. The default is MLflow.
I think it's just missing some of the capabilities of ClearML, but it is definitely a valid solution. If large scale is never a target, then for sure, as long as it is easier and you do not mind having several solutions to manage.
you mean as experiment management / model registry / data? I think this is the bread&butter of clearml
💯. I was wondering if anyone had had experience using ClearML together with one of these others.
I think most of them are alternatives to metaflow
Totally.
Like, if you google "dagster and clearml" or "prefect and clearml" or "airflow and clearml" -- I don't find any blogs written by people talking about how they use both of them together.
That's strange to me, because if you search for "mlflow and ___" you'll nearly always find something.
So my point was: if ClearML can work well with Metaflow, it should be able to work well with any of the others, which I think would be great.
And it also makes me wonder: why?? Why is it that seemingly nobody is using ClearML together with another DAG tool? Does it not make sense for some reason? Or is it that no one has explored it?
Trying to figure that out. It's on my list to try it myself, but may not get to it for a while. We've got some pressure internally to come up with something. The default is MLflow.
Has anyone used ClearML for this use case?
you mean as experiment management / model registry / data? I think this is the bread&butter of clearml 🙂
Regarding the other options in the list, I think most of them are alternatives to metaflow, not covering the parts you mentioned, no?
Thanks for this!! I may try it and if I do and it works I’ll look into writing a plugin for ZenML and Metaflow that auto initializes the parent task and registers the steps as child tasks. Super helpful thank you!
That's a very neat solution! Maybe there's a way to inject "Task.init" into the code through a plugin, or worst case push it into some internal base package, and only call it when the code is orchestrated automatically (usually there is an environment variable that is set to signal that, like CI_something).
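Something along these lines could live in that internal base package. This is a minimal sketch under assumptions: the variable name `CI_ORCHESTRATED` is made up (use whatever your orchestrator actually sets), and `is_orchestrated` / `maybe_init_task` are hypothetical helpers, not part of ClearML.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; replace with the one your orchestrator really sets.
ORCHESTRATION_ENV_VAR = "CI_ORCHESTRATED"


def is_orchestrated(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the orchestration flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(ORCHESTRATION_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def maybe_init_task(project: str, name: str,
                    env: Optional[Mapping[str, str]] = None):
    """Call Task.init only under orchestration; return None for local runs."""
    if not is_orchestrated(env):
        return None
    # Imported lazily so local runs do not need clearml installed or configured.
    from clearml import Task

    return Task.init(project_name=project, task_name=name)
```

Local runs then stay untouched, and orchestrated runs get tracked without anyone editing the step code by hand.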