I have tried:
Airflow - Pain to set up, old UI and other problems
Prefect - Literally just tried to set up a simple distributed system and it took me a week. I do not recommend this tool at all: horrible documentation, and no one helps on Slack.
Dagster - Absolute beauty, nice UI, easy to set up (as a pip package or just Docker + Postgres). I highly recommend this tool, though it takes a bit to get used to. In the coming week I will try a Dagster + ClearML combo, where I periodically check some things and, if some criteria are met, spawn ClearML jobs that are put into a ClearML queue and executed.
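A rough sketch of what that "check, then spawn" step could look like in Python. This is only an illustration under assumptions, not a tested integration: the function names, the template task ID and the queue name are all hypothetical, and the Dagster side (a schedule or sensor that calls `check_and_spawn` periodically) is left out. The ClearML calls used are `Task.get_task`, `Task.clone` and `Task.enqueue`.

```python
from typing import Callable, Optional


def enqueue_clearml_clone(template_task_id: str, queue_name: str) -> str:
    """Clone an existing ClearML task and put the clone into a queue.

    Both arguments are placeholders: the template task and the queue
    must already exist on your ClearML server.
    """
    # Imported lazily so the rest of the module works without clearml installed.
    from clearml import Task

    template = Task.get_task(task_id=template_task_id)
    cloned = Task.clone(source_task=template, name=f"{template.name} (triggered)")
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id


def check_and_spawn(
    criteria_met: Callable[[], bool],
    spawn: Callable[[], str],
) -> Optional[str]:
    """Run the periodic check; spawn a job only when the criteria hold.

    Returns the ID of the spawned task, or None when nothing was spawned.
    """
    if not criteria_met():
        return None
    return spawn()
```

An orchestrator job would then call something like `check_and_spawn(my_check, lambda: enqueue_clearml_clone("<template-task-id>", "<queue-name>"))` on each tick, where `my_check` is whatever condition you poll for.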
Dang! @<1590514584836378624:profile|AmiableSeaturtle81> awesome answer thank you! You seem like an awesome person to know. Definitely connect if you'd like to talk ops stuff sometime.
Hey @<1523701482157772800:profile|AnxiousSeal95> ! I think ClearML's orchestrator is a great fit for ad-hoc experimentation, but not for (event-triggered) batch inference jobs that need to be relied on in production.
I'd only feel comfortable supporting pipelines that serve end users on a tool that is known for that, e.g. Metaflow, Dagster, or Airflow--mainly because those tools emphasize good monitoring and integration with the wider data ecosystem.
@<1523701482157772800:profile|AnxiousSeal95> I see a lot of people here migrating data from one data source to another.
For us, we first experimented with ClearML to get a feel for it, and we used ClearML's built-in file storage to save debug images and all other artifacts.
Then we grew rapidly and we had to migrate to S3 storage.
I had to write a script that goes through Elasticsearch and MongoDB and points the links to the new S3 locations where the data was migrated.
I do, however, understand that migration in itself is not easy and there isn't a magic button to solve this issue. Still, an exposed API that could change the artifact file path prefix could be useful.
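To give an idea of the prefix rewrite itself, here is a minimal sketch, not the actual script. It only shows swapping a URL prefix inside a nested document; the field names and both prefixes are made up, and reading from and writing back to MongoDB or Elasticsearch is left out because the real schema is not shown here.

```python
from typing import Any


def rewrite_prefix(value: Any, old_prefix: str, new_prefix: str) -> Any:
    """Return a copy of `value` with `old_prefix` replaced by `new_prefix`
    in every string that starts with it. Dicts and lists are walked
    recursively; everything else is returned unchanged."""
    if isinstance(value, str):
        if value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value
    if isinstance(value, dict):
        return {k: rewrite_prefix(v, old_prefix, new_prefix) for k, v in value.items()}
    if isinstance(value, list):
        return [rewrite_prefix(v, old_prefix, new_prefix) for v in value]
    return value


# Hypothetical document shape and prefixes, for illustration only.
doc = {
    "name": "debug-run",
    "artifacts": [{"uri": "http://files.example.local:8081/proj/run1/model.pkl"}],
}
migrated = rewrite_prefix(
    doc,
    old_prefix="http://files.example.local:8081/",
    new_prefix="s3://example-bucket/clearml/",
)
```

Only strings that start with the old prefix are touched, so unrelated fields and links to other storage stay as they are.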
I'm also curious about using external orchestrators as opposed to ClearML's built-in ones
@<1541954607595393024:profile|BattyCrocodile47> Thanks a lot for the explanation! These inputs help us a lot in building our tools and, eventually, in building users' trust in them 🙂 Let us know which orchestrator you ended up with and how it's going!
@<1590514584836378624:profile|AmiableSeaturtle81> yeah I can see what you mean. So you reuploaded everything from the ClearML file server into S3 and just changed the links?
I've also used Airflow and Dagster in prod, but not integrated them with an exp tracker.
@<1590514584836378624:profile|AmiableSeaturtle81> Cool to see the community building such things! 🙂 If this works out for you, we'll be happy if you share your process!
A question both to you and @<1541954607595393024:profile|BattyCrocodile47> , what compels you to use a different orchestrator? Anything missing from the ClearML orchestration layer?