Hey! Starting an MLOps Director position in 2 weeks. I'm thinking about architecture. Has anyone ever tried to use ClearML as an experiment tracker, but used a different orchestrator like Metaflow, Airflow, Prefect, etc.? I'm struggling to find guides or

Answered

Hey! Starting an MLOps Director position in 2 weeks. I'm thinking about architecture.

Has anyone ever tried to use ClearML as an experiment tracker, but used a DIFFERENT orchestrator like Metaflow, Airflow, Prefect, etc.? I'm struggling to find guides or "hot takes" online for this.

  				
Posted 5 months ago
BattyCrocodile47


Answers 9

BattyCrocodile47 Thanks a lot for the explanation! These inputs help us a lot in building our tools and, eventually, in building users' trust in them 🙂 Let us know which orchestrator you ended up with and how it's going!

  				
Posted 5 months ago
AnxiousSeal95

AmiableSeaturtle81 Cool to see the community building such things! 🙂 If this works out for you, we'll be happy if you share your process!

A question to both you and BattyCrocodile47: what compels you to use a different orchestrator? Is anything missing from the ClearML orchestration layer?

  				
Posted 5 months ago
AnxiousSeal95

I have tried:
Airflow - A pain to set up, dated UI, and other problems.

Prefect - I literally just tried to set up a simple distributed system and it took me a week. I do not recommend this tool at all: horrible documentation, and no one helps on Slack.

Dagster - An absolute beauty: nice UI, easy to set up (as a pip package or just Docker + Postgres). I highly recommend this tool, though it takes a bit of getting used to. In the coming week I will try a Dagster + ClearML combo, where I periodically check some things and, if certain criteria are met, spawn ClearML jobs that are put into a ClearML queue and executed.
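For anyone wanting to picture the "periodically check, then spawn ClearML jobs into a queue" idea, here is a minimal sketch, not the poster's actual setup. It assumes a template task already exists in ClearML and that an agent is listening on the target queue. The criterion (`should_launch`), the sample threshold, and all function names here are hypothetical; the ClearML calls used are `Task.get_task`, `Task.clone`, `update_parameters`, and `Task.enqueue`. The scheduling itself is left out: `periodic_check` is the kind of function one might call from a Dagster op, sensor, or schedule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchDecision:
    launch: bool
    reason: str


def should_launch(new_samples: int, min_new_samples: int = 1000) -> LaunchDecision:
    """Hypothetical criterion: launch once enough new samples have arrived."""
    if new_samples >= min_new_samples:
        return LaunchDecision(True, f"{new_samples} new samples >= {min_new_samples}")
    return LaunchDecision(False, f"only {new_samples} new samples")


def launch_clearml_job(template_task_id: str, queue_name: str, overrides: dict) -> str:
    """Clone a template ClearML task, override some parameters, and enqueue the clone."""
    # Imported lazily so the criterion above can be exercised without ClearML installed.
    from clearml import Task

    template = Task.get_task(task_id=template_task_id)
    cloned = Task.clone(source_task=template, name=f"{template.name} (triggered)")
    cloned.update_parameters(overrides)
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id


def periodic_check(new_samples: int, template_task_id: str, queue_name: str) -> Optional[str]:
    """One polling step: return the new task id if a job was launched, else None."""
    decision = should_launch(new_samples)
    if not decision.launch:
        return None
    # "General/new_samples" is an assumed parameter name for illustration only.
    return launch_clearml_job(
        template_task_id, queue_name, {"General/new_samples": new_samples}
    )
```

With this split, the external orchestrator only owns the "when" (the polling and the criterion), while ClearML still owns execution and tracking of the cloned task.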

  				
Posted 5 months ago
AmiableSeaturtle81

I've also used Airflow and Dagster in prod, but haven't integrated them with an experiment tracker.

  				
Posted 5 months ago
BattyCrocodile47

AmiableSeaturtle81 yeah I can see what you mean. So you reuploaded everything from the ClearML file server into S3 and just changed the links?

  				
Posted 5 months ago
AnxiousSeal95

Dang! AmiableSeaturtle81, awesome answer, thank you! You seem like an awesome person to know. Definitely connect if you'd like to talk ops stuff sometime.

  				
Posted 5 months ago
BattyCrocodile47

I'm also curious about using external orchestrators as opposed to ClearML's built-in one.

  				
Posted 5 months ago
AnxiousSeal95

AnxiousSeal95 I see a lot of people here migrating data from one data source to another.
For us, we started out experimenting with ClearML to get a feel for it, and we used ClearML's built-in file storage to save debug images and all other artifacts.

Then we grew rapidly and had to migrate to S3 storage.
I had to write a script that goes through Elasticsearch and MongoDB and points the stored links to the new S3 locations where the data was migrated.
I do understand that migration in itself is not easy and there isn't a magic button to solve this. However, an exposed API that could change the artifact file path prefix could be useful.
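To illustrate the prefix-rewrite step of such a migration, here is a rough sketch of the core idea only, not the actual script. It assumes each record has already been fetched from Elasticsearch or MongoDB as a plain Python dict. The prefixes and the field names in the example document are made up for illustration and do not reflect ClearML's real schema; reading and writing the records back is deliberately left out.

```python
from typing import Any

# Hypothetical values: substitute your own file server URL and bucket path.
OLD_PREFIX = "http://clearml-fileserver.example:8081"
NEW_PREFIX = "s3://example-bucket/clearml"


def rewrite_links(value: Any, old_prefix: str, new_prefix: str) -> Any:
    """Return a copy of `value` with every string that starts with
    `old_prefix` repointed to `new_prefix`, recursing into dicts and lists."""
    if isinstance(value, str):
        if value.startswith(old_prefix):
            return new_prefix + value[len(old_prefix):]
        return value
    if isinstance(value, dict):
        return {k: rewrite_links(v, old_prefix, new_prefix) for k, v in value.items()}
    if isinstance(value, list):
        return [rewrite_links(v, old_prefix, new_prefix) for v in value]
    return value  # numbers, booleans, None, etc. are left untouched


# Example record with invented field names.
doc = {
    "name": "debug image",
    "url": OLD_PREFIX + "/project/task/img_001.png",
    "artifacts": [{"uri": OLD_PREFIX + "/project/task/model.pkl", "size": 1234}],
}
migrated = rewrite_links(doc, OLD_PREFIX, NEW_PREFIX)
```

Because the function returns a new structure instead of mutating in place, the original record can be kept for a dry-run diff before anything is written back.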

  				
Posted 5 months ago
AmiableSeaturtle81

Hey AnxiousSeal95 ! I think ClearML's orchestrator is a great fit for ad-hoc experimentation, but not for (event-triggered) batch inference jobs that need to be relied on in production.

I'd only feel comfortable supporting pipelines that serve end users on a tool that is known for that, e.g. Metaflow, Dagster, or Airflow--mainly because those tools emphasize good monitoring and integration with the wider data ecosystem.

  				
Posted 5 months ago
BattyCrocodile47


421 Views
