Dang! @<1590514584836378624:profile|AmiableSeaturtle81> awesome answer, thank you! You seem like an awesome person to know. Definitely connect if you'd like to talk ops stuff sometime.
@<1541954607595393024:profile|BattyCrocodile47> Thanks a lot for the explanation! These inputs help us a lot in building our tools and, eventually, building users' trust in them 🙂 Let us know which orchestrator you ended up with and how it's going!
I'm also curious about using external orchestrators as opposed to ClearML's built-in ones
Hey @<1523701482157772800:profile|AnxiousSeal95> ! I think ClearML's orchestrator is a great fit for ad-hoc experimentation, but not for (event-triggered) batch inference jobs that need to be relied on in production.
I'd only feel comfortable supporting pipelines that serve end users on a tool that is known for that, e.g. Metaflow, Dagster, or Airflow, mainly because those tools emphasize good monitoring and integration with the wider data ecosystem.
@<1590514584836378624:profile|AmiableSeaturtle81> Cool to see the community building such things! 🙂 If this works out for you, we'll be happy if you share your process!
A question for both you and @<1541954607595393024:profile|BattyCrocodile47>: what compels you to use a different orchestrator? Is anything missing from the ClearML orchestration layer?
I have tried:
Airflow - A pain to set up, dated UI, and other problems
Prefect - I literally just tried to set up a simple distributed system and it took me a week. I do not recommend this tool at all: horrible documentation, and no one helps on Slack.
Dagster - An absolute beauty: nice UI, easy to set up (as a pip package or just Docker + Postgres). I highly recommend this tool, though it takes a bit to get used to. In the coming week I'll try the Dagster + ClearML combo, where I periodically check some things and, if certain criteria are met, spawn ClearML jobs that get put into a ClearML queue and executed (roughly as sketched below).
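Roughly what I have in mind (a minimal sketch of my planned setup, not an official pattern; the project, task, and queue names are placeholders):
```python
# Sketch: a Dagster job that clones a pre-made ClearML "template" task and pushes
# the clone into a ClearML queue, where a clearml-agent picks it up and runs it.
# "my-project", "batch-inference-template" and "gpu-queue" are placeholder names.
from dagster import job, op
from clearml import Task

@op
def check_criteria() -> bool:
    # Hypothetical check, e.g. "enough new data has arrived since the last run"
    return True

@op
def enqueue_clearml_job(should_run: bool) -> None:
    if not should_run:
        return
    template = Task.get_task(project_name="my-project", task_name="batch-inference-template")
    cloned = Task.clone(source_task=template, name="batch-inference-run")
    # An agent listening on this queue executes the cloned task
    Task.enqueue(cloned, queue_name="gpu-queue")

@job
def periodic_clearml_trigger():
    enqueue_clearml_job(check_criteria())
```
A Dagster schedule or sensor would then kick off `periodic_clearml_trigger` on whatever cadence or event trigger I want, while the heavy lifting stays on the ClearML agents.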
I've also used Airflow and Dagster in prod, but not integrated them with an exp tracker.
@<1590514584836378624:profile|AmiableSeaturtle81> yeah I can see what you mean. So you reuploaded everything from the ClearML file server into S3 and just changed the links?
@<1523701482157772800:profile|AnxiousSeal95> I see a lot of people here migrating data from one data source to another.
For us, it was that we experimented with ClearML to get a feel for it, and we used ClearML's built-in file storage to save debug images and all other artifacts.
Then we grew rapidly and we had to migrate to S3 storage.
I had to write a script that goes through Elasticsearch and MongoDB to point to the new S3 links where the data was migrated (something like the sketch below).
I do understand, however, that migration in itself is not easy and there isn't a magical button to solve this issue. Still, an exposed API that could change the artifact file path prefix might be useful.
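For reference, the rewrite boiled down to something like this (a rough sketch assuming the clearml-server schema we had: database `backend`, collection `task`, debug images in `events-training_debug_image-*` indices; names, prefixes, and field paths may differ on your deployment, so back everything up first):
```python
# Rough sketch of the link-rewrite: swap the old fileserver prefix for the new S3
# prefix in MongoDB (task artifacts) and Elasticsearch (debug image events).
# All hostnames, prefixes, and schema names below are placeholders from our setup.
import re
from pymongo import MongoClient
from elasticsearch import Elasticsearch

OLD_PREFIX = "https://files.clearml.example.com"  # built-in fileserver (placeholder)
NEW_PREFIX = "s3://my-bucket/clearml"             # migrated S3 location (placeholder)

# 1) MongoDB: rewrite artifact URIs stored on tasks
mongo = MongoClient("mongodb://localhost:27017")
tasks = mongo["backend"]["task"]
for task in tasks.find({"execution.artifacts.uri": {"$regex": "^" + re.escape(OLD_PREFIX)}}):
    artifacts = task["execution"]["artifacts"]
    for art in artifacts:
        if art.get("uri", "").startswith(OLD_PREFIX):
            art["uri"] = NEW_PREFIX + art["uri"][len(OLD_PREFIX):]
    tasks.update_one({"_id": task["_id"]}, {"$set": {"execution.artifacts": artifacts}})

# 2) Elasticsearch: rewrite debug-image URLs in the events indices
es = Elasticsearch("http://localhost:9200")
es.update_by_query(
    index="events-training_debug_image-*",
    query={"prefix": {"url": OLD_PREFIX}},
    script={
        "source": "ctx._source.url = ctx._source.url.replace(params.old, params.new)",
        "params": {"old": OLD_PREFIX, "new": NEW_PREFIX},
    },
)
```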