We Are Planning To Use A Data Versioning System, Because Now We Are Having A Lot Of Folders With Different Names Which Basically Contain The Same Data, Only With Small Changes. The Most Prominent Candidates Are ClearML Data And DVC. Could You Tell Me What


Hi GreasyPenguin14

Could you tell me what the differences are and why we should use ClearML data?

The first difference is in the approach itself: DVC ties the data to the code (i.e. the git repo), whereas we (ClearML - but not just us) think data should be abstracted from the code base and become a standalone argument, allowing users to build/execute against different datasets/versions.

- ClearML Data becomes part of the workflow, as it is visible from the UI, including the ability to create structures in projects/sub-projects, naming conventions, tags, etc. (In the upcoming versions we will be extending the UI visualization capabilities for even better visibility.)
- ClearML Data offers a full programmatic interface, allowing you to easily build automation processes directly from code.
- Triggers now support launching Tasks based on new datasets created/tagged in the system (i.e. automation is built in).
- Users can customize Datasets and add metrics / visualizations from code, for increased visibility (e.g. plot the first few lines of a table, upload image samples, etc.).
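
To make the "standalone argument" and "programmatic interface" points a bit more concrete, here is a minimal sketch, not a definitive recipe. It assumes the `clearml` package is installed and configured against a server. The project and dataset names, the command-line flags, and the helper functions `parse_args`, `fetch_dataset` and `create_child_version` are hypothetical and only there for illustration.

```python
import argparse


def parse_args(argv=None):
    # The dataset is chosen by arguments, not by the git revision of the code.
    # The default project/name values below are hypothetical placeholders.
    parser = argparse.ArgumentParser(description="Run against a chosen dataset")
    parser.add_argument("--dataset-project", default="examples/datasets")
    parser.add_argument("--dataset-name", default="my_dataset")
    parser.add_argument("--dataset-id", default=None,
                        help="Optional: pin one exact dataset version by its ID")
    return parser.parse_args(argv)


def fetch_dataset(args):
    # Imported lazily so that argument parsing works without clearml installed.
    from clearml import Dataset

    dataset = Dataset.get(
        dataset_id=args.dataset_id,
        dataset_project=args.dataset_project,
        dataset_name=args.dataset_name,
    )
    # Returns the path of a local, read-only copy of the dataset files.
    return dataset.get_local_copy()


def create_child_version(parent_id, folder, project, name):
    # Registers the contents of `folder` as a new version on top of a parent dataset.
    from clearml import Dataset

    child = Dataset.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=[parent_id],
    )
    child.add_files(path=folder)
    child.upload()
    child.finalize()
    return child.id
```

Calling `fetch_dataset(parse_args())` from a training script would let the same code be pointed at another dataset or version from the command line, without touching the repository.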
I probably missed a few points, but this is probably a good start 🙂

  				
Posted 3 years ago

					AgitatedDove14
				