Are There Any Code Examples of How DataOps Is Being Established?

Answered

Hi, are there any code examples of how DataOps is being established?
https://clear.ml/products/clearml-dataops/

  				
Posted 4 years ago by SubstantialElk6


Answers 8

Create immutable and differentiable versions on-prem or in the cloud with our data agnostic solution.

  				
Posted 4 years ago by SubstantialElk6

Hi SubstantialElk6
Generally speaking, the idea is that actual code creates the Dataset (i.e. the Dataset class is instantiated from code). You can also add some metric reporting (such as table reporting) to create a preview of the stored data for better visibility, or compute some statistics as part of the data ingest script. This ingest code can then be relaunched or automated. The created Dataset itself can be tagged, renamed, and given key/value metadata for better cataloging. WDYT?
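To make the ingest idea above concrete, here is a minimal sketch, not an official ClearML example. The project name, dataset name, tags, and the `summarize` / `ingest` helpers are hypothetical. It assumes the `clearml` package is installed and configured against a server, and that `report_table` accepts a list of rows with the header row first.

```python
from statistics import mean


def summarize(rows):
    """Build a small per-column statistics table (header row first)."""
    table = [["column", "count", "min", "max", "mean"]]
    for col in sorted(rows[0].keys()):
        values = [float(r[col]) for r in rows]
        table.append([col, len(values), min(values), max(values), mean(values)])
    return table


def ingest(rows, csv_path, project="DataOps Examples", name="sensor-readings"):
    """Register csv_path as a new Dataset version with a statistics preview.

    Project, name, and tags are placeholders chosen for illustration.
    """
    # Imported lazily so summarize() can be used without ClearML installed.
    from clearml import Dataset

    ds = Dataset.create(
        dataset_name=name,
        dataset_project=project,
        dataset_tags=["raw", "ingest"],
    )
    ds.add_files(path=csv_path)
    # Attach the statistics table to the dataset so it shows up as a preview.
    ds.get_logger().report_table(
        title="Preview",
        series="column statistics",
        iteration=0,
        table_plot=summarize(rows),
    )
    ds.upload()
    ds.finalize()
    return ds.id
```

Because `ingest` is an ordinary function in an ordinary script, re-running the script produces a new dataset version, which is what makes the ingest step repeatable.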

  				
Posted 4 years ago by AgitatedDove14

Share data across R&D teams with searchable data catalogs available on any environment.

  				
Posted 4 years ago by SubstantialElk6

Yeah, that'll cover the first two points, but I don't see how it'll end up as a dataset catalogue as advertised.

  				
Posted 4 years ago by SubstantialElk6

Transform feature engineering and data processing code into recurring data ingestion workflows. Start building data stores, develop, automate, and schedule complex data processing jobs.

  				
Posted 4 years ago by SubstantialElk6

"dataset catalogue as advertised."

Creating the Dataset in ClearML is the catalog: you can move datasets around, put them in sub-folders, add tags, add metadata, search, etc. I think this qualifies as a dataset catalog, no?
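As a rough illustration of the "search" part, the sketch below lists the datasets in a project and filters them by tag and name. The `matches` and `search_catalog` helpers and their parameters are hypothetical. It assumes `Dataset.list_datasets` returns a list of dictionaries with `name` and `tags` keys, and it filters on the client side only to keep the logic visible.

```python
def matches(entry, required_tags=(), name_contains=""):
    """Return True if a catalog entry has all required tags and the name fragment."""
    tags = set(entry.get("tags") or [])
    name = entry.get("name", "")
    return set(required_tags) <= tags and name_contains.lower() in name.lower()


def search_catalog(project, required_tags=(), name_contains=""):
    """List finalized datasets in a project and keep the ones that match."""
    # Imported lazily so matches() can be used without ClearML installed.
    from clearml import Dataset

    entries = Dataset.list_datasets(dataset_project=project, only_completed=True)
    return [e for e in entries if matches(e, required_tags, name_contains)]
```

For example, `search_catalog("DataOps Examples", required_tags=["raw"], name_contains="sensor")` would return the matching entries, whose ids could then be passed to `Dataset.get(dataset_id=...)`.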

  				
Posted 4 years ago by AgitatedDove14

The first is probably done using pipeline controllers, the second using Datasets or HyperDatasets. It's not very clear how the last one is achieved, especially the searchable data catalogs.
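For the pipeline controller part, a minimal sketch could look like the following. The step functions, project name, and pipeline name are made up for illustration, and this is only one possible way to wire it, assuming `PipelineController.add_function_step` with the `${step.artifact}` reference syntax.

```python
def extract():
    """Hypothetical extraction step returning raw rows."""
    return [{"temp": 20.5}, {"temp": 22.0}, {"temp": None}]


def clean(rows):
    """Hypothetical processing step that drops rows with missing values."""
    return [r for r in rows if r["temp"] is not None]


def build_pipeline(project="DataOps Examples", name="ingest-pipeline"):
    """Wire the two steps into a pipeline; names are placeholders."""
    # Imported lazily so the step functions can be tested without ClearML.
    from clearml import PipelineController

    pipe = PipelineController(name=name, project=project, version="1.0.0")
    pipe.add_function_step(
        name="extract",
        function=extract,
        function_return=["rows"],
    )
    pipe.add_function_step(
        name="clean",
        function=clean,
        # Feed the output of the "extract" step into this step.
        function_kwargs={"rows": "${extract.rows}"},
        function_return=["clean_rows"],
    )
    return pipe
```

Calling `build_pipeline().start_locally(run_pipeline_steps_locally=True)` should run everything in the local process for debugging. Running it on a schedule would be a separate step that is not shown here.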

  				
Posted 4 years ago by SubstantialElk6

I see. Is there a more elaborate code example that describes the above interactions?

  				
Posted 4 years ago by SubstantialElk6
