Hello, I Have A General Question About Data Versioning Using Clearml. When Lets Say That My Parent Dataset Has 100 Files, And That I Create A Child Dataset From It By Adding An Extra 50 Files To The Original 100. Will My 100 Files Be Duplicated On My Serv

Answered

Hello, I have a general question about data versioning using ClearML.
Let's say my parent dataset has 100 files, and I create a child dataset from it by adding an extra 50 files to the original 100. Will my 100 files be duplicated on my server?

  				
Posted 2 years ago by MassiveGoldfish6


Answers 5

So if my parent dataset is 1 TB and I add a single file to create a child dataset, there will now be 2 TB of data on the server? Is the parent dataset duplicated on the server?

  				
Posted 2 years ago by MassiveGoldfish6

Hi MassiveGoldfish6, yes, every new version contains all included files.

  				
Posted 2 years ago by SuccessfulKoala55

No, it should be just the amount of files remaining.

  				
Posted 2 years ago by SuccessfulKoala55

Is this still true if the child dataset is smaller than the parent? If the parent dataset is 1 TB and I delete half the files, will I still be pushing 2 TB of data to the server?

  				
Posted 2 years ago by MassiveGoldfish6

Yes. Differential Datasets are part of the ClearML Scale and Enterprise solution 😞

  				
Posted 2 years ago by SuccessfulKoala55
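
To make the storage figures discussed in this thread concrete, here is a minimal arithmetic sketch in plain Python. It does not use the ClearML SDK and is not ClearML's implementation. The function names `full_copy_storage` and `differential_storage` are hypothetical, file sizes are in arbitrary units, and a file's size stands in for its content hash as a simplifying assumption. It compares a model where every version stores all of its files (as described in the answers above) with a differential model that stores only new or changed files.

```python
def full_copy_storage(versions):
    """Total stored size if every version keeps a copy of all its files."""
    return sum(sum(files.values()) for files in versions)


def differential_storage(versions):
    """Total stored size if each version keeps only new or changed files.

    Assumption: a file counts as unchanged when its name and size match
    the previous version (a stand-in for comparing content hashes).
    """
    total = 0
    previous = {}
    for files in versions:
        total += sum(
            size for name, size in files.items() if previous.get(name) != size
        )
        previous = files
    return total


# Scenario from the question: 100 parent files, child adds 50 more.
parent = {f"file_{i}": 1 for i in range(100)}
grown_child = dict(parent, **{f"extra_{i}": 1 for i in range(50)})

# Follow-up scenario: child keeps only half of the parent's files.
shrunk_child = {name: size for name, size in list(parent.items())[:50]}

print(full_copy_storage([parent, grown_child]))      # 250
print(differential_storage([parent, grown_child]))   # 150
print(full_copy_storage([parent, shrunk_child]))     # 150
print(differential_storage([parent, shrunk_child]))  # 100
```

Under the full-copy model the shrunk child adds only the size of the files that remain, which matches the "amount of files remaining" answer above.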


2K Views
