Two Questions About Datasets: Question 1: Are Parallel Writes To A Dataset With The Same Version Possible? Is The Way To Go, To Have A Task, Which Creates A Dataset Object, Which In Turn Is Passed As Artifact To The Subsequent Ingestion Tasks? After The P


Hey AgitatedDove14,
Sorry, I am quite new to Slack and forgot to submit my edits to the answer.

When you say parallel, what do you mean? From multiple machines?

Yes, or (because I deployed ClearML using Helm in Kubernetes) from the same machine, but from multiple pods (tasks).

Once a dataset has been finalized, the only way to add files is to create another version that inherits from it (i.e. the finalized version becomes the parent of the new version).
If you are worried about accumulating many versions, you can squash them, just like in git.
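To make the parent/child idea concrete, here is a minimal sketch using the ClearML Python SDK. The helper names `add_child_version` and `squash_versions` and all argument values are hypothetical, and the keyword arguments are written as I understand the `Dataset` API, so check them against your installed version. The import sits inside each function only so the snippet can be loaded without a configured ClearML server.

```python
def add_child_version(parent_id, new_files_dir, name, project):
    """Create a new dataset version whose parent is a finalized dataset."""
    from clearml import Dataset  # imported lazily; needs a configured ClearML setup

    # The finalized dataset becomes the parent of the new version.
    child = Dataset.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=[parent_id],
    )
    child.add_files(path=new_files_dir)  # register only the additional files
    child.upload()                       # push the new files to storage
    child.finalize()                     # close this version for further changes
    return child.id


def squash_versions(name, dataset_ids):
    """Merge several dataset versions into a single new dataset."""
    from clearml import Dataset

    squashed = Dataset.squash(dataset_name=name, dataset_ids=dataset_ids)
    return squashed.id
```

Calling `add_child_version("<parent id>", "./new_files", "my_dataset", "my_project")` would return the ID of the new version, which can then be handed to other tasks.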

okay, great. thank you so much!

The correct way would be to pass the Dataset ID; the other tasks would then simply get it with Dataset.get.
No need to worry about re-downloading, everything is cached automatically.
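A rough sketch of the consuming side, assuming the dataset ID reaches the task as a plain string (how it is passed, for example as a task parameter, is left open here). The helper name `fetch_dataset` is hypothetical; `Dataset.get` and `get_local_copy` are the SDK calls as I understand them.

```python
def fetch_dataset(dataset_id):
    """Look up a dataset by ID and return the path of a local copy."""
    from clearml import Dataset  # imported lazily; needs a configured ClearML setup

    dataset = Dataset.get(dataset_id=dataset_id)
    # Returns a folder containing the dataset files; repeated calls on the
    # same machine are expected to reuse the cached copy.
    return dataset.get_local_copy()
```

Each downstream task would call `fetch_dataset` with the same ID instead of receiving the dataset object itself as an artifact.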

Sounds good, thanks for clarification.

  				
Posted one year ago
SaltySpider22