Answered

Hey, is there a shortcut in the Dataset SDK to directly get the latest version of a dataset?

  				
Posted 2 years ago by FierceHamster54


Answers 8

Hi FierceHamster54
Sure, just do:
`dataset = Dataset.get(dataset_project="project", dataset_name="name")`
By default, this fetches the latest version.

  				
Posted 2 years ago by AgitatedDove14

Hi AgitatedDove14, will `Dataset.get` also take all the child datasets?

  				
Posted 2 years ago by QuaintJellyfish58

And by extension, is there a way to upsert a dataset, automatically creating an entry with an incremented version, or creating it if it does not exist? Or am I forced to do a get, check if the latest version is finalized, then increment the version of that version and create my new version?

  				
Posted 2 years ago by FierceHamster54

Or am I forced to do a get, check if the latest version is finalized,

A dataset must be finalized before it can be used. The only situation where it is not is when you are still in the "upload" state.

, then increment the version of that version and create my new version?

I'm assuming there is a data processing pipeline pushing new data? How do you know you have new data to push?

  				
Posted 2 years ago by AgitatedDove14

AgitatedDove14 I have annotation logs from the end-user that I fetch periodically. I process them and want to add them as a new version of my dataset, where each version corresponds to the data collected during a precise time window. Currently I'm doing it by fetching the latest dataset, incrementing the version, and creating a new dataset version.

  				
Posted 2 years ago by FierceHamster54

currently I'm doing it by fetching the latest dataset, incrementing the version and creating a new dataset version

This seems like a very good approach. How would you improve it?

  				
Posted 2 years ago by AgitatedDove14

I would like, instead of having to:
- Fetch the latest dataset to get the current latest version
- Increment the version number
- Create and upload a new version of the dataset

To be able to:
- Select a dataset project by name
- Create a new version of the dataset by choosing which SemVer increment (major/minor/patch) to apply to the version number, and upload

  				
Posted 2 years ago by FierceHamster54

Create a new version of the dataset by choosing what increment in SEMVER standard I would like to add for this version number (major/minor/patch) and upload

Oh, this is already there:
```python
from clearml import Dataset

cur_ds = Dataset.get(dataset_project="project", dataset_name="name")

# If version is not given, it will auto-increase based on semantic versioning,
# incrementing the last number: 1.2.3 -> 1.2.4
new_ds = Dataset.create(dataset_project="project", dataset_name="name", parents=[cur_ds.id])
```
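The auto-increment described above only bumps the last number. If a major or minor bump is wanted, one option is to compute the next version string yourself and pass it explicitly. The helper below is a hypothetical sketch, not part of the ClearML SDK, and it assumes plain `MAJOR.MINOR.PATCH` version strings with no pre-release suffix:

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Return a MAJOR.MINOR.PATCH string with the chosen part incremented.

    Hypothetical helper for illustration; assumes exactly three
    dot-separated integers (e.g. "1.2.3").
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # A major bump resets minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # A minor bump resets patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

The result could then be passed through the `dataset_version` argument, e.g. `Dataset.create(dataset_project="project", dataset_name="name", dataset_version=bump_version("1.2.3", "minor"), parents=[cur_ds.id])`. It is worth checking that the installed ClearML release supports `dataset_version`.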

  				
Posted 2 years ago by AgitatedDove14
