Since the "grand" dataset will inherit from the child versions, you wouldn't need to have data duplication.
Apparently found a solution:
dataset_zip = dataset._task.artifacts['data'].get()
This returns the path to the zip file containing all the files (the zip gets downloaded to the local machine).
After that:
import zipfile
zip_file = zipfile.ZipFile(dataset_zip, 'r')
files = zip_file.namelist()
retrieves the names of the files.
Unzip using:
import os
os.system(f'unzip {dataset_zip}')  # in this case, to your script's directory
Using the files list, one can then open them selectively.
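Putting the pieces together, a rough sketch of the whole flow (the '.csv' filter and the 'extracted' target folder are just examples, and note that _task is a private attribute, so this relies on ClearML internals):

import zipfile

dataset_zip = dataset._task.artifacts['data'].get()  # local path to the dataset's zip

with zipfile.ZipFile(dataset_zip, 'r') as zip_file:
    files = zip_file.namelist()                        # all file names in the archive
    wanted = [f for f in files if f.endswith('.csv')]  # pick only what you need
    for name in wanted:
        zip_file.extract(name, path='extracted')       # extract selected members only
        # or read a member without extracting it:
        # with zip_file.open(name) as fh:
        #     data = fh.read()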
Could you supply any reference for this "dataset containing other datasets"? I might have skipped it when reading the documentation, but I do not recall seeing this functionality.
ShallowGoldfish8, I think the best approach would be to store them as separate datasets per day and then have a "grand" dataset that includes all days, with new days being added as you go.
What do you think?
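Roughly something like this with the clearml Dataset API (just a sketch; the project name, dataset names, and folder are hypothetical, and parent_datasets is what lets the "grand" dataset reference the daily ones instead of re-uploading their files):

from clearml import Dataset

# One dataset per day, created as the data arrives
daily = Dataset.create(dataset_name='data-2022-09-01', dataset_project='my_project')
daily.add_files(path='/data/2022-09-01')  # hypothetical local folder for that day
daily.upload()
daily.finalize()

# "Grand" dataset that includes all days so far via its parents
parents = [
    Dataset.get(dataset_name='data-2022-09-01', dataset_project='my_project').id,
    # ...add each new day's dataset id here as you go
]
grand = Dataset.create(
    dataset_name='all-days',
    dataset_project='my_project',
    parent_datasets=parents,
)
grand.upload()
grand.finalize()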