Hello! When I squash multiple datasets, e.g. Dataset.squash(dataset_name="new_ds", dataset_ids=[id1, id2, id3]), as far as I can see the newly created dataset does not track which datasets were squashed. Do I have to add that information manually, e.g. via Dataset.set_description, or is there another way to track it? I think this would be important for data lineage. Thanks!

Another question regarding squashing: sometimes I get the following error: FileNotFoundError: [Errno 2] No such file or directory: '/home/vscode/.clearml/cache/storage_manager/datasets/ds_4f3436f7b3ef484f8148a9c25a444ee5/file.ann'. Why is there an attempt to access the file locally?

Hi SmallGiraffe94! Dataset.squash doesn't set the ids you specify in dataset_ids as parents of the new dataset. Also, note that the current behaviour of squash is to pull the files of all the datasets into a temp folder and re-upload them, which is why files are accessed locally. How about creating a new dataset with id1, id2, id3 as parents instead, i.e. Dataset.create(..., parent_datasets=[id1, id2, id3])? Would this fit your use case?
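For reference, here is a minimal sketch of that suggestion. The helper name create_merged_dataset and its dataset_cls parameter are made up for illustration (the latter only lets the function be tried out without a ClearML server); the only ClearML calls it relies on are Dataset.create(..., parent_datasets=...) and Dataset.finalize(auto_upload=True). Treat it as a starting point, not as the canonical way to do it.

```python
from typing import Any, List, Optional


def create_merged_dataset(
    dataset_name: str,
    dataset_project: str,
    parent_ids: List[str],
    dataset_cls: Optional[Any] = None,  # hypothetical hook, defaults to clearml.Dataset
) -> Any:
    """Create a dataset version that inherits from several parent datasets."""
    if dataset_cls is None:
        # Imported lazily so the helper can be exercised with a stand-in class.
        from clearml import Dataset as dataset_cls

    # The parent ids are recorded on the new dataset, so the lineage is kept.
    merged = dataset_cls.create(
        dataset_name=dataset_name,
        dataset_project=dataset_project,
        parent_datasets=list(parent_ids),
    )
    # No new files are added here; auto_upload=True uploads anything still
    # pending before the version is closed.
    merged.finalize(auto_upload=True)
    return merged
```

Calling create_merged_dataset("new_ds", "my_project", [id1, id2, id3]) (with "my_project" as a placeholder project name) should give a dataset whose parents are the three ids, so nothing has to be written into the description by hand.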

Ah, I wasn’t aware this is possible! Yes, perfect, thanks a lot!

