Pruning old ancestors sounds like the right move for now.
Hey SuccessfulKoala55, this is a fairly small dataset with a linear hierarchy of ~300 versions and a total size of ~2 GB.
In the meantime, is there some way to set a retention policy for the dataset versions?
Hi FierceHamster54! Looks like we pull all the ancestors of a dataset when we finalize. I think this can be optimized. We will keep you posted when we make some improvements.
Or do I have to add a pipeline step to prune ancestors that are too old?
Hi FierceHamster54, how big is the version hierarchy? Can you provide some details on the structure? Also, how many files are in the dataset and what are their sizes?
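For reference, the selection part of the pruning step discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `select_versions_to_prune`, its parameters, and the `(version_id, created_at)` tuple layout are all hypothetical and not part of any library. It only decides which versions are old enough to prune; the actual deletion call is deliberately left out, and in a linear hierarchy you would want to check that newer versions no longer depend on files stored in the older ones before removing anything.

```python
from datetime import datetime, timedelta, timezone


def select_versions_to_prune(versions, max_age_days=30, keep_last=10, now=None):
    """Return the IDs of dataset versions that are candidates for pruning.

    `versions` is a list of (version_id, created_at) tuples, where
    `created_at` is a timezone-aware datetime. All names here are
    hypothetical and only illustrate a simple retention policy.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)

    # Newest first, so the first `keep_last` entries are always retained.
    ordered = sorted(versions, key=lambda v: v[1], reverse=True)
    candidates = ordered[keep_last:]

    # Among the remaining versions, prune only those older than the cutoff.
    return [version_id for version_id, created_at in candidates if created_at < cutoff]
```

A pipeline step would call this with the list of versions for the dataset and then handle the returned IDs however is safe for the hierarchy (for example, deleting or archiving them).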