Hi @<1523702000586330112:profile|FierceHamster54> , how big is the version hierarchy? Can you provide some details on the structure? Also, how many files are in the dataset and what are their sizes?
Pruning old ancestors sounds like the right move for now.
Or do I have to add a pipeline step to prune ancestors that are too old?
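For reference, a minimal sketch of the selection logic such a pruning step could use. It does not call ClearML at all: the function name `select_versions_to_prune`, the `max_age_days` and `keep_latest` parameters, and the `id`/`created` record shape are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def select_versions_to_prune(versions, max_age_days=30, keep_latest=10, now=None):
    """Return the ids of versions older than the cutoff, newest first.

    `versions` is assumed to be a list of dicts with an 'id' string and a
    timezone-aware 'created' datetime (a hypothetical shape, not a ClearML type).
    The `keep_latest` most recent versions are never selected, whatever their age.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)

    # Sort newest first so the head of the list is the part to protect.
    ordered = sorted(versions, key=lambda v: v["created"], reverse=True)
    protected = {v["id"] for v in ordered[:keep_latest]}

    return [
        v["id"]
        for v in ordered
        if v["id"] not in protected and v["created"] < cutoff
    ]
```

The returned ids would still need to be handed to whatever removal mechanism is appropriate. In a linear hierarchy each version builds on its parent, so it is worth checking in the ClearML docs how `Dataset.delete` and `Dataset.squash` treat versions that still have children before removing anything.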
Thanks a lot @<1523701435869433856:profile|SmugDolphin23> ❤
Hey @<1523701087100473344:profile|SuccessfulKoala55> this is a fairly small dataset with a linear hierarchy of ~300 versions and a size of ~2 GB
In the meantime, is there some way to set a retention policy for the dataset versions?
Hi @<1523702000586330112:profile|FierceHamster54> ! Looks like we pull all the ancestors of a dataset when we finalize. I think this can be optimized. We will keep you posted when we make some improvements