Why does it matter how ClearML stores datasets? If you get the dataset locally, all files will be unzipped.
- It takes time to compress: eight archives of 5 GB each take half an hour.
- I can stream archives from the bucket directly over the network for training without downloading them locally, which saves storage space.
Hi @<1523702307240284160:profile|TeenyBeetle18> , if they are already on GS, you can use `add_external_files` to log them.
What do you think?
> Why does it matter how ClearML stores datasets? If you get the dataset locally, all files will be unzipped.
It seems like this doesn't let me use ClearML's ability to track and version datasets. I mean, I can't create the next version of a dataset from a dataset with external files.