It seems this does not let me use ClearML's ability to track and version datasets. I mean, I can't create the next version of a dataset from a dataset that contains external files.
Why does it matter how clearml stores datasets? If you get the dataset locally, all files will be unzipped.
- Compression takes time: 8 archives of 5 GB each take half an hour.
- I can stream the archives directly from the bucket over the network for training without downloading them locally, which saves storage space.
Hi @<1523702307240284160:profile|TeenyBeetle18>, if the files are already on GS, you can use add_external_files to log them.
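For reference, here is a rough sketch of what logging archives that already sit in a bucket could look like. The bucket name, prefix, file names, and both helper functions are made up for illustration; only Dataset.create, add_external_files, upload, and finalize are ClearML calls, and the exact behaviour may differ between clearml versions, so treat this as a starting point rather than a verified recipe.

```python
from typing import List


def archive_urls(bucket: str, prefix: str, names: List[str]) -> List[str]:
    """Build gs:// URLs for archives already stored in a bucket.

    The bucket/prefix layout is a hypothetical example, not something
    ClearML requires.
    """
    prefix = prefix.strip("/")
    base = f"gs://{bucket}/{prefix}" if prefix else f"gs://{bucket}"
    return [f"{base}/{name}" for name in names]


def register_external_archives(urls: List[str], project: str, name: str) -> str:
    """Create a ClearML dataset that links to the given URLs as external files.

    Assumes clearml is installed and configured with access to the bucket.
    """
    # Imported lazily so archive_urls can be used without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(dataset_project=project, dataset_name=name)
    for url in urls:
        # Registers a link to the remote object instead of copying the file.
        dataset.add_external_files(source_url=url)
    dataset.upload()    # records the dataset state; linked files stay in the bucket
    dataset.finalize()  # closes this dataset version
    return dataset.id
```

You would call it with something like register_external_archives(archive_urls("my-bucket", "archives", ["part_0.tar", "part_1.tar"]), "my_project", "my_dataset"), where all of those names are placeholders.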
What do you think?