Let’s say I have a dataset from source A. The dataset is finalised and uploaded, and it looks like this:
train_data/data_from_source_A
Each month I receive a new batch of data, create a new dataset, and upload it. After a few months my dataset looks like this:
train_data/data_from_source_A
train_data/data_from_source_B
train_data/data_from_source_C
train_data/data_from_source_D
train_data/data_from_source_E
Each batch of data was added by creating a new dataset and adding files to it. Now I have a large dataset. I can download all of the data to a local server and start training. Let’s say I find out that the data in data_from_source_C has some issue. I want a data engineer on my team to download exactly this folder and fix the issue (it could be anything). How can I do this without downloading the whole dataset?