Hi everyone!
I have started using the StorageManager as a utility for my training code.
Before training starts, I use it to download the training data from S3. Its built-in automatic local caching is great because it saves me from re-downloading the data for every single experiment.
I was wondering, however: suppose the cache is empty and I launch a new training session, which starts downloading the data. Then, a minute later, I launch a second training session that uses the same data.
The second script would see a "partial" file in the cache dir. Would it start overwriting it on its own? Would it wait for the first training session to finish the download? Is it safe at all to use it this way, in terms of race conditions?
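
In case this is not handled automatically, here is a minimal sketch of the kind of guard I could add on my side. To be clear, this is not how StorageManager works internally (I don't know that, hence the question). The names `exclusive_lock`, `fetch_once`, and `download_fn` are hypothetical, and `download_fn` stands in for whatever call actually performs the download. The idea is that only one process downloads, to a temporary name, while the others wait and then reuse the finished file.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path, poll_seconds=0.1, timeout_seconds=3600.0):
    """Hold a lock file created atomically with O_EXCL (hypothetical helper).

    Other processes poll until the lock file disappears or the timeout expires.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can create the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"gave up waiting for {lock_path}")
            time.sleep(poll_seconds)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def fetch_once(target_path, download_fn):
    """Return target_path, calling download_fn only if the file is missing.

    download_fn(path) is an assumed callable that writes the data to `path`.
    """
    if os.path.exists(target_path):
        return target_path
    with exclusive_lock(target_path + ".lock"):
        # Re-check after acquiring the lock: another process may have
        # finished the download while this one was waiting.
        if not os.path.exists(target_path):
            tmp_path = target_path + ".partial"
            download_fn(tmp_path)
            # Rename into place so readers never see a half-written file.
            os.replace(tmp_path, target_path)
    return target_path
```

One known weakness of this sketch: if the downloading process crashes, the `.lock` file is left behind and the other sessions wait until the timeout, so I would prefer to rely on built-in behavior if it exists.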