Hey all, I've been doing some hyperparameter tuning today using a ClearML-managed dataset. However, I kept running into the following error: "Error loading dataset: pickle data was truncated".
After a bit of digging I found that this is likely because all instances write to the same file in the cache folder, namely '/home/usr/.clearml/cache/storage_manager/datasets/ds_dataset_id'. When multiple processes run at the same time, some are still writing to this file while others are already trying to fetch it, causing the error. This seemed like a fairly common issue, so I assumed there would be a workaround, but I haven't been able to find one. Could anyone here advise me on how to deal with this?
CC @<1545216070686609408:profile|EnthusiasticCow4>
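For context, the stopgap I'm considering is to serialize the dataset fetches myself with an advisory file lock, so only one process populates the cache at a time and the rest wait until it's complete. This is just a sketch of that idea (the lock path and the `locked_fetch` helper are my own, not part of ClearML; the actual fetch would be something like `Dataset.get(...).get_local_copy()` passed in as the callable):

```python
import fcntl


def locked_fetch(lock_path, fetch):
    """Run `fetch` while holding an exclusive cross-process file lock.

    Serializes concurrent dataset fetches so one process finishes
    writing the shared cache file before the others try to read it.
    `lock_path` is a hypothetical lock file alongside the cache;
    `fetch` would be e.g. lambda: Dataset.get(dataset_id=...).get_local_copy()
    """
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return fetch()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Example: each hyperparameter-tuning process would wrap its fetch like this.
local_copy = locked_fetch("/tmp/my_dataset.lock", lambda: "/path/to/local/copy")
```

Not sure if this is the idiomatic fix though, or if ClearML already handles this internally, hence the question.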