Unanswered
Any Info On The Lifecycle Of Datasets Downloaded To $Home/.Clearml/Cache/Storage_Manager/Datasets Via Get_Local_Copy
I Have A Task Running And I Was Watching The Above Path And Datasets Were Being Downloaded And Then They Are All Removed And For A Partic
AgitatedDove14 - this was an interesting one. I think I have found the issue, but verifying the fix as of now.
One of the devs was using shutil.copy2
to copy parts of dataset to a temporary directory in a with
block - something like:
with TemporaryDirectory(dir=temp_dir) as certificates_directory: for file in test_paths: shutil.copy2(f"{dataset_local}/{file}", f"{certificates_directory}/file")
My suspicion is since copy2 copies with full data and symlinks, it’s messing with the way clearml-data sets up datasets with symlinks to parents etc and when the temp directories are cleaned due to the with
blocks the local copies are cleared too
151 Views
0
Answers
3 years ago
one year ago
Tags