I am observing a strange behaviour when loading a dataset’s local copy.

dataset_path_training = cml_dataset_training.get_local_copy()
logger.info(f"Created local copy for training data at: {dataset_path_training}")
dataset_path_validation = cml_dataset_validation.get_local_copy()

After executing dataset_path_training = cml_dataset_training.get_local_copy(), the training data is present in the directory referred to by dataset_path_training.
After executing dataset_path_validation = cml_dataset_validation.get_local_copy(), the validation data is present in the directory referred to by dataset_path_validation, BUT the directory at dataset_path_training is now empty.

The only explanation I can think of is that ClearML automatically deletes the contents of dataset_path_training, possibly to free storage space (although there is enough space on the disk …).
Can you explain this behaviour and suggest a solution to maintain both the training and the validation data?

Posted 6 months ago
Hi ObedientTurkey46! You could try increasing sdk.storage.cache.default_cache_manager_size in your clearml.conf to a very large number.
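
To confirm that both copies survive after changing the setting, a small check along these lines might help. This is only a sketch using the standard library: count_files and check_local_copies are hypothetical helper names, not part of ClearML, and the paths are assumed to be the values returned by the two get_local_copy() calls in the question.

```python
from pathlib import Path


def count_files(directory):
    """Return the number of regular files under `directory`, recursively."""
    return sum(1 for p in Path(directory).rglob("*") if p.is_file())


def check_local_copies(paths):
    """Map each path to its file count; raise if any copy is empty or missing."""
    counts = {str(p): count_files(p) for p in paths}
    empty = [p for p, n in counts.items() if n == 0]
    if empty:
        raise RuntimeError(f"Local copies are empty or missing: {empty}")
    return counts


# Usage (hypothetical), after both get_local_copy() calls from the question:
# check_local_copies([dataset_path_training, dataset_path_validation])
```

Running the check right after the second get_local_copy() call should make it obvious whether the first directory was cleared again.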

Posted 6 months ago

Yes, that solved the problem. Thank you!

Posted 6 months ago
