Yes, I'm working with the latest commit. Anyway, I have tried to run
dataset.get_local_copy() on another machine and it works. I have no idea why this happens. However, on the new machine
get_local_copy() does not return the path I expect. If I have this code:
dataset.upload(output_url="/home/user/server_local_storage/mock_storage")
I would expect the dataset to be stored under the path specified in
output_url. But what I get with
get_local_copy() is the following path:
Is this usual?
But what I get with
is the following path: ...
get_local_copy() will return an immutable copy of the dataset; by definition, this will not be the "source" storing the data.
(Also notice that the dataset itself is stored in zip files, and when you get the "local-copy" you get the extracted files)
Make sense ?
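To make the distinction concrete, here is a minimal sketch, assuming a dataset that has already been uploaded and finalized. The function name, the dataset ID and the target folder are hypothetical; Dataset.get, get_local_copy and get_mutable_local_copy are the ClearML SDK calls as I understand them, so check them against the version you have installed.

```python
def fetch_copies(dataset_id, target_folder):
    """Return (cached_path, mutable_path) for an existing ClearML dataset.

    Hypothetical helper for illustration only.
    """
    # Imported here so the helper can be defined without clearml configured.
    from clearml import Dataset

    dataset = Dataset.get(dataset_id=dataset_id)

    # Read-only copy, extracted from the stored zip files into the local
    # ClearML cache. This is not the output_url passed to upload().
    cached_path = dataset.get_local_copy()

    # Writable copy, placed in a folder of your choosing.
    mutable_path = dataset.get_mutable_local_copy(target_folder=target_folder)

    return cached_path, mutable_path
```

The first path should point somewhere under the cache folder (in this thread, /home/user/.clearml/cache/storage_manager/datasets/), while output_url only controls where the zipped dataset is uploaded.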
too large to be stored in the .cache path? Will it be stored there anyway?
oh that is exactly why the latest release supports chunks, so you can get a partial copy 🙂
nonetheless, the assumption is that you will have to end up with the data locally, otherwise the network becomes a huge bottleneck
make sense ?
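A rough sketch of what fetching a partial copy could look like, assuming the dataset was uploaded in several chunks. The helper name is hypothetical, and the part / num_parts arguments are how I recall the SDK exposing this, so treat them as an assumption to verify against your installed release.

```python
def fetch_partial_copy(dataset_id, part, num_parts):
    """Fetch only one slice of a chunked dataset (hypothetical helper)."""
    # Imported here so the helper can be defined without clearml configured.
    from clearml import Dataset

    dataset = Dataset.get(dataset_id=dataset_id)

    # Ask for slice `part` out of `num_parts`, so only the chunks that belong
    # to that slice are downloaded and extracted into the local cache.
    return dataset.get_local_copy(part=part, num_parts=num_parts)
```

For example, each of four machines could call fetch_partial_copy(dataset_id, part=i, num_parts=4) with its own index i, so no single machine has to hold the whole dataset.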
Indeed it does! But what still puzzles me so badly is why I get the path below when running
dataset.get_local_copy() on one of the machines of my cluster:
Why is it pointing to a .lock file?
AgitatedDove14 Oops, something still seems to be wrong. When trying to retrieve the dataset using get_local_copy() I get the following error:
Traceback (most recent call last):
  File "/home/user/myproject/lab.py", line 27, in <module>
    print(dataset.get_local_copy())
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/datasets/dataset.py", line 554, in get_local_copy
    target_folder = self._merge_datasets(
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/datasets/dataset.py", line 1342, in _merge_datasets
    target_base_folder = self._create_ds_target_folder(
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/datasets/dataset.py", line 1291, in _create_ds_target_folder
    cache.lock_cache_folder(local_folder)
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/storage/cache.py", line 248, in lock_cache_folder
    lock.acquire(timeout=0)
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/utilities/locks/utils.py", line 130, in acquire
    fh = self._get_fh()
  File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/utilities/locks/utils.py", line 200, in _get_fh
    return open(self.filename, self.mode, **self.file_open_kwargs)
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_38e9acc8d56441999e806815abddee82.clearml'
Main code is the same as above, I'm just adding
dataset.get_local_copy() at the end. It seems it resolves the path with a .lock file. Weird...
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_38e9acc8d56441999e806815abddee82.clearml'
Let me check this issue, it seems like the locking mechanism should have figured that there is no lock...
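One possible explanation, which is an assumption and not something confirmed in this thread, is that the datasets cache folder does not exist yet on that machine. Python's open() creates a missing file but never a missing parent directory, so opening a lock file there fails with exactly this kind of FileNotFoundError. The sketch below reproduces that behavior with the standard library only; the helper name, the temporary folder and the lock file name are hypothetical stand-ins, and the "a" mode is a guess rather than what ClearML actually uses.

```python
import os
import tempfile


def open_lock_file(lock_path, create_parent=False):
    """Open a lock file for appending, optionally creating its folder first."""
    if create_parent:
        os.makedirs(os.path.dirname(lock_path), exist_ok=True)
    # open() creates the file itself but never the directories above it.
    return open(lock_path, "a")


with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for the datasets cache folder; it does not exist yet.
    lock_path = os.path.join(root, "datasets", ".lock.000.ds_example.clearml")

    # First attempt: the parent folder is missing, so open() raises.
    try:
        open_lock_file(lock_path).close()
        missing_dir_error = None
    except FileNotFoundError as exc:
        missing_dir_error = exc

    # Second attempt: create the folder first, then the lock file can be opened.
    with open_lock_file(lock_path, create_parent=True):
        lock_created = os.path.exists(lock_path)
```

If this is what happens here, creating /home/user/.clearml/cache/storage_manager/datasets/ by hand before calling get_local_copy() might work as a stopgap until the locking code is checked.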