@<1523701070390366208:profile|CostlyOstrich36>
One more thing: when I look at the dataset in the web UI, it shows as finalized.
@<1523701070390366208:profile|CostlyOstrich36> Then, when I try to get the dataset, I get the following error:
Failed getting object size: RetryError('HTTPSConnectionPool(host='files.clearml.dbrain.io', port=443): Max retries exceeded with url: /Labeled%20datasets/.datasets/printed%20multilang%20crops/printed%20multilang%20crops.1e76fd4ad77f4d2790e4acf1c8241c59/artifacts/state/state.json (Caused by ResponseError('too many 503 error responses'))')
Could not download
, err: HTTPSConnectionPool(host='files.clearml.dbrain.io', port=443): Max retries exceeded with url: /Labeled%20datasets/.datasets/printed%20multilang%20crops/printed%20multilang%20crops.1e76fd4ad77f4d2790e4acf1c8241c59/artifacts/state/state.json (Caused by ResponseError('too many 502 error responses'))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[34], line 1
----> 1 dataset1 = Dataset.get(dataset_id='1e76fd4ad77f4d2790e4acf1c8241c59')
File ~/miniconda3/lib/python3.9/site-packages/clearml/datasets/dataset.py:1731, in Dataset.get(cls, dataset_id, dataset_project, dataset_name, dataset_tags, only_completed, only_published, include_archived, auto_create, writable_copy, dataset_version, alias, overridable, shallow_search, **kwargs)
1727 instance = Dataset.create(
1728 dataset_name=dataset_name, dataset_project=dataset_project, dataset_tags=dataset_tags
1729 )
1730 return finish_dataset_get(instance, instance._id)
-> 1731 instance = get_instance(dataset_id)
1732 # Now we have the requested dataset, but if we want a mutable copy instead, we create a new dataset with the
1733 # current one as its parent. So one can add files to it and finalize as a new version.
1734 if writable_copy:
File ~/miniconda3/lib/python3.9/site-packages/clearml/datasets/dataset.py:1643, in Dataset.get.<locals>.get_instance(dataset_id_)
1635 local_state_file = StorageManager.get_local_copy(
1636 remote_url=task.artifacts[cls.__state_entry_name].url,
1637 cache_context=cls.__cache_context,
(...)
1640 force_download=force_download,
1641 )
1642 if not local_state_file:
-> 1643 raise ValueError("Could not load Dataset id={} state".format(task.id))
1644 else:
1645 # we could not find the serialized state, start empty
1646 local_state_file = {}
ValueError: Could not load Dataset id=1e76fd4ad77f4d2790e4acf1c8241c59 state
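Since the 502/503 responses in the log above come from the file server side, they may be transient, so one client-side workaround is to retry the call with a growing delay. This is only a minimal sketch: `retry_with_backoff` and `get_dataset_with_retry` are hypothetical helper names, not part of ClearML, and the attempt count and delays are arbitrary assumptions. The only ClearML call used is `Dataset.get(dataset_id=...)`, as it appears in the traceback.

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=2.0,
                       exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or attempts run out, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            sleep(base_delay * (2 ** attempt))


def get_dataset_with_retry(dataset_id):
    # Hypothetical wrapper; imported here so the helper above works without clearml installed.
    from clearml import Dataset

    # In the traceback above, Dataset.get raises ValueError when state.json cannot be downloaded.
    return retry_with_backoff(
        lambda: Dataset.get(dataset_id=dataset_id),
        exceptions=(ValueError,),
    )
```

Retrying only hides short outages. If the file server keeps answering 502/503, the limits on the server side still need to be looked at.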
Hi @<1524560082761682944:profile|MammothParrot39> , I have the same problem. Could you elaborate on the solution you found?
Problem solved:
- removed all limits and the lifetime (timeout) settings for downloads everywhere
- increased limits for the file server
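To check whether a change like this actually helped, it can be useful to probe the file server directly and look at the status code it returns, independent of the ClearML client. A rough sketch, using only the Python standard library: `probe_status` is a hypothetical helper name, and the choice of a HEAD request is an assumption made to avoid downloading the file body.

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a HEAD request and return the HTTP status code, including error codes."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as HTTPError; the code is what we want to see.
        return err.code
```

Passing the `state.json` URL from the error message should return 200 once the server is healthy. A repeated 502 or 503 would suggest the proxy or file server limits are still being hit. Connection-level failures (DNS, TLS, timeouts) are not caught here and will raise `urllib.error.URLError`.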
Hi @<1524560082761682944:profile|MammothParrot39>, I think that might be the issue. Is this a self-deployed server?
@<1523701070390366208:profile|CostlyOstrich36> Yes, we deployed ClearML ourselves, in our own environment.