Hello everyone! I'm uploading a dataset to the server. During dataset.finalize() I get the following error. Could you please advise what could be the problem, where should I start looking?

Answered

Hello everyone! I'm uploading a dataset to the server, and during dataset.finalize() I get the following error. Could you please advise what the problem could be and where I should start looking?

2023-06-13 18:33:44,770 - clearml.Task - ERROR - Action failed <413/0: tasks.edit (<html>
<head><title>413 Request Entity Too Large</title></head>
<body>
<center><h1>413 Request Entity Too Large</h1></center>
<hr><center>nginx</center>
</body>
</html>
)> (task=1e76fd4ad77f4d2790e4acf1c8241c59, force=True, configuration={'Dataset Struct': {'name': 'Dataset Struct', 'value': '{\n  "0": {\n    "job_id": "1e76fd4ad77f4d2790e4acf1c8241c59",\n    "status": "in_progress",\n    "last_update": 1686681042,\n    "parents": [],\n    "job_size": 11112849496,\n    "name": "printed multilang crops",\n    "version": "1.0.0"\n  }\n}', 'type': 'json', 'description': 'Structure of the dataset'}, 'Dataset Content': {'name': 'Dataset Content', 'value': 'File Name (2360748 files), File Size (total 10.35 GB), Hash (SHA2)\ncontracts/ .........

The peculiarity of this dataset is that it is around 10 GB in size and contains roughly 2.4 million files (2,360,748 according to the log above). I also have a theory about the cause. Am I correct in understanding that during dataset.finalize(), a metadata file describing the dataset is sent to the server? If that file has to describe about 2.4 million files, it will be very large. Could this be what causes the error?
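A rough back-of-the-envelope check is consistent with this suspicion. The sketch below estimates the size of a plain-text listing with one line per file (name, size, hash), which is the shape of the 'Dataset Content' value in the log above. The helper name is hypothetical, and the average path length, size-field width, and 64-character digest are assumptions rather than measured values. The 1 MiB figure is nginx's default client_max_body_size; a self-hosted deployment may use a different value, and the thread does not confirm whether the SDK sends the full listing or a shortened preview.

```python
# Rough estimate of a per-file text listing's size (hypothetical helper).
# Assumptions: ~40-character relative paths, ~10 characters for the size
# field, and a 64-character hex digest per file.

NGINX_DEFAULT_BODY_LIMIT = 1024 * 1024  # nginx default client_max_body_size (1 MiB)


def estimate_listing_bytes(num_files, avg_path_len=40, size_field_len=10, hash_len=64):
    """Return the approximate byte size of a listing with one line per file."""
    separators = len(", ") * 2 + len("\n")  # two field separators plus a newline
    per_line = avg_path_len + size_field_len + hash_len + separators
    return num_files * per_line


if __name__ == "__main__":
    num_files = 2_360_748  # file count reported in the log above
    estimate = estimate_listing_bytes(num_files)
    print(f"estimated listing size: {estimate / 1024 ** 2:.0f} MiB")
    print("exceeds nginx default limit:", estimate > NGINX_DEFAULT_BODY_LIMIT)
```

Even if the real per-file overhead is several times smaller than assumed here, a listing of this many files would still be far above a 1 MiB request limit.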

  				
Posted one year ago by MammothParrot39


Answers 6

@CostlyOstrich36
One more thing: when I look at the dataset in the web UI, it appears to have switched to the final state.

  				
Posted one year ago by MammothParrot39

@CostlyOstrich36 Then, when I try to get the dataset, I get the following error:

Failed getting object size: RetryError('HTTPSConnectionPool(host='files.clearml.dbrain.io', port=443): Max retries exceeded with url: /Labeled%20datasets/.datasets/printed%20multilang%20crops/printed%20multilang%20crops.1e76fd4ad77f4d2790e4acf1c8241c59/artifacts/state/state.json (Caused by ResponseError('too many 503 error responses'))')
Could not download

 , err: HTTPSConnectionPool(host='files.clearml.dbrain.io', port=443): Max retries exceeded with url: /Labeled%20datasets/.datasets/printed%20multilang%20crops/printed%20multilang%20crops.1e76fd4ad77f4d2790e4acf1c8241c59/artifacts/state/state.json (Caused by ResponseError('too many 502 error responses')) 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[34], line 1
----> 1 dataset1 = Dataset.get(dataset_id='1e76fd4ad77f4d2790e4acf1c8241c59')

File ~/miniconda3/lib/python3.9/site-packages/clearml/datasets/dataset.py:1731, in Dataset.get(cls, dataset_id, dataset_project, dataset_name, dataset_tags, only_completed, only_published, include_archived, auto_create, writable_copy, dataset_version, alias, overridable, shallow_search, **kwargs)
   1727     instance = Dataset.create(
   1728         dataset_name=dataset_name, dataset_project=dataset_project, dataset_tags=dataset_tags
   1729     )
   1730     return finish_dataset_get(instance, instance._id)
-> 1731 instance = get_instance(dataset_id)
   1732 # Now we have the requested dataset, but if we want a mutable copy instead, we create a new dataset with the
   1733 # current one as its parent. So one can add files to it and finalize as a new version.
   1734 if writable_copy:

File ~/miniconda3/lib/python3.9/site-packages/clearml/datasets/dataset.py:1643, in Dataset.get.<locals>.get_instance(dataset_id_)
   1635     local_state_file = StorageManager.get_local_copy(
   1636         remote_url=task.artifacts[cls.__state_entry_name].url,
   1637         cache_context=cls.__cache_context,
   (...)
   1640         force_download=force_download,
   1641     )
   1642     if not local_state_file:
-> 1643         raise ValueError("Could not load Dataset id={} state".format(task.id))
   1644 else:
   1645     # we could not find the serialized state, start empty
   1646     local_state_file = {}

ValueError: Could not load Dataset id=1e76fd4ad77f4d2790e4acf1c8241c59 state

  				
Posted one year ago by MammothParrot39

Hi @MammothParrot39, I have the same problem. Could you elaborate on the solution you found?

  				
Posted one year ago by MoodySeaurchin4

Problem solved:

Removed the size limits and the download timeouts everywhere
Increased the limits for the file server
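The status codes seen in this thread point at different layers, which can help decide which limit to look at first. The following is a minimal sketch, assuming the reverse proxy in front of the server is nginx (as the 413 error page suggests). The function name and hint texts are illustrative, and the directives named are standard nginx settings rather than the exact changes made in this deployment.

```python
# Hypothetical helper: map the HTTP statuses seen in this thread to the
# setting worth checking first. The directive names are standard nginx
# settings; which ones were actually changed here is not stated.

HINTS = {
    413: "request body too large: check the proxy body-size limit "
         "(nginx: client_max_body_size; 0 disables the check)",
    502: "bad gateway: check that the file server is up and that the proxy "
         "timeouts are long enough (nginx: proxy_read_timeout)",
    503: "service unavailable: check file server load and resource limits",
    504: "gateway timeout: check the proxy timeouts "
         "(nginx: proxy_read_timeout, proxy_send_timeout)",
}


def where_to_look(status_code):
    """Return a short hint for a status code, or a generic fallback."""
    return HINTS.get(status_code, "no specific hint: inspect the server logs")


if __name__ == "__main__":
    # 413 came from tasks.edit during finalize; 502/503 came from Dataset.get
    for code in (413, 502, 503):
        print(code, "->", where_to_look(code))
```

In this thread the 413 was returned while finalizing, and the 502/503 responses appeared while downloading state.json from the file server, which matches the two kinds of limits mentioned in the answer.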

  				
Posted one year ago by MammothParrot39

Hi @MammothParrot39, I think that might be the issue. Is this a self-deployed server?

  				
Posted one year ago by CostlyOstrich36

@CostlyOstrich36 Yes, we deployed ClearML on our own infrastructure.

  				
Posted one year ago by MammothParrot39
