Is anyone else experiencing a network error during every ClearML dataset download? It's been a while and almost every download fails...

Answered

Is anyone else experiencing a network error during every ClearML dataset download? It's been a while and almost every download fails...

  				
Posted 2 years ago by BitterStarfish58


Answers 25

Take a 400+ MB dataset, for example: the download fails at around 80 MB. It doesn't matter whether I use the SDK or download from the ClearML experiment page.

  				
Posted 2 years ago by BitterStarfish58

Hi BitterStarfish58
Where are you uploading it to?

  				
Posted 2 years ago by AgitatedDove14

http://files.community.clear.ml

  				
Posted 2 years ago by BitterStarfish58

Hmm BitterStarfish58, what's the error you are getting?
Any chance you are over the free tier quota?

  				
Posted 2 years ago by AgitatedDove14

The error message is the one shown in the image.
The file size is 415 MB, but the download "succeeds" at 107 MB.

  				
Posted 2 years ago by BitterStarfish58

It shows we're still within the free tier quota.

  				
Posted 2 years ago by BitterStarfish58

Could it be that the file upload itself was broken?

  				
Posted 2 years ago by AgitatedDove14

https://clearml.slack.com/archives/CTK20V944/p1642735039222200?thread_ts=1642731461.221700&cid=CTK20V944
As I said there, downloading with a browser doesn't work either. It shows the same behavior.

  				
Posted 2 years ago by BitterStarfish58

I'm pretty sure that's not the reason. I've now encountered this issue with 5+ datasets that I use in different projects. Some of them downloaded fine before, but not recently.

  				
Posted 2 years ago by BitterStarfish58

Since the error says "network error", could it be because I'm in Taiwan? Maybe downloading from Asia leads to this kind of issue.

  				
Posted 2 years ago by BitterStarfish58

AgitatedDove14 Earlier, my colleague said he managed to download the dataset with a browser by repeatedly resuming the download each time it stopped due to the network error. So no, I don't think the problem is the file itself...

  				
Posted 2 years ago by BitterStarfish58
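
Browser-style resuming generally relies on HTTP range requests. As a rough way to see whether a server advertises them, the sketch below inspects the `Accept-Ranges` response header. The helper names `supports_byte_ranges` and `head_headers` are hypothetical, and nothing in this thread confirms what the ClearML files server actually returns; a server can also honor ranges without advertising them, so treat a negative result as inconclusive.

```python
import urllib.request


def supports_byte_ranges(headers):
    """Return True if the given response headers advertise byte-range support."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {key.lower(): value for key, value in headers.items()}
    return normalized.get("accept-ranges", "").strip().lower() == "bytes"


def head_headers(url, timeout=30):
    """Send a HEAD request and return the response headers as a plain dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

For a real file URL, `supports_byte_ranges(head_headers(url))` would give a first hint about whether resuming is possible.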

BitterStarfish58 I would suspect the upload was corrupted (I think that explains the discrepancy between the logged file size and the size of the file actually uploaded).

  				
Posted 2 years ago by AgitatedDove14

So are you saying that downloading large files is the issue (i.e. network issues)?

  				
Posted 2 years ago by AgitatedDove14

AgitatedDove14 In our case, redownloading doesn't help because it leads to the same result: the download gets interrupted by the network error.
But telling "finished" apart from "succeeded" is a good start.

  				
Posted 2 years ago by BitterStarfish58

Hmm, maybe we should add a check once the download is done that compares the expected file size with the actual file size, and redownload if they differ?

  				
Posted 2 years ago by AgitatedDove14
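
A minimal sketch of such a post-download size check, not the ClearML implementation. `download_with_size_check` and its `download` callback are hypothetical names, and the expected size is assumed to be known in advance (for example from the size logged at upload time or a `Content-Length` header).

```python
import os


def download_with_size_check(download, dest_path, expected_size, max_attempts=3):
    """Call download(dest_path) until the file on disk has expected_size bytes.

    Returns the number of attempts used; raises OSError if every attempt
    produced a file of the wrong size.
    """
    for attempt in range(1, max_attempts + 1):
        download(dest_path)
        actual = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
        if actual == expected_size:
            return attempt
        # "Finished" is not "succeeded": discard the truncated file and retry.
        if os.path.exists(dest_path):
            os.remove(dest_path)
    raise OSError(
        f"size mismatch after {max_attempts} attempts: expected {expected_size} bytes"
    )
```

A check like this only detects truncation. If every attempt is cut off by the same network error, retrying from scratch still fails, but it fails loudly instead of silently returning a partial file.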

Regarding the check, we can add one here:
https://github.com/allegroai/clearml/blob/12fa7c92aaf8770d770c8ed05094e924b9099c16/clearml/storage/helper.py#L713

  				
Posted 2 years ago by AgitatedDove14

(Currently, I think the implementation assumes that if the download completed, it was successful.)

  				
Posted 2 years ago by AgitatedDove14

I'm not sure the files-server supports "continue" from the last position...

  				
Posted 2 years ago by AgitatedDove14
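
For reference, "continuing from the last position" over HTTP is usually done with a `Range: bytes=<offset>-` request header, to which a supporting server replies with status 206. The sketch below shows that pattern using only the Python standard library. `resume_download` and `http_range_fetcher` are hypothetical helpers, not part of the ClearML SDK, and whether the files server honors `Range` is an assumption that this thread does not settle.

```python
import http.client
import os
import urllib.request


def _size(path):
    """Size of the file at path in bytes, or 0 if it does not exist yet."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def resume_download(fetch_from, dest_path, total_size, max_attempts=20):
    """Append to dest_path until it holds total_size bytes.

    fetch_from(offset) must return an iterable of byte chunks starting at
    byte `offset`; it may raise midway when the connection drops.
    """
    for _ in range(max_attempts):
        offset = _size(dest_path)
        if offset >= total_size:
            break
        try:
            with open(dest_path, "ab") as out:
                for chunk in fetch_from(offset):
                    out.write(chunk)
        except (OSError, http.client.HTTPException):
            pass  # keep the partial file; the next pass resumes from its size
    if _size(dest_path) != total_size:
        raise OSError(f"download incomplete after {max_attempts} attempts")
    return total_size


def http_range_fetcher(url, chunk_size=1 << 20, timeout=60):
    """Build a fetch_from(offset) callable that issues HTTP Range requests."""
    def fetch_from(offset):
        request = urllib.request.Request(url, headers={"Range": f"bytes={offset}-"})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # A 200 reply to a non-zero offset means the server restarted from
            # byte 0; appending that would corrupt the file, so stop here.
            if offset and response.status != 206:
                raise RuntimeError("server ignored the Range header")
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                yield chunk
    return fetch_from
```

Usage would look like `resume_download(http_range_fetcher(url), "dataset.zip", expected_size)`, where `url` and `expected_size` are placeholders for the real file URL and its known size.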

Anyway, thanks for the help. As a workaround, we will avoid uploading large files from now on. We look forward to hearing from you if you manage to reproduce the issue or implement a fix.

  				
Posted 2 years ago by BitterStarfish58

I'm not familiar with that either. But downloading with the Chrome browser, plus some perseverance in clicking "continue" every time it stops, does work. It's quite cool.

  				
Posted 2 years ago by BitterStarfish58

BitterStarfish58 could you open a GitHub issue for it? I really want to make sure we support it (and I think it should not be very difficult).

  				
Posted 2 years ago by AgitatedDove14

AgitatedDove14
https://github.com/allegroai/clearml/issues/552
Just did. Hope the format looks okay.

  				
Posted 2 years ago by BitterStarfish58

Thanks BitterStarfish58!

  				
Posted 2 years ago by AgitatedDove14

"Since the error says network error, is it possible because I'm in Taiwan? Like downloading from Asia leads to this kind of issue"

Can you download it from the browser? (I mean, is the file size after the download 400 MB?)

  				
Posted 2 years ago by AgitatedDove14

AgitatedDove14 Yes, I think that's the problem. And if there were also a way to keep resuming the download when using the SDK, our Python code would work like before. That's basically all we need.

  				
Posted 2 years ago by BitterStarfish58
