ClearML FAQ | "Clearml.Task - Error - Action Failed <500/0: Tasks.Edit/V1.0 (Update Failed (Bsonobj Size: 18330801 (0X117B4B1) Is Invalid. Size Must Be Between 0 And 16793600(16Mb) F"

Answered

"clearml.Task - ERROR - Action failed <500/0: tasks.edit/v1.0 (Update failed (BSONObj size: 18330801 (0x117B4B1) is invalid. Size must be between 0 and 16793600(16MB) F"

  				
Posted 3 years ago by DrabOwl94

Answers 25

AgitatedDove14
Hello, Martin. Any news about this issue?

We really want to use ClearML for datasets that hold hundreds of GB of data.

Are you saying that ClearML is not able to do that?

  				
Posted 3 years ago by CharmingStarfish14

DrabOwl94 can you attach a code snippet? This error basically means you've hit the maximum size allowed for the task's BSON document, but the dataset itself should be uploaded as an artifact
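
To make the numbers in the error concrete, here is a minimal sketch that reads the two sizes out of the message quoted in the question and reports the overshoot. The regular expressions and the helper name `parse_bson_size_error` are illustrative assumptions and are not part of ClearML. The "minimum parts" figure is only a lower bound, assuming the metadata could be divided evenly with no per-part overhead.

```python
import math
import re

# Error text as quoted in the question (truncated there, so truncated here too).
ERROR = (
    "Update failed (BSONObj size: 18330801 (0x117B4B1) is invalid. "
    "Size must be between 0 and 16793600(16MB)"
)


def parse_bson_size_error(message):
    """Return (actual_size, max_size) in bytes, parsed from the error text."""
    actual = int(re.search(r"BSONObj size: (\d+)", message).group(1))
    limit = int(re.search(r"between 0 and (\d+)", message).group(1))
    return actual, limit


actual, limit = parse_bson_size_error(ERROR)
overshoot = actual - limit              # bytes above the allowed maximum
min_parts = math.ceil(actual / limit)   # lower bound on an even split

print(f"document size: {actual} bytes, limit: {limit} bytes")
print(f"over the limit by {overshoot} bytes ({actual / limit:.2%} of the limit)")
print(f"at least {min_parts} documents would be needed")
```

The document is only about 9% over the limit, which is consistent with the suggestion elsewhere in the thread that splitting the dataset into two parts might be enough.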

  				
Posted 3 years ago by SuccessfulKoala55

256GB in total of data

  				
Posted 3 years ago by DrabOwl94

So you are saying 156 chunks, with about 6500 files in each chunk?

  				
Posted 3 years ago by AgitatedDove14

Martin, you didn't get me right. We have 1 million small files, which we upload in chunks of 512 MB.

  				
Posted 3 years ago by DrabOwl94

AgitatedDove14

  				
Posted 3 years ago by DrabOwl94

the files are uploaded but metadata is absent 😞

  				
Posted 3 years ago by DrabOwl94

DrabOwl94 how many 1M files did you end up having ?

  				
Posted 3 years ago by AgitatedDove14

or we create two datasets, parent and child, splitting the set into two parts

  				
Posted 3 years ago by DrabOwl94

all the metadata that is standard for a ClearML dataset: hashes, timestamps and names of the 1M uploaded files

  				
Posted 3 years ago by DrabOwl94

Hi DrabOwl94 , how did you create/save/finalize the dataset?

  				
Posted 3 years ago by CostlyOstrich36

Sure, AgitatedDove14 !

I will get to it next week. Thank you for the answer!

  				
Posted 3 years ago by CharmingStarfish14

~2000 files in each chunk

  				
Posted 3 years ago by DrabOwl94

CharmingStarfish14 can you check something from code, just to see if this would solve the issue?

  				
Posted 3 years ago by AgitatedDove14

as we see it, the only way is to split this dataset into smaller sub-datasets
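
A rough sketch of the splitting step itself, independent of ClearML: it divides a list of file paths into a chosen number of nearly equal, contiguous parts. The function name `split_into_parts` and the sample file names are hypothetical. Each resulting part would then have to be registered as its own dataset, which this sketch does not attempt.

```python
def split_into_parts(paths, n_parts):
    """Split `paths` into `n_parts` contiguous parts whose sizes differ by at most one."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(paths), n_parts)
    parts = []
    start = 0
    for index in range(n_parts):
        # The first `extra` parts take one additional item each.
        size = base + (1 if index < extra else 0)
        parts.append(paths[start:start + size])
        start += size
    return parts


# Hypothetical file names, standing in for the real dataset contents.
files = [f"file_{i:07d}.bin" for i in range(10)]
parts = split_into_parts(files, 3)
for number, part in enumerate(parts, start=1):
    print(f"part {number}: {len(part)} files")
```

Contiguous slices keep files from the same folder together when the input list is sorted, which may make the sub-datasets easier to reason about. Whether per-part metadata actually stays under the document limit would still need to be checked on a real upload.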

  				
Posted 3 years ago by DrabOwl94

what I meant is that we have 1,000,000 small files in the dataset

  				
Posted 3 years ago by DrabOwl94

clearml-data create --name [Dataset Name] --project [Project Name] --output-uri
clearml-data add --files [FILE_PATH] --id [Id]
clearml-data close

  				
Posted 3 years ago by DrabOwl94

so 78000 entries ...
wow, a lot! Would it make sense to do 1GB chunks? Any reason for the initial 1MB chunk size?
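
The entry count above follows from dividing the dataset size by the chunk size. A small sketch of that arithmetic, assuming decimal units (1 GB = 1000 MB) and using the 78GB figure given elsewhere in the thread; the helper name `chunk_count` is illustrative:

```python
import math

MB_PER_GB = 1000  # decimal units, assumed here so that 78 GB gives 78000 entries


def chunk_count(total_mb, chunk_mb):
    """Number of chunks needed to hold `total_mb` of data in `chunk_mb` pieces."""
    return math.ceil(total_mb / chunk_mb)


total_mb = 78 * MB_PER_GB

entries_1mb = chunk_count(total_mb, 1)          # 1 MB chunks
entries_1gb = chunk_count(total_mb, MB_PER_GB)  # 1 GB chunks

print(f"1 MB chunks: {entries_1mb} entries")
print(f"1 GB chunks: {entries_1gb} entries")
```

Going from 1MB to 1GB chunks reduces the number of chunk entries by a factor of 1000. This only counts chunks; any per-file metadata is not affected by the chunk size.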

  				
Posted 3 years ago by AgitatedDove14

Hi DrabOwl94
If I understand you correctly, you have a lot of chunks (which translates to a lot of links to small 1MB files, because this is how you set up the chunk size). Apparently you have now reached the maximum number of chunks per specific Dataset version (in the end, this metadata is stored in a document with a limited size, specifically 16MB).
How many chunks do you have there?
(In other words, what's the size of the entire dataset in MB?)

  				
Posted 3 years ago by AgitatedDove14

so the metadata was probably too large to fit... Is there any way you can describe the metadata and its scope?

  				
Posted 3 years ago by SuccessfulKoala55

78GB

  				
Posted 3 years ago by DrabOwl94

so correct numbers are:

  				
Posted 3 years ago by DrabOwl94

Check the latest RC; it solved an issue with dataset uploading.
Let me check if it also solved this issue.

  				
Posted 3 years ago by AgitatedDove14

chunk size: 512 MB

  				
Posted 3 years ago by DrabOwl94

500 chunks in total
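
As a quick consistency check of the corrected figures scattered through the thread (256GB total, 512 MB chunks, about 2000 files per chunk, 1,000,000 files), here is a short sketch. It assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
# Figures reported in the thread.
total_gb = 256
chunk_mb = 512
files_per_chunk = 2000
reported_files = 1_000_000

MB_PER_GB = 1000   # decimal units, an assumption
KB_PER_GB = 1_000_000

chunks = (total_gb * MB_PER_GB) // chunk_mb          # number of 512 MB chunks
estimated_files = chunks * files_per_chunk           # files implied by the chunk count
avg_file_kb = total_gb * KB_PER_GB / reported_files  # average file size in KB

print(f"chunks: {chunks}")
print(f"estimated files: {estimated_files}")
print(f"average file size: {avg_file_kb} KB")
```

Under these assumptions the numbers agree with each other: 500 chunks of roughly 2000 files each gives the 1,000,000 files mentioned, at an average of about 256 KB per file. That points at the number of files, rather than the number of chunks, as the likely driver of the metadata size.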

  				
Posted 3 years ago by DrabOwl94
