
Cannot upload a dataset with a parent - seems very odd!

clearml versions I tried: 1.6.1, 1.6.2

scenario:

  • Create parent dataset (with storage on S3)

  • Upload data

  • Close dataset

  • Create child dataset (tried with storage both on S3 and on the clearml server)

  • Add a single file or folder to the child

  • Close the child

  • Get exception (see below)

`
clearml-data - Dataset Management & Versioning CLI
Finalizing dataset id d80b190d84ca41e1b139c841427dd241
id=d80b190d84ca41e1b139c841427dd241 disable_upload=False chunk_size=512
2022-08-09 07:01:54,819 - clearml.storage - INFO - Downloading: 5.00MB / 5.92MB @ 29.85MBs from
2022-08-09 07:01:54,825 - clearml.storage - INFO - Downloaded 5.92 MB successfully from , saved to /home/ec2-user/.clearml/cache/storage_manager/datasets/2ff81b56341faaaad7796344472ec8d2.state.json
Pending uploads, starting dataset upload to
Compressing /home/ec2-user/xxx/yyy/zzz.npy
Uploading dataset changes (1 files compressed to 1.67 MiB) to
File compression and upload completed: total size 1.67 MiB, 1 chunked stored (average size 1.67 MiB)

Error: unsupported operand type(s) for +=: 'int' and 'NoneType'
`
Any idea? This seems like a really basic scenario, and I am sure it worked for me in the past.
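For what it's worth, the error itself looks like a generic Python TypeError rather than anything S3-specific; a minimal sketch of how such an error can arise (illustrative names only, not ClearML internals):
`
# illustrative only - not ClearML code
total_size = 0                     # running total of bytes
chunk_sizes = [1024, None, 2048]   # one entry was never filled in

for size in chunk_sizes:
    total_size += size  # TypeError: unsupported operand type(s) for +=: 'int' and 'NoneType'
`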

  
  
Posted 2 years ago

Answers 8


Hi RoughTiger69 ! Can you try adding the files using a Python script, so that we can get a full exception traceback? Something like this:
`
from clearml import Dataset

# or just use the ID of the dataset you previously created instead of creating a new one
parent_dataset = Dataset.create(dataset_name="xxxx", dataset_project="yyyyy", output_uri=" ")
parent_dataset.add_files("folder1")
parent_dataset.upload()
parent_dataset.finalize()

child_dataset = Dataset.create(dataset_name="xxxx", dataset_project="yyyyy", output_uri=" ", parent_datasets=[parent_dataset.id])  # or just use the ID of the dataset you previously created
child_dataset.add_files("folder2")
child_dataset.upload()
child_dataset.finalize()
`
Also, how many files are in the parent dataset?
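If it is easier, you can also wrap the last two calls so the full traceback is printed even if it gets swallowed somewhere (plain Python, nothing ClearML-specific):
`
import traceback

try:
    child_dataset.upload()
    child_dataset.finalize()
except TypeError:
    traceback.print_exc()  # full stack, showing exactly where the None comes from
    raise
`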
Thanks

  
  
Posted 2 years ago

Can you try it with clearml==1.6.0 please?
Also, can you list the exact commands you ran?

  
  
Posted 2 years ago

It seems to work fine when the parent is on clear.ml storage (tried with a toy example of data)

  
  
Posted 2 years ago

Tried with 1.6.0, doesn’t work

`
# this is the parent
clearml-data create --project xxx --name yyy --output-uri
clearml-data add folder1
clearml-data close

# this is the child, where XYZ is the parent's id
clearml-data create --project xxx --name yyy1 --parents XYZ --output-uri
clearml-data add folder2
clearml-data close
# now I get the error above
`
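To double-check that the parent itself looks intact, it can also be loaded by ID from Python and its file list inspected (a sketch using `Dataset.get()` and `list_files()` from the SDK; `XYZ` stands for the parent's ID as above):
`
from clearml import Dataset

parent = Dataset.get(dataset_id="XYZ")  # XYZ = the parent's ID used in the CLI above
files = parent.list_files()             # file names registered in the parent version
print(f"parent has {len(files)} files")
`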

  
  
Posted 2 years ago

No - I only tried it either with very small files or with 20GB as the parent

  
  
Posted 2 years ago

RoughTiger69, do you have a rough estimate of the size that breaks it?

  
  
Posted 2 years ago

I tested it again with much smaller data and it seems to work.
I am not sure what the difference is between the use cases; it seems like something specific to that particular (big) parent doesn't agree with clearml…
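One thing I can try to narrow it down is rebuilding the parent from synthetic files of growing size and seeing where the child's close starts failing; a rough sketch reusing the same SDK calls as above (project name, sizes and the S3 URI are placeholders):
`
import os
from clearml import Dataset

def make_dummy_folder(path, size_mb):
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "blob.bin"), "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))  # write 1 MB at a time

for size_mb in (10, 100, 1000):  # grow the parent until the failure shows up
    parent_folder = f"parent_{size_mb}mb"
    make_dummy_folder(parent_folder, size_mb)

    parent = Dataset.create(dataset_name=f"parent_{size_mb}mb", dataset_project="repro",
                            output_uri="s3://my-bucket/datasets")  # placeholder URI
    parent.add_files(parent_folder)
    parent.upload()
    parent.finalize()

    child = Dataset.create(dataset_name=f"child_{size_mb}mb", dataset_project="repro",
                           output_uri="s3://my-bucket/datasets",  # placeholder URI
                           parent_datasets=[parent.id])
    child.add_files("folder2")  # the same small folder as in the original repro
    child.upload()
    child.finalize()
`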

  
  
Posted 2 years ago

Quick update: still trying to reproduce...

  
  
Posted 2 years ago