Answered
Hi! Trying to create a dataset (we’re running it on an on-prem ClearML server), I run the following command:

Hi!
I'm trying to create a dataset (we’re running it on an on-prem ClearML server). I run the following command:
`clearml-data add --files data_20210613/`
and get the following error response:
`
clearml-data - Dataset Management & Versioning CLI
Adding files/folder to dataset id cc5be76bf29a42f694eed5caadf7d50d
Generating SHA2 hash for 56875 files
Hash generation completed
2021-07-18 09:19:22,655 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object files.clearml.<...>/artifacts/state/state.json (413): <html>

<head><title>413 Request Entity Too Large</title></head> <body> <center><h1>413 Request Entity Too Large</h1></center> <hr><center>nginx/1.19.9</center> </body> </html>

Error: Failed uploading object files.clearml.<...>/artifacts/state/state.json (413): <html>

<head><title>413 Request Entity Too Large</title></head> <body> <center><h1>413 Request Entity Too Large</h1></center> <hr><center>nginx/1.19.9</center> </body> </html>
`
What am I doing wrong?

Thx!

  
  
Posted 3 years ago

Answers 9


Well, you'll need to configure the default output_uri to be an S3 bucket.

  
  
Posted 3 years ago

Hey, it seems the dataset is simply too large to upload to the fileserver... How big is the dataset?

  
  
Posted 3 years ago

Well, in your clearml.conf file, set the sdk.development.default_output_uri to the desired value (see https://github.com/allegroai/clearml/blob/fb6fd9ac4a6820b4d1d3b8d6dcc60208a45d0718/docs/clearml.conf#L163 )

  
  
Posted 3 years ago

Well, I couldn’t find where to add the output_uri.
I've now tried the following:
`
import clearml
local_path = '<some-local-path>'
s3_path = 's3://<some-bucket-path>'

dataset = clearml.Dataset.create(dataset_project='project_name', dataset_name='trial_01')

dataset.add_files(path=local_path, dataset_path=s3_path)
`
but I don’t see the files in the S3 bucket.

I also tried this:
`dataset.sync_folder(local_path=local_path, dataset_path=s3_path)`
and still no success. It seems like it is uploading the files to the ClearML server:
`>> dataset.get_default_storage()
' '`
It would be great if you could help me understand how to direct the dataset to upload the files to the s3_path.
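
For reference, `dataset_path` in `add_files` and `sync_folder` sets where the files sit inside the dataset, not where they are stored, which would explain why nothing shows up in the bucket. A minimal sketch of one way to point the upload at S3 from Python is below. It assumes the installed clearml version accepts an `output_url` argument on `Dataset.upload()` and that S3 credentials are already set up in clearml.conf. The helper name `upload_dataset_to_s3` and its `s3://` check are hypothetical, added only for illustration.

```python
def upload_dataset_to_s3(local_path, bucket_uri, project, name):
    """Create a dataset from local_path and upload its files to bucket_uri.

    Hypothetical helper for illustration; not part of ClearML.
    """
    # Guard added for this sketch only: fail early on a non-S3 destination.
    if not bucket_uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {bucket_uri!r}")

    # Imported here so the module can be loaded without a configured ClearML setup.
    from clearml import Dataset

    dataset = Dataset.create(dataset_project=project, dataset_name=name)
    # No dataset_path here: files keep their relative layout inside the dataset.
    dataset.add_files(path=local_path)
    # Assumption: the installed clearml version accepts output_url on upload().
    dataset.upload(output_url=bucket_uri)
    dataset.finalize()
    return dataset.id


# Example call, using the placeholders from the post above:
# upload_dataset_to_s3('<some-local-path>', 's3://<some-bucket-path>', 'project_name', 'trial_01')
```

If your version does not accept `output_url`, setting `sdk.development.default_output_uri` in clearml.conf is the other route suggested in this thread. The CLI may offer a comparable `--storage` option on `clearml-data upload`; check `clearml-data upload --help` for your version.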

  
  
Posted 3 years ago

Thx! Will try it tomorrow.

  
  
Posted 3 years ago

It’s a big dataset.

  
  
Posted 3 years ago

I want to upload the dataset to S3. Is there a flag that tells it to do so?

  
  
Posted 3 years ago

Thx!

  
  
Posted 3 years ago