Answered
Hi, I'm Running

Hi,
I'm running clearml==1.8.4rc0
When creating a Dataset with the following code, the dataset that is created includes ALL the files in the bucket, not only the files within the folder:
from clearml import Dataset
bucket_path = " "
ds = Dataset.create(
    dataset_project=dataset_project,
    dataset_name=dataset_name,
)
ds.add_external_files(source_url=bucket_path)
ds.finalize(auto_upload=True)
Is this the desired behavior? What would be the best practice for creating Datasets from buckets, especially regarding finalizing, uploading/downloading, and streaming the data?

  
  
Posted 2 years ago

Answers 6


Hi OutrageousSheep60! We haven't released an RC yet; we will a bit later today, though. We will ping you when it's ready. Sorry for the delay.

  
  
Posted 2 years ago

SmugDolphin23 Where can I check the latest RC? I was not able to find it in the clearml GitHub repo.

  
  
Posted 2 years ago

OutrageousSheep60 1.8.4rc1 is out. Can you please try it? pip install -U clearml==1.8.4rc1

  
  
Posted 2 years ago

SmugDolphin23 I'll be happy to check out the RC once it is fixed.

  
  
Posted 2 years ago

Thx SmugDolphin23 - looks OK

  
  
Posted 2 years ago

Hi OutrageousSheep60! Regarding your questions:
No, it's not the desired behavior. We will have an RC that fixes that ASAP, hopefully by tomorrow.
You can use add_external_files, which you already do. If you wish to upload local files to the bucket, you can set the output_url of the dataset to point to the bucket you wish to upload the data to. See the parameter here: https://clear.ml/docs/latest/docs/references/sdk/dataset/#upload . Note that you CAN mix external files and regular files.
We don't have streaming capabilities in the datasets, but you could "partition" your download. Check the part and num_parts parameters here: https://clear.ml/docs/latest/docs/references/sdk/dataset/#get_local_copy .
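To make the last two points concrete, here is a minimal sketch of how they could fit together. It is not an official recipe: the helper names (create_folder_dataset, fetch_part) and their arguments are made up for illustration, and it assumes a configured ClearML server plus credentials for the bucket. clearml is imported inside the functions so the file can be loaded without it installed.

```python
def create_folder_dataset(dataset_project, dataset_name, folder_url,
                          local_dir=None, output_url=None):
    """Register files under folder_url as external links and, optionally,
    upload local files to output_url. Hypothetical helper, not part of clearml."""
    from clearml import Dataset  # lazy import; requires `pip install clearml`

    ds = Dataset.create(
        dataset_project=dataset_project,
        dataset_name=dataset_name,
    )
    # External files stay in the bucket; only links are recorded.
    ds.add_external_files(source_url=folder_url)
    if local_dir is not None:
        # Regular files can be mixed with external ones.
        ds.add_files(path=local_dir)
        # output_url selects where the local files are uploaded.
        ds.upload(output_url=output_url)
    ds.finalize(auto_upload=True)
    return ds.id


def fetch_part(dataset_id, part, num_parts):
    """Download only one partition of a dataset (part is 0-based).
    Hypothetical helper wrapping Dataset.get_local_copy."""
    if num_parts < 1 or not 0 <= part < num_parts:
        raise ValueError("part must satisfy 0 <= part < num_parts")
    from clearml import Dataset  # lazy import; requires `pip install clearml`

    ds = Dataset.get(dataset_id=dataset_id)
    # Returns the local folder holding this partition's files.
    return ds.get_local_copy(part=part, num_parts=num_parts)
```

For example, with 4 workers, worker i could call fetch_part(dataset_id, part=i, num_parts=4) so that each one downloads only its share instead of the whole dataset.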

  
  
Posted 2 years ago