Answered
Hi, I'M Running

Hi,
I'm running clearml==1.8.4rc0
When creating a Dataset with the following code, the dataset that is created includes ALL the files in the bucket, not only the files within the folder:

from clearml import Dataset

bucket_path=" "
ds = Dataset.create(
    dataset_project=dataset_project,
    dataset_name=dataset_name,
)
ds.add_external_files(source_url=bucket_path)
ds.finalize(auto_upload=True)

Is this the desired behavior? What would be the best practice for creating Datasets from buckets, especially regarding finalizing, uploading/downloading, and streaming the data?

  
  
Posted one year ago

Answers 6


SmugDolphin23 Where can I check the latest RC? I was not able to find it in the clearml GitHub repo

  
  
Posted one year ago

SmugDolphin23 I'll be happy to check out the RC once it is fixed

  
  
Posted one year ago

Thx SmugDolphin23 - looks OK

  
  
Posted one year ago

Hi OutrageousSheep60! We haven't released an RC yet; we will a bit later today. We will ping you when it's ready. Sorry for the delay.

  
  
Posted one year ago

OutrageousSheep60 1.8.4rc1 is out. Can you please try it? pip install -U clearml==1.8.4rc1

  
  
Posted one year ago

Hi OutrageousSheep60 ! Regarding your questions:
1. No, it's not. We will have an RC that fixes that ASAP, hopefully by tomorrow.
2. You can use add_external_files, which you already do. If you wish to upload local files to the bucket, you can set the output_url of the dataset to point to the bucket you wish to upload the data to. See the parameter here: https://clear.ml/docs/latest/docs/references/sdk/dataset/#upload . Note that you CAN mix external_files and regular files.
3. We don't have streaming capabilities in the datasets, but you could "partition" your download. Check the part and num_parts parameters here: https://clear.ml/docs/latest/docs/references/sdk/dataset/#get_local_copy .
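
To make points 2 and 3 concrete, here is a minimal sketch, not an official recipe. It assumes clearml is installed and configured with credentials for the bucket; the helper names (create_mixed_dataset, download_part) and every URL or path passed to them are hypothetical placeholders. It mixes external links with local files, uploads the local files via output_url, and later fetches a single part of the dataset.

```python
def create_mixed_dataset(project, name, external_url, local_dir, output_url):
    """Register external files as links, add local files, and upload the local ones."""
    # Imported lazily so this module can be loaded without a configured ClearML setup.
    from clearml import Dataset

    ds = Dataset.create(dataset_project=project, dataset_name=name)
    # External files stay where they are; only links to them are stored.
    ds.add_external_files(source_url=external_url)
    # Local files are regular dataset entries and need an upload step.
    ds.add_files(path=local_dir)
    # output_url tells ClearML where to upload the local files (e.g. your bucket).
    ds.upload(output_url=output_url)
    ds.finalize()
    return ds.id


def download_part(project, name, part, num_parts):
    """Fetch one part of the dataset; part is a 0-based index below num_parts."""
    from clearml import Dataset

    ds = Dataset.get(dataset_project=project, dataset_name=name)
    # Returns the path of a local folder holding only the selected part.
    return ds.get_local_copy(part=part, num_parts=num_parts)
```

For example, each of four workers could call download_part(project, name, part=rank, num_parts=4) with its own rank, so that no single worker pulls the whole dataset. This is partitioning rather than true streaming, as noted above.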

  
  
Posted one year ago