Answered
I am currently running a ClearML server and have a few questions about dataset management.

I am currently running a ClearML server and have a few questions about dataset management.
https://clear.ml/docs/latest/docs/getting_started/ds/best_practices#train-remotely This is the situation I am currently in: I would like to develop locally and train remotely. So far I have a custom_dataset_large (let's assume the original dataset is 200 GB) and a mini version of it (less than 5 GB). I have uploaded custom_dataset_large to the file server via clearml-data, and I have a dataset ID for it. The dataset consists of a number of directories, which in turn contain the data on which a model needs to be trained. How can I parse custom_dataset_large without downloading a local cache of it? The only way I can see is the Dataset.get_local_copy() method, which I don't want to use because it downloads the complete dataset to the local dev machine, which doesn't have that much storage. What I am looking for is how to fetch the remote URL of the dataset that I uploaded via clearml-data.

  
  
Posted 3 years ago

Answers 4


I think the only way you can get it is from the task attribute:

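from clearml import Dataset

ds = Dataset.get(dataset_id="your dataset id")
# _task is a private attribute; this reads the URL of the artifact named "data" on the dataset task
ds_uri = ds._task.artifacts.get("data").url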
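If the goal is only to inspect the directory layout of the remote dataset without downloading it, a minimal sketch along these lines may help. It assumes the installed clearml version provides `Dataset.list_files()`, which returns the dataset-relative file paths from the dataset's metadata. The helper names `group_by_top_level_dir` and `list_remote_dataset` are hypothetical and only for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_top_level_dir(relative_paths):
    """Group dataset-relative file paths by their first directory component.

    Files that sit directly in the dataset root are grouped under ".".
    """
    groups = defaultdict(list)
    for path in relative_paths:
        parts = PurePosixPath(path).parts
        key = parts[0] if len(parts) > 1 else "."
        groups[key].append(path)
    return dict(groups)


def list_remote_dataset(dataset_id):
    """Return the file listing of a dataset, grouped by top-level directory.

    Hypothetical helper: assumes Dataset.list_files() is available in the
    installed clearml version. No file contents are fetched here.
    """
    # Imported lazily so group_by_top_level_dir works without clearml installed.
    from clearml import Dataset

    ds = Dataset.get(dataset_id=dataset_id)
    return group_by_top_level_dir(ds.list_files())
```

Calling `list_remote_dataset("your dataset id")` would then give a dictionary of directory names to file paths. This only exposes the listing; reading the files themselves still requires downloading the data.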

  
  
Posted 3 years ago

Yes, but for the dataset located on the server, so that I can parse it like a normal local copy.

  
  
Posted 3 years ago

Seems to work, thanks. But it's not as handy as the .get_local_copy() method, since this again returns a .zip path. I would like to receive a local path that is easily parsable, like the one returned by the method described above. I will try to raise a feature request.

  
  
Posted 3 years ago

Hi BitterLeopard33 ,

Do you want to have the URI of the data section in the dataset task?

  
  
Posted 3 years ago