Answered
Hi, I Have A Question About

Hi, I have a question about clearml-data.
It looks like the CLI remembers the previously created/accessed dataset.
Where is that information saved?
Or how can I retrieve it other than by parsing the console output?

  
  
Posted 3 years ago

Answers 9


Hi AgitatedDove14
Thanks, that is it!

Yeah, I have noticed the --id option.
What I want is to automate creating a dataset from a set of files.
That requires the dataset id after running clearml-data create ... .
Reading ~/.clearml_data.json looks much better than parsing the command output.

  
  
Posted 3 years ago

Hi SoggyFrog26
Yes, it is stored in ~/.clearml_data.json
Note that you can always override it by passing --id dataset_id
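
For scripting, a minimal sketch of reading that file could look like the following. The thread only confirms the file location and that it holds the last dataset id; the JSON key name "id" is an assumption here (it is exposed as a parameter for that reason), and read_last_dataset_id is a hypothetical helper, not part of clearml.

```python
import json
from pathlib import Path
from typing import Optional


def read_last_dataset_id(state_file: Optional[Path] = None, key: str = "id") -> Optional[str]:
    """Return the dataset id stored in the clearml-data state file, or None.

    ASSUMPTION: the file is a JSON object and the id lives under `key`
    ("id" by default). Check your own ~/.clearml_data.json to confirm.
    """
    path = state_file or Path.home() / ".clearml_data.json"
    try:
        state = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # No state file yet, or the content is not valid JSON.
        return None
    if not isinstance(state, dict):
        return None
    value = state.get(key)
    return str(value) if value else None


if __name__ == "__main__":
    # Prints the id (or None) so the result can be piped into other tools.
    print(read_last_dataset_id())
```

Since the file format is an internal detail of the CLI, a script built on this may break if a future release changes how the id is stored.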

  
  
Posted 3 years ago

Well, yeah, it would be cleaner if we could go fully Python.
But our system is already built and running, and now we are planning to add some training functionality.
The training part can be written in Python, but the sample-collecting part will be deeply connected to the existing system, which is not written in Python.
For now, using the CLI looks much more reasonable for that part.

  
  
Posted 3 years ago

I think it would be nicer if the CLI had a subcommand to show the content of ~/.clearml_data.json .

Actually, it only stores the last dataset id at the moment, so not much 🙂
But maybe we should have a command that just outputs the current dataset id, which would make it easier to grab and pipe.
WDYT?

  
  
Posted 3 years ago

But maybe we should have a cmd line that just outputs the current datasetid, this means it will be easier to grab and pipe

That sounds good.
It definitely helps!

  
  
Posted 3 years ago

SoggyFrog26 there is a full Pythonic interface. Why don't you use that instead? It is much cleaner 🙂
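
As a rough sketch of what that could look like with the clearml Dataset class: the function name, the argument names, and the dataset_cls injection hook are illustrative choices, not something from this thread. Running it for real assumes clearml is installed and configured against a server.

```python
from typing import Any, Optional


def create_dataset_from_folder(
    name: str,
    project: str,
    folder: str,
    dataset_cls: Optional[Any] = None,
) -> str:
    """Create a dataset from a local folder and return its id.

    `dataset_cls` is a hypothetical hook for dry runs; by default the
    real clearml.Dataset class is imported and used.
    """
    if dataset_cls is None:
        # Imported lazily so the module loads even without clearml installed.
        from clearml import Dataset as dataset_cls

    ds = dataset_cls.create(dataset_name=name, dataset_project=project)
    ds.add_files(path=folder)  # register the local files
    ds.upload()                # push the files to storage
    ds.finalize()              # close the dataset version
    return ds.id
```

The id comes back as a return value, so there is nothing to parse from console output or from the state file.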

  
  
Posted 3 years ago

SoggyFrog26 you'll have it in the next RC 🙂
I'm not sure what the release plan is. I know one should be out today or tomorrow; worst case, it will be in the one after 🙂

  
  
Posted 3 years ago

Sounds good, thanks!

  
  
Posted 3 years ago

I think it would be nicer if the CLI had a subcommand to show the content of ~/.clearml_data.json .
That way, users can more confidently query the dataset id that the CLI is currently focused on.
My scripts would also keep working if the CLI changes how it stores the dataset id in the future.

  
  
Posted 3 years ago