Answered

Hi, which database services are used to store the logged data such as scalar, text, matrix, etc? How can I query these for a downstream process programmatically instead of just within the web UI? If scalar data is stored in mongoDB, can I use pymongo to retrieve it?

  
  
Posted one year ago

Answers 10


Hi SarcasticSparrow10

which database services are used to...

MongoDB and Elasticsearch.
You can query everything through the ClearML interface, or talk to the databases directly.
The full REST API reference is here:
https://clear.ml/docs/latest/docs/references/api/endpoints
For an easier, more Pythonic interface, you can use the APIClient.
See the example here:
https://github.com/allegroai/clearml/blob/master/examples/services/cleanup/cleanup_service.py
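
A minimal sketch of the APIClient route, modeled loosely on the cleanup service example linked above. The helper names `connect` and `list_completed_tasks` and the `status` filter are illustrative assumptions, not part of ClearML; `connect()` needs the `clearml` package and a configured `clearml.conf`.

```python
def connect():
    """Create an APIClient (assumes `clearml` is installed and configured)."""
    from clearml.backend_api.session.client import APIClient

    return APIClient()


def list_completed_tasks(client, page_size=100):
    """Page through completed tasks and return (id, name) pairs."""
    found = []
    page = 0
    while True:
        # tasks.get_all is the same endpoint the cleanup service pages through
        batch = list(
            client.tasks.get_all(
                status=["completed"],
                only_fields=["id", "name"],
                order_by=["-last_update"],
                page_size=page_size,
                page=page,
            )
        )
        if not batch:
            break
        found.extend((task.id, task.name) for task in batch)
        page += 1
    return found
```

With a reachable server, `list_completed_tasks(connect())` would return the pairs for every completed task the credentials can see.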

What is the exact use case you have in mind?

  
  
Posted one year ago

Hi AgitatedDove14 Thanks, I'll check these out.

What is the exact use case you have in mind?

I want to store some additional data that is not relevant to training a model. For example, store inference results, explanations, etc. and then use them in a different process. I currently use a separate database for this.

Btw, I had been busy with another project and hadn't logged in here for some time. I see that you guys have made a lot of progress in the last two months! I'm excited to dig in 🙂

  
  
Posted one year ago

Ok, I will look into artifacts. However, I will probably need high-performance query functionality. For example, say I have a model and hundreds of thousands of inference records for that model. I want to be able to query those efficiently. My guess is that this wouldn't be possible with artifacts, but it should be possible with Task.get_tasks.

  
  
Posted one year ago

I have a model and hundreds of thousands of inference records for that model.

What would be the query? Are you reporting 100+ different scalars?

  
  
Posted one year ago

What would be the query? Are you reporting 100+ different scalars?

At the moment I am not reporting any scalars related to inference. I'm only reporting data related to training a model. But I would like to report records that result from an inference process. For example the record would contain key_1, key_2, datetime, pred_1, pred_2 ... pred_n. I would have about 20 scalars if each of these fields is reported as a scalar.

The query can be a simple filtering criteria matching some keys or a more complex one which requires aggregation.

  
  
Posted one year ago

Ohh, if that is the case and this is a constant stream of inference results, then yes, you should push it to a DB that supports streaming.
Simple SQL tables would work, but for real scale I would push the records into a Kafka stream, then pull them (serially) somewhere else and push them into a DB.
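
To make the "simple SQL tables" option concrete, here is a small sketch using Python's built-in `sqlite3`. The table name, the sample rows, and the helpers `records_for` and `mean_pred_1_by_key` are made up for illustration; the columns follow the record layout described earlier in the thread (key_1, key_2, datetime, pred_1, pred_2). The Kafka stage is not shown.

```python
import sqlite3

# In-memory database for illustration; a real setup would use a persistent DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE inference (
        key_1 TEXT,
        key_2 TEXT,
        created_at TEXT,
        pred_1 REAL,
        pred_2 REAL
    )
    """
)
# Index the lookup keys so filtering stays fast as the table grows.
conn.execute("CREATE INDEX idx_inference_keys ON inference (key_1, key_2)")

rows = [
    ("model_a", "user_1", "2021-01-01T00:00:00", 0.2, 0.8),
    ("model_a", "user_2", "2021-01-01T00:05:00", 0.4, 0.6),
    ("model_b", "user_1", "2021-01-02T00:00:00", 0.9, 0.1),
]
conn.executemany("INSERT INTO inference VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def records_for(conn, key_1):
    """Simple filter: all records for one key, oldest first."""
    query = "SELECT * FROM inference WHERE key_1 = ? ORDER BY created_at"
    return conn.execute(query, (key_1,)).fetchall()


def mean_pred_1_by_key(conn):
    """Aggregation: average pred_1 per key_1."""
    query = "SELECT key_1, AVG(pred_1) FROM inference GROUP BY key_1"
    return dict(conn.execute(query).fetchall())
```

The same two query shapes (key filter and grouped aggregate) carry over to any SQL database.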

  
  
Posted one year ago

Got it. That makes sense. Thanks!

  
  
Posted one year ago

Also it might be better (although not necessary) to have a separate collection for storing inference results for better organization.

  
  
Posted one year ago

For example, store inference results, explanations, etc. and then use them in a different process. I currently use a separate database for this.

You can use artifacts for complex data, then retrieve them programmatically.
Or you can manually report scalars, plots, etc. with the Logger class, and retrieve them with task.get_last_scalar_metrics.
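
A rough sketch of the report-then-retrieve flow. The function names, the "inference" title, and the "pred_1" series are placeholders; `report_and_read_back` needs `clearml` and a configured server, and it assumes `get_last_scalar_metrics` returns a nested `{title: {series: {"last", "min", "max"}}}` dict. The summary may take a moment to show up on the server after reporting.

```python
def report_and_read_back(project_name, task_name, values):
    """Report `values` as one scalar series, then fetch the last/min/max summary."""
    from clearml import Task  # imported lazily; needs a configured ClearML setup

    task = Task.init(project_name=project_name, task_name=task_name)
    logger = task.get_logger()
    for iteration, value in enumerate(values):
        logger.report_scalar(
            title="inference", series="pred_1", value=value, iteration=iteration
        )
    task.flush(wait_for_uploads=True)  # push pending reports before reading back
    return task.get_last_scalar_metrics()


def flatten_last_scalars(metrics):
    """Turn the nested summary dict into sorted (title, series, last) rows."""
    return sorted(
        (title, series, stats.get("last"))
        for title, by_series in metrics.items()
        for series, stats in by_series.items()
    )
```

The flattened rows are easier to hand to a downstream process than the nested dict.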

I see that you guys have made a lot of progress in the last two months! I'm excited to dig in

Thank you!

You can further dig with Task.get_tasks to get / filter / sort tasks based on any metric reported.
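
As a hedged sketch of this, the snippet below fetches tasks with Task.get_tasks and ranks them by the last value of one reported scalar. Note that the ranking here is done client-side in Python, which is a simplification and not the server-side metric filtering mentioned above. `fetch_tasks` and `rank_by_last_metric` are hypothetical helper names, and the `status` filter is an assumption.

```python
def fetch_tasks(project_name):
    """Fetch completed tasks in a project (needs `clearml` and a configured server)."""
    from clearml import Task

    return Task.get_tasks(
        project_name=project_name, task_filter={"status": ["completed"]}
    )


def rank_by_last_metric(tasks, title, series, descending=True):
    """Return (task_id, last_value) pairs sorted by one scalar's last value."""
    ranked = []
    for task in tasks:
        stats = task.get_last_scalar_metrics().get(title, {}).get(series)
        # Skip tasks that never reported this title/series.
        if stats is not None and "last" in stats:
            ranked.append((task.id, stats["last"]))
    return sorted(ranked, key=lambda pair: pair[1], reverse=descending)
```

For example, `rank_by_last_metric(fetch_tasks("my_project"), "inference", "pred_1")` would list task IDs from highest to lowest last `pred_1`, where "my_project" is a placeholder project name.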

  
  
Posted one year ago

👍

  
  
Posted one year ago