Answered
How can I log my configuration like this? I have a dict params = {'data':{'data_key':123}, 'model':{'model_key':123}}, but it becomes data/data_key instead of a foldable config. In addition, I don't want to name it "General"; where can I change it?

How can I log my configuration like this?
I have a dict params = {'data':{'data_key':123}, 'model':{'model_key':123}}, but it becomes data/data_key instead of a foldable config.

In addition, I don't want to name it "General". Where can I change it?

  
  
Posted 3 years ago

Answers 16


"A line-by-line diff is probably not useful for my data config."

You could request a better configuration diff feature 🙂 Feel free to add it on GitHub.

"But this also means I have to load all the configuration into a dictionary first."

Yes 😞

  
  
Posted 3 years ago

Using the configuration directly is actually worse than using a dictionary for hyperparameters. It does the diff line by line (notice the right experiment).

  
  
Posted 3 years ago

EnviousStarfish54, generally speaking, the hyperparameters are flat key/value pairs. You can have as many sections as you like, but inside each section there are only key/value pairs. If you pass a nested dict, it will be stored as path/to/key:value (as you witnessed).
If you need to store a more complicated configuration dict (nesting, lists, etc.), use connect_configuration; it will convert your dict to text (in HOCON format) and store that.
In both cases you can edit the configuration, and then, when running with the trains-agent, the code will get the values from the trains-server (instead of the values set in code). This is the "connect" idea.
Make sense?
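
To illustrate the path/to/key:value storage described above, here is a minimal pure-Python sketch of flattening a nested dict. The flatten helper is hypothetical and written only for illustration; it is not the library's own code, and the "/" separator is assumed from the data/data_key result reported in the question.

```python
def flatten(nested, parent_key="", sep="/"):
    """Flatten a nested dict into {"path/to/key": value} pairs (illustrative helper)."""
    flat = {}
    for key, value in nested.items():
        # Build the slash-separated path for this key
        path = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the path prefix along
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat


params = {"data": {"data_key": 123}, "model": {"model_key": 123}}
flat_params = flatten(params)
print(flat_params)  # {'data/data_key': 123, 'model/model_key': 123}
```

The same helper works for deeper nesting, which is the manual flattening step needed if a multi-level config has to go through task.connect().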

  
  
Posted 3 years ago

Hi EnviousStarfish54
I think this is what you are after
task.connect_configuration(my_dict_here, name='my_section_name')
BTW:
If you do task.connect(a_flat_dict, name='new section'), you will have the key/value pairs in a section called "new section".
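
Putting the two calls above side by side, a small sketch might look like this. It assumes task is the object returned by Task.init(...) in trains; the log_config wrapper and its argument names are hypothetical, and only the two method calls come from this answer.

```python
def log_config(task, nested_config, flat_params):
    """Store a nested config as one named configuration object and a flat
    dict as a named hyperparameter section (hypothetical wrapper).

    `task` is assumed to be created elsewhere, for example:
        from trains import Task
        task = Task.init(project_name="examples", task_name="config logging")
    (the project and task names here are placeholders).
    """
    # Nested dicts / lists: stored as text under the given name
    stored_config = task.connect_configuration(nested_config, name="my_section_name")
    # Flat key/value pairs: stored in their own section instead of "General"
    stored_params = task.connect(flat_params, name="new section")
    return stored_config, stored_params
```

Using the returned objects rather than the originals in the rest of the code is assumed here to be what lets edited values from the server take effect when run through the agent.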

  
  
Posted 3 years ago

Yes, EnviousStarfish54, the comparison is line by line and made only against the left experiment. As in any multi-experiment comparison, you have to set a baseline, which here is always the left column. Note that you can reorder the columns and the comparison will be updated.

  
  
Posted 3 years ago

I tried passing the dictionary, but the output is not ideal. I would like to have a nested dict, like the "execution" > "Source" layout.

As the number of parameters can be large, having some hierarchy in the UI would make comparison much easier.

  
  
Posted 3 years ago

I am not sure what the difference is between logging with "configuration" and with "hyperparameters". For now I am only using it for logging. I guess hyperparameters have a special meaning if I want to use "trains" for some other features.

  
  
Posted 3 years ago

I use YAML configs for data and model. Each of them is a nested YAML (possibly more than 2 layers), so that won't be a flexible solution and I would need to flatten the dictionary manually.

  
  
Posted 3 years ago

BTW: if you make the right column the baseline (i.e. move it to the left), you will get what you probably expected.

  
  
Posted 3 years ago

In this case, I would rather use task.connect(). A line-by-line diff is probably not useful for my data config. As shown in the example, shifting one line makes all the remaining lines show up as different.

But this also means I have to load all the configuration into a dictionary first.

  
  
Posted 3 years ago

Thanks for your help. I will stick with task.connect() for now. I have submitted a GitHub issue. Thanks again, AgitatedDove14.

  
  
Posted 3 years ago

NICE!

  
  
Posted 3 years ago

image

  
  
Posted 3 years ago

https://github.com/quantumblacklabs/kedro-examples/blob/master/kedro-tutorial/conf/base/catalog.yml

I am actually using Kedro (a pipeline library); you can check out the YAML config here. There will be a lot of cases where I need to insert a new argument or dataset in between existing entries.

  
  
Posted 3 years ago

If this is a simple two-level nesting, you can use the section name:
task.connect(param['data'], name='data')
task.connect(param['model'], name='model')
Would that help?
The comparison reflects the way the data is stored in the configuration context. That means section name and key/value pairs (which is what the code above does).
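
The same per-section idea can be written as a loop so that new top-level sections do not need their own line. This is a sketch only: connect_by_section is a hypothetical helper, and task is assumed to be the object returned by Task.init(...).

```python
def connect_by_section(task, params):
    """Connect each top-level key of `params` as its own named section
    (hypothetical helper; `task` is assumed to come from Task.init)."""
    connected = {}
    for section, values in params.items():
        # One section per top-level key, e.g. "data" and "model"
        connected[section] = task.connect(values, name=section)
    return connected
```

With params = {'data': {'data_key': 123}, 'model': {'model_key': 123}} this issues one task.connect call per section. Anything nested deeper than two levels would still be flattened to path/to/key inside its section, as described earlier in the thread.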

  
  
Posted 3 years ago

"I use YAML configs for data and model. Each of them is a nested YAML (possibly more than 2 layers), so that won't be a flexible solution and I would need to flatten the dictionary manually."

Yes, you are correct. The recommended option would be to store it with task.connect_configuration; its goal is to store these types of configuration files/objects.
You can also store the YAML file itself directly: just pass a Path object instead of a dict/string.

  
  
Posted 3 years ago