How can I log my configuration like this? I have a dict params = {'data': {'data_key': 123}, 'model': {'model_key': 123}}, but it becomes data/data_key instead of a foldable config. In addition, I don't want to name it "General"; where can I change it?

Answered

How can I log my configuration like this?
I have a dict params = {'data': {'data_key': 123}, 'model': {'model_key': 123}}, but it becomes data/data_key instead of a foldable config.

In addition, I don't want to name it "General". Where can I change it?

  				
Posted 4 years ago by EnviousStarfish54


Answers 16

BTW: if you make the right column the baseline (i.e. move it to the left), you will get what you probably expected.

  				
Posted 4 years ago by AgitatedDove14

EnviousStarfish54, generally speaking, hyperparameters are flat key/value pairs. You can have as many sections as you like, but each section holds only key/value pairs. If you pass a nested dict, it will be stored as path/to/key:value (as you witnessed).
If you need to store a more complicated configuration dict (nesting, lists, etc.), use task.connect_configuration; it will convert your dict to text (in HOCON format) and store that.
In both cases you can edit the configuration, and when running with trains-agent the code will get the values from the trains-server (instead of the values set in code). This is the "connect" idea.
Make sense?
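
To make the path/to/key storage concrete, here is a minimal sketch of how a nested dict collapses into flat key/value pairs. The `flatten` helper is hypothetical and written only for illustration; it is not part of trains, and the library's internal handling may differ in detail.

```python
def flatten(nested, sep="/", prefix=""):
    """Collapse a nested dict into flat 'path/to/key' -> value pairs."""
    flat = {}
    for key, value in nested.items():
        # Build the full path for this key, e.g. "data/data_key".
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections and merge their flattened keys.
            flat.update(flatten(value, sep=sep, prefix=path))
        else:
            flat[path] = value
    return flat


params = {"data": {"data_key": 123}, "model": {"model_key": 123}}
flat_params = flatten(params)
print(flat_params)
```

With the dict from the question, the keys come out as data/data_key and model/model_key, which matches the flat layout described above.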

  				
Posted 4 years ago by AgitatedDove14

I am not sure what the difference is between logging under "configuration" and under "hyperparameters". For now I am only using it for logging; I guess hyperparameters have a special meaning if I want to use trains for other features.

  				
Posted 4 years ago by EnviousStarfish54

Thanks for your help. I will stick with task.connect() for now. I have submitted a GitHub issue. Thanks again, AgitatedDove14.

  				
Posted 4 years ago by EnviousStarfish54

Yes EnviousStarfish54, the comparison is line by line and is made only against the left experiment. As in any multi-experiment comparison, you have to set the baseline, which is always the left column here. Note that you can reorder the columns and the comparison will be updated.

  				
Posted 4 years ago by AgitatedDove14

  				

In this case I would rather use task.connect(); a line-by-line diff is probably not useful for my data config. As shown in the example, shifting one line makes all the remaining lines show as different.

But this also means I have to load all the configuration into a dictionary first.
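
To illustrate the concern, here is a small sketch that contrasts a naive positional line comparison with a key-based comparison. It assumes the comparison is purely positional, as described in this thread; the actual UI logic may differ. Both helper functions are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for keys that exist on only one side


def positional_diff(left_lines, right_lines):
    """Count the positions at which two lists of lines differ."""
    return sum(1 for a, b in zip_longest(left_lines, right_lines) if a != b)


def key_diff(left, right):
    """Return the sorted keys whose values differ or that exist on one side only."""
    keys = left.keys() | right.keys()
    return sorted(
        k for k in keys if left.get(k, _MISSING) != right.get(k, _MISSING)
    )


# Text form: one new line inserted at the top shifts every following line.
left_lines = ["a: 1", "b: 2", "c: 3"]
right_lines = ["new: 0", "a: 1", "b: 2", "c: 3"]
shifted = positional_diff(left_lines, right_lines)

# Dict form: the same change touches a single key.
left = {"a": 1, "b": 2, "c": 3}
right = {"new": 0, "a": 1, "b": 2, "c": 3}
changed = key_diff(left, right)

print(shifted, changed)
```

In the text form every position differs after the insertion, while the key-based view reports only the added key. That is the trade-off between storing the config as text and connecting it as key/value pairs.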

  				
Posted 4 years ago by EnviousStarfish54

https://github.com/quantumblacklabs/kedro-examples/blob/master/kedro-tutorial/conf/base/catalog.yml

I am actually using Kedro (a pipeline library); you can check out the YAML config at the link above. There will be a lot of cases where I need to insert a new argument or dataset in between.

  				
Posted 4 years ago by EnviousStarfish54

NICE!

  				
Posted 4 years ago by AgitatedDove14

"a line-by-line diff is probably not useful for my data config"

You could request a better configuration diff feature 🙂 Feel free to add it on GitHub.

"But this also means I have to load all the configuration into a dictionary first."

Yes 😞

  				
Posted 4 years ago by AgitatedDove14

Using configuration directly is actually worse than using a dictionary for hyperparameters. It does the diff line by line (notice the right experiment).

  				
Posted 4 years ago by EnviousStarfish54

If this is a simple two-level nesting, you can use the section name:
task.connect(params['data'], name='data')
task.connect(params['model'], name='model')
Would that help?
The comparison reflects the way the data is stored in the configuration context, that is, section name and key/value pairs (which is what the code above does).
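
For more than two sections, the same idea can be written as a loop. This is only a sketch: `connect_sections` is a hypothetical helper, and `task` is assumed to be the object returned by Task.init(...) and to accept task.connect(dict, name=...) as used above.

```python
def connect_sections(task, params):
    """Connect each top-level key of `params` as its own named section.

    Each value is expected to be a flat dict of key/value pairs.
    Returns the connected dicts, so the code can read back values that
    were overridden when running remotely.
    """
    connected = {}
    for section, values in params.items():
        # One task.connect call per section, named after the top-level key.
        connected[section] = task.connect(values, name=section)
    return connected
```

Calling connect_sections(task, params) with the dict from the question would create one section named data and one named model.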

  				
Posted 4 years ago by AgitatedDove14

Hi EnviousStarfish54
I think this is what you are after:
task.connect_configuration(my_dict_here, name='my_section_name')
BTW: if you do task.connect(a_flat_dict, name='new section'), you will have the key/value pairs in a section named "new section".
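
A minimal sketch of choosing between the two calls, assuming `task` comes from Task.init(...). The helpers `is_flat` and `log_params` are hypothetical names introduced here for illustration; only task.connect and task.connect_configuration come from the thread.

```python
def is_flat(params):
    """True if no value in the dict is itself a dict or a list."""
    return not any(isinstance(v, (dict, list)) for v in params.values())


def log_params(task, params, name):
    """Log `params` under the section `name`.

    Flat dicts go through task.connect (key/value pairs in a named section).
    Anything nested goes through task.connect_configuration (stored as text).
    """
    if is_flat(params):
        return task.connect(params, name=name)
    return task.connect_configuration(params, name=name)
```

In both branches the `name` argument sets the section title, which is how the default "General" name is avoided.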

  				
Posted 4 years ago by AgitatedDove14

I tried passing the dictionary, but the output is not ideal. I would like a nested layout like the "execution" > "Source" one.

As the number of parameters can be large, having some hierarchy in the UI will make comparison much easier.

  				
Posted 4 years ago by EnviousStarfish54

"I use YAML configs for data and model. Each of them is a nested YAML (possibly more than 2 layers), so it won't be a flexible solution and I would need to flatten the dictionary manually."

Yes, you are correct. The recommended option is to store it with task.connect_configuration; its goal is to store these types of configuration files/objects.
You can also store the YAML file itself directly: just pass a Path object instead of a dict/string.
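
A rough sketch of the Path variant, assuming `task` comes from Task.init(...). The helper name `connect_yaml_file` is hypothetical. It uses the standard library pathlib.Path; that this is accepted is an assumption, since some releases may expect pathlib2.Path instead, so check against your installed version.

```python
from pathlib import Path


def connect_yaml_file(task, yaml_path, name):
    """Store a YAML file as-is under the configuration section `name`.

    Returns whatever task.connect_configuration returns; the calling code
    should load the configuration from that return value rather than from
    `yaml_path`, so that edited values are picked up when running remotely.
    """
    path = Path(yaml_path)
    if not path.is_file():
        # Fail early with a clear error instead of passing a bad path along.
        raise FileNotFoundError(path)
    return task.connect_configuration(path, name=name)
```

The file is stored without being parsed into a dict first, so the YAML text stays in its original form.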

  				
Posted 4 years ago by AgitatedDove14

I use YAML configs for data and model. Each of them is a nested YAML (possibly more than 2 layers), so it won't be a flexible solution and I would need to flatten the dictionary manually.

  				
Posted 4 years ago by EnviousStarfish54
