NICE! CurvedHedgehog15 cool stuff! and my pleasure 🙂
Hello again, AgitatedDove14 and others. I am writing to let you know what works for us for optimizing hparams stored in a DictConfig:
` hparams_dict = OmegaConf.to_object(hparams)
# task.connect updates hparams_dict with the new hyperparameter values set by the optimizer
hparams_dict = task.connect(hparams_dict, name="HPO")
# convert the returned ProxyDictPostWrite back to a plain dict
hparams_dict = hparams_dict._to_dict()
# rebuild the hparams DictConfig which is used later in the training
hparams = OmegaConf.create(hparams_dict)
train(hparams) `
Using this notation, you can set the hyperparameters in the optimization example like this:
hyper_parameters=[ DiscreteParameterRange("HPO/training/seed", values=[100, 900]), ],
You can compare the hyperparameters of the experiments in the Configuration tab -> Hyperparameters -> HPO.
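For completeness, the optimizer side of our setup looks roughly like this (just a sketch: the metric title/series, queue name and base task id are placeholders, adjust them to your project):
` from clearml.automation import DiscreteParameterRange, HyperParameterOptimizer
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    # id of the training task that calls task.connect(hparams_dict, name="HPO")
    base_task_id="<base training task id>",
    # "HPO/..." matches the name we passed to task.connect above
    hyper_parameters=[DiscreteParameterRange("HPO/training/seed", values=[100, 900])],
    objective_metric_title="validation",  # placeholder: your reported metric title
    objective_metric_series="loss",       # placeholder: your reported metric series
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    execution_queue="default",            # placeholder queue name
    total_max_jobs=2,
)
optimizer.start()
optimizer.wait()
optimizer.stop() `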
I hope this dummy example will help somebody in the future. Thank you AgitatedDove14 for the cooperation.
I think updating the hparams back could be the solution for me. Just to be sure we mean the same thing:
` @hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    task = Task.init(hparams.task_name, hparams.tag)
    overrides = {'my_param': hparams.value}  # plain dict
    task.connect(overrides, name='overrides')
    # <update hparams according to overrides and use it further in the code>
    # overrides are changed by the optimization, thus hparams will also change `
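For the "<update hparams according to overrides>" step I imagine something like this (just a sketch, assuming the override keys use the same dot notation as the nested config):
` from omegaconf import OmegaConf

def apply_overrides(hparams, overrides):
    # write each (possibly dotted) key from the connected dict back into the DictConfig
    for key, value in overrides.items():
        OmegaConf.update(hparams, key, value)
    return hparams

hparams = apply_overrides(hparams, overrides) `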
But thanks to you I realized one thing: I use hparams further in the code, not normalize_and_flat_config(hparams).
This is the main issue, any reason not to use normalize_and_flat_config(hparams) later in the code?
or maybe update the hparams back?
I actually think I do this. The purpose of normalize_and_flat_config is just to take the hparams DictConfig with a possibly nested structure and flatten it into a plain dict with direct key-value pairs. For instance:
{ 'model' : { 'class' : Resnet, 'input_size' : [112, 112, 3] } }
is simplified to
{ 'model.class' : Resnet, 'model.input_size' : [112, 112, 3] }
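Roughly, the flattening is something like this (a simplified sketch of the helper, not the exact implementation):
` from omegaconf import DictConfig, OmegaConf

def normalize_and_flat_config(hparams: DictConfig) -> dict:
    # resolve the DictConfig into plain containers, then flatten nested keys to dot notation
    def _flatten(d, prefix=""):
        flat = {}
        for key, value in d.items():
            name = f"{prefix}{key}"
            if isinstance(value, dict):
                flat.update(_flatten(value, prefix=f"{name}."))
            else:
                flat[name] = value
        return flat

    return _flatten(OmegaConf.to_container(hparams, resolve=True)) `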
so in my case normalize_and_flat_config(hparams) is actually your overrides. And as you suggested, I tried to remove Task.connect_configuration and just keep Task.connect. But thanks to you I realized one thing: I use hparams further in the code, not normalize_and_flat_config(hparams). So, viewing it in types, I am using a DictConfig and not a dict, and that might be the problem. Wdyt?
assuming you have hparams.my_param, my suggestion is:
` @hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    task = Task.init(hparams.task_name, hparams.tag)
    overrides = {'my_param': hparams.value}
    task.connect(overrides, name='overrides')
    # when running remotely, this will print the value we put in "overrides/my_param"
    print(overrides['my_param'])
    # now we actually use overrides['my_param'] `
Make sense?
Yes, it is logged using some default name "OmegaConf". But still the hparams are taken from here, not from the Hyperparameters section
Oh, it is possible, I have never tried. I will.
CurvedHedgehog15 there is no need for: task.connect_configuration(configuration=normalize_and_flat_config(hparams), name="Hyperparameters")
Hydra is automatically logged for you, no?!
AgitatedDove14 Hmm, every training is run by a bash script calling train.py, which looks something like this:
` @hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    """
    Run training of the pytorch-lightning model
    """
    # Set process title
    setproctitle.setproctitle(f"{hparams.tag}-{get_user_name()}")
    try:
        # Init ClearML Task and connect configuration
        task = Task.init(hparams.task_name, hparams.tag)
        task.connect_configuration(
            configuration=normalize_and_flat_config(hparams),
            name="Hyperparameters",
        )
        task.connect(
            normalize_and_flat_config(hparams), name="Hyperparameters_for_optimization"
        )
        hparams.clearml_task_id = task.id
        # <another preparation for training using hparams> `
Interestingly, "Hyperparameters_for_optimization" is overwritten correctly, but "Hyperparameters" isn't, even though I tried to set Hydra/_allow_omegaconf_edit_ to true. So probably the logic of my program is incompatible with the optimizer. I supposed the optimizer could do something similar to Hydra overrides ( https://hydra.cc/docs/advanced/override_grammar/basic/ ) internally, but for my case it would probably be easier to use these Hydra overrides directly in the code. If you have some other notes and ideas, I would be glad to read them after Christmas, otherwise thank you very much for your participation 🙂
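To be concrete, by "using the Hydra overrides directly in the code" I mean something along these lines (just a sketch; OmegaConf can merge override-style "key=value" strings into the config):
` from omegaconf import OmegaConf

# override strings in the same dot notation as Hydra's override grammar
override_strings = ["training.seed=900"]
hparams = OmegaConf.merge(hparams, OmegaConf.from_dotlist(override_strings)) `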
I figured out the problem...
Nice!
Unfortunately, the hyperparameters in the configuration object seem to be superior to the hyperparameters in the Hyperparameters section
Hmm, what do you mean by that? How did you construct the code itself? (you should be able to "prioritize" one over the other)
AgitatedDove14 Unfortunately, the hyperparameters in the configuration object seem to be superior to the hyperparameters in the Hyperparameters section, at least in my case. Probably I will try to get rid of the OmegaConf configuration, copy it to the Hyperparameters section, and we will see
AgitatedDove14 I figured out the problem. I ran the dummy training (Task.init, training script, logging to ClearML) just "locally", thus the optimization task did not know the desired environment (Git repo, docker, etc.). I had to submit the task using clearml-task, and then the optimization tasks did not fail.
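(For the record, I believe a similar effect can be achieved from code with task.execute_remotely, which stops the local run and enqueues the task together with the git reference; this is a sketch, not what I actually ran, and the names are placeholders:)
` from clearml import Task

task = Task.init(project_name="<project>", task_name="<experiment>")
# stop the local process and enqueue the task for an agent; the git repo reference is recorded on the Task
task.execute_remotely(queue_name="default", exit_process=True) `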
CurvedHedgehog15 the agent has two modes of operation:
1. single script file (or jupyter notebook), where the Task stores the entire file on the Task itself.
2. multiple files, which is only supported if you are working inside a git repository (basically the Task stores a reference to the git repository and the agent pulls it from the git repo).
Seems you are missing the git repo, could that be?
AgitatedDove14 I ran a dummy training with task.connect to have the hparams in the Hyperparameters section, then I ran the HyperParameterOptimizer, and the "hpo_params/hparam" really is updated; however, the training fails (solver is a child folder with various configs in my repo):
File "/root/.clearml/venvs-builds/3.6/code/train.py", line 17, in <module>
    from solver.config.utils import (
ModuleNotFoundError: No module named 'solver'
The quickest workaround would be, in your final code, to just do something like:
` my_params_for_hpo = {'key': omegaconf.key}
task.connect(my_params_for_hpo, name='hpo_params')
call_training_with_value(my_params_for_hpo['key']) `
This will initialize my_params_for_hpo with the values from OmegaConf, and allow you to override them in the hyperparameter section (task.connect is two-way: in manual mode it stores the data on the Task, in agent mode it takes the values from the Task and puts them back into the dict)
AgitatedDove14 Yes, allowing users to modify the configuration_object would be great 🙂 ... Well, I will at least try to copy the OmegaConf object to the hyperparameters section and we will see in a moment whether the "quickest way" workaround works
Are hparams saved in the hyperparameter section superior to hparams saved in configuration objects?
well I'm not sure about "superior" but they are structured, as opposed to configuration object, which is as generic as could be
Can you provide some further explanation, please? Sorry, I am a beginner.
My bad, I was thinking out loud on improving the HPO process and allowing users to modify the configuration_object, not just the hyperparameters
CostlyOstrich36 a dict containing the OmegaConf dump: {'OmegaConf': "task_name: age\njira_task: IMAGE-2536\n ...}. But the better option is get_configuration_object_as_dict("Hyperparameters"), I think.
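For context, this is roughly how I query it (a sketch; the task id is a placeholder):
` from clearml import Task

base_task = Task.get_task(task_id="<base task id>")
# raw configuration objects: name -> text blob (the OmegaConf YAML dump in my case)
raw = base_task.get_configuration_objects()
# parsed version of a single configuration object
cfg = base_task.get_configuration_object_as_dict("Hyperparameters") `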
AgitatedDove14 Are hparams saved in the hyperparameter section superior to hparams saved in configuration objects?
Regarding the callback, I am not really sure how exactly it is meant. I follow the implementation of HyperParameterOptimizer, but I have no idea where I could place such a thing. Can you provide some further explanation, please? Sorry, I am a beginner.
What do you get when you call get_configuration_objects() now?
Hi CurvedHedgehog15
I would like to optimize hparams saved in Configuration objects.
Yes, this is a tough one.
Basically the easiest way to optimize is with hyperparameter sections, as they are basically key/value pairs you can control from the outside (see the HPO process)
Configuration objects are, well, blobs of data that "someone" can parse. There is no real restriction on them, since there are many standards to store them (yaml, json, ini, dot notation, etc.)
The quickest way is to add another "hyperparameter section" to the code, connect it, and then override the configuration:
` my_params_for_hpo = {'key': 1234}
task.connect(my_params_for_hpo, name='hpo_params') `
Then we can use "hpo_params/key" to externally control the values.
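For example, on the optimizer side (values here are just placeholders):
` from clearml.automation import DiscreteParameterRange

hyper_parameters = [DiscreteParameterRange("hpo_params/key", values=[1234, 4321])] `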
Regardless, I think it makes sense to have a "set parameters" callback, so that the HPO class will allow you to override the way it sets the hyperparameters; this would allow us to very easily change configuration objects, regardless of their format.
wdyt?
Regarding the get_configuration_objects(): I realized I queried the config of the optimization task, not of the base task, sorry. But the former question about the hparam name setting is still interesting!