Thank you, ExasperatedCrab78, for your reply. As you said, I still miss the overview I am looking for, so I opened an issue as you suggested 🙂
https://github.com/allegroai/clearml/issues/760#issue-1355778280
Oh, sorry, I misunderstood your issue. 😞
But it is an interesting one!
What comes to my mind is that https://clear.ml/docs/latest/docs/references/sdk/hpo_parameters_parameterset#class-automationparameterset may be needed when there is a link between two variables. But I have never tested it and I am not a ClearML developer, so do not take this advice too seriously. 🙂 Hopefully, someone more experienced with ClearML will respond to you.
Hello CostlyOstrich36 , thank you, eventually I will try it. But I was looking for a safer way to do it if anything goes wrong.
AgitatedDove14 so if I understand it correctly, parameters such as time_limit_per_job, max_iteration_per_job, etc. can be overridden by internal processes in Optuna and so on, right? I observe this behaviour with RandomSearch as well; does it stop the experiments too? And as I wrote, the first two spawned tasks were aborted with this message, which is weird, isn't it? I mean that HPO stops the tasks with early stopping even though no previous tasks (benchmarks) are known.
Oh, damn, you're right CostlyOstrich36 , this makes sense. And really, AgitatedDove14 , if I look at the objective, it seems that tasks with an objective far from the base task are aborted. Thank you very much, guys.
Hi, CostlyOstrich36 , yes 🙂
Hi AgitatedDove14 , thank you for your response
Yes, at first I was thinking about option 2, but then I saw one case in our experiments where the UI merges the plots just as we want, and I was wondering whether there is a simple way to do that for all plots. In my opinion, option 1 is also fine for our use case - how can I combine two plots in the UI as you mentioned?
AgitatedDove14 Thank you very much for your advice and explanation.
I actually think I do this. The purpose of normalize_and_flat_config is just to take the hparams DictConfig, which may have a nested structure, and flatten it to a dict with direct key-value pairs. For instance:
{ 'model' : { 'class' : Resnet, 'input_size' : [112, 112, 3] } }
is simplified to
{ 'model.class' : Resnet, 'model.input_size' : [112, 112, 3] }
so in my case normalize_and_flat_config(hparams) is actually your overrides. And as you suggest I tried to remove ` Task.connect_c...
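The flattening described in this message can be sketched as follows. This is only a minimal illustration working on plain dicts: the name `flatten_config` is hypothetical, and the real normalize_and_flat_config (which is not shown in the thread) may behave differently, for example in how it handles DictConfig objects.

```python
from collections.abc import Mapping
from typing import Any, Dict


def flatten_config(config: Mapping, prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten a nested mapping into a single-level dict with dotted keys.

    Hypothetical stand-in for normalize_and_flat_config, for illustration only.
    """
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Recurse into nested sections.
            flat.update(flatten_config(value, full_key, sep))
        else:
            # Lists and scalars are kept as leaf values.
            flat[full_key] = value
    return flat


nested = {"model": {"class": "Resnet", "input_size": [112, 112, 3]}}
flat = flatten_config(nested)
print(flat)
```

A flat dict like this is convenient because every key maps to a single value, so each entry can be addressed by one name.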
I think the update back the hparam can be the solution for me. Just to be sure if we mean the same:
` import hydra
from clearml import Task
from omegaconf import DictConfig

@hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    task = Task.init(hparams.task_name, hparams.tag)
    overrides = {'my_param': hparams.value}  # dict
    task.connect(overrides, name='overrides')
    # <update hparams according to overrides and use it further in the code>
    # overrides are changed because of optimization, thus hparams will also be changed `
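The step that updates hparams from the overrides could look roughly like the sketch below. It assumes the overrides are a flat dict with dotted keys and that the config is a plain nested dict; `apply_overrides` is a hypothetical helper, and the values are placeholders, not settings from the thread.

```python
import copy
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], overrides: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Return a copy of `config` with dotted-key overrides written into it.

    Hypothetical helper, for illustration only.
    """
    updated = copy.deepcopy(config)  # leave the original config untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(sep)
        node = updated
        for part in parents:
            # Create intermediate sections if they do not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = value
    return updated


# Placeholder values for illustration only.
base = {"model": {"input_size": [112, 112, 3]}, "optimizer": {"lr": 0.1}}
overrides = {"optimizer.lr": 0.01}
merged = apply_overrides(base, overrides)
print(merged)
```

In the training script, `overrides` would be the dict connected to the task, so any value changed by the optimizer ends up in the merged config that the rest of the code uses.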
Yes, it is logged under some default name, "OmegaConf". But the hparams are still taken from there, not from the Hyperparameters section.
AgitatedDove14 I ran a dummy training with task.connect to have the hparams in the Hyperparameters section, then I ran the HyperParameterOptimizer, and the "hpo_params/hparam" really is updated. However, the training fails (solver is a child folder with various configs in my repo):
File "/root/.clearml/venvs-builds/3.6/code/train.py", line 17, in <module>
from solver.config.utils import (
ModuleNotFoundError: No module named 'solver'
Regarding get_configuration_objects(): I realized I was querying the config of the optimization task, not of the base task, sorry. But the earlier question about setting the hparam name is still interesting!
AgitatedDove14 Unfortunately, the hyperparameters in the configuration object seem to take precedence over the hyperparameters in the Hyperparameters section, at least in my case. I will probably try to get rid of the OmegaConf configuration and copy it to the Hyperparameters section, and we will see.
CostlyOstrich36 a dict containing the OmegaConf dict {'OmegaConf': "task_name: age\njira_task: IMAGE-2536\n ...} But the better option is get_configuration_object_as_dict("Hyperparameters"), I think.
AgitatedDove14 Do hparams saved in the Hyperparameters section take precedence over hparams saved in configuration objects?
Regarding the callback, I am not really sure how exactly it is meant. I follow the implementation of HyperParameterOptimizer, but I have no idea where I can place such a thing. Can you provide some further explanation, please? Sorry, I am a beginner.
Oh, it is possible, I have never tried. I will.
AgitatedDove14 Hmm, every training is run by a bash script calling train.py, which looks something like this:
` @hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    """
    Run training pytorch-lightning model
    """
    # Set process title
    setproctitle.setproctitle(f"{hparams.tag}-{get_user_name()}")
    try:
        # Init ClearmlTask and connect configuration
        task = Task.init(hparams.task_name, hparams.tag)
        task.conne...
AgitatedDove14 Yes, allowing users to modify the configuration_object would be great 🙂 ... Well, I will at least try to copy the OmegaConf object to the Hyperparameters section, and we will see shortly whether the "quickest way" works as a workaround.
Hello again, AgitatedDove14 and others. I am writing to let you know what works for us when optimizing hparams in a DictConfig:
` hparams_dict = OmegaConf.to_object(hparams)
# update hparams_dict using new hyperparameters set by the optimizer
hparams_dict = task.connect(hparams_dict, name="HPO")
# ProxyDictPostWrite to dict
hparams_dict = hparams_dict._to_dict()
# update hparams DictConfig which is used later in the training
hparams = OmegaConf.create(hparams_dict)
train(hparams...
AgitatedDove14 I figured out the problem. I ran the dummy training (Task.init, training script, logging to ClearML) only "locally", so the optimization task did not know the desired environment (Git repo, Docker, etc.). I had to submit the task using clearml-task, and then the optimization tasks did not fail.
SuccessfulKoala55 AgitatedDove14 Perfect, you are both awesome, I will try. Thanks 🙂
Hi ExasperatedCrab78 , thank you for your response! I am not sure if I understand you right, can you provide some dummy example, please?
What we have already tried is reporting a scalar for each individual FAR level, i.e. 0.001, 0.002, 0.01, etc. But this does not really work for us, as we lose the overall view of the performance when comparing multiple scalars in separate places. 😞
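One possible alternative to separate scalars, sketched here only as an assumption and not necessarily what was suggested in the thread, is to report all FAR levels as a single curve with ClearML's `Logger.report_scatter2d`. The helper names, the plot title, the axis labels and the numbers below are hypothetical placeholders.

```python
from typing import Any, Dict, List


def far_curve_points(metric_by_far: Dict[float, float]) -> List[List[float]]:
    """Turn {FAR level: metric value} into [[x, y], ...] pairs sorted by FAR."""
    return [[far, metric_by_far[far]] for far in sorted(metric_by_far)]


def report_far_curve(logger: Any, metric_by_far: Dict[float, float],
                     iteration: int, series: str = "model") -> None:
    """Report one curve over all FAR levels instead of one scalar per level.

    `logger` is assumed to be a clearml Logger, e.g. Logger.current_logger().
    """
    logger.report_scatter2d(
        title="Metric vs FAR",  # hypothetical plot title
        series=series,
        iteration=iteration,
        scatter=far_curve_points(metric_by_far),
        xaxis="FAR",
        yaxis="metric",
        mode="lines+markers",
    )


# Placeholder values for illustration only, not real results.
points = far_curve_points({0.01: 0.9, 0.001: 0.7, 0.002: 0.8})
print(points)
```

With a real task this would be called as `report_far_curve(Logger.current_logger(), metrics, iteration=epoch)`, so each experiment contributes one curve to the plot.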
Seems interesting, I will give it a try, thank you ExasperatedCrab78 !
Hi, CostlyOstrich36 , thank you for your response.
I realised my issue happens when I compare hyperparameters connected by task.connect_configuration. I compared them in the DETAILS section (see screenshot).
When I connect them using task.connect, I am able to compare them in the HYPER PARAMETERS section, which works as I expected.
So issue solved 🙂
CostlyOstrich36 our code structure is quite complicated, so I can't just copy and paste; I would have to write a dummy code snippet. If you already have one, please send it and I can just fill in the logging function.
Hi,
I encountered a similar problem. The solution was quite difficult to find, but we finally managed to update our HPO section with hyperparameters like this: https://clearml.slack.com/archives/CTK20V944/p1641372134329200?thread_ts=1640010570.080900&cid=CTK20V944
Hope this helps.