Answered
Hi, I Try To Optimize My Hyperparameters With

Hi, I am trying to optimize my hyperparameters with HyperParameterOptimizer, but I have two main problems.
It doesn't find the hyperparameters of my training task. I use Args/epochs, for example. Moreover, in the logs, clearml-agent doesn't find my training script (No such file or directory error), even though the script exists and the path is right. Could you help me fix these problems?

  
  
Posted 3 years ago

Answers 15


Hi ConvincingSwan15
A few background questions:

Where is the code that we want to optimize? Do you already have a Task of that code executed?

"find my learning script"

Could you elaborate? Is this connected to the first question?

  
  
Posted 3 years ago

Hi AgitatedDove14
The code is in a private repo (clearml-agent is configured with an SSH key and gets the code correctly). Otherwise, I run the code directly on my computer. The code was previously run in a task, and the task seems to be loaded correctly. I get the right ID from the get_task function.

When the optimizer tries to run the first batch of hyperparameters, I get this error message in the log:
/home/local/user/.clearml/venvs-builds/3.7/bin/python: can't open file 'train.py': [Errno 2] No such file or directory

  
  
Posted 3 years ago

The log of my optimizer looks like this:
Task: {'template_task_id': '6f3bf2ecbb964ff3b2a6111c34cb0fa3', 'run_as_service': False}
2021-03-30 10:45:25,413 - trains.automation.optimization - WARNING - Could not find requested hyper-parameters ['Args/patch_size', 'Args/nb_conv', 'Args/nb_fmaps', 'Args/epochs'] on base task 6f3bf2ecbb964ff3b2a6111c34cb0fa3
2021-03-30 10:45:25,433 - trains.automation.optimization - WARNING - Could not find requested metric ('dice', 'dice') report on base task 6f3bf2ecbb964ff3b2a6111c34cb0fa3
Progress report #0 completed, sleeping for 0.25 minutes
2021-03-30 10:45:25,639 - trains.automation.optimization - INFO - Creating new Task: {'Args/patch_size': 32, 'Args/nb_conv': 2, 'Args/nb_fmaps': 30, 'Args/epochs': 30}

  
  
Posted 3 years ago

Hmm ConvincingSwan15

WARNING - Could not find requested hyper-parameters ['Args/patch_size', 'Args/nb_conv', 'Args/nb_fmaps', 'Args/epochs'] on base task

Is this correct? Can you see these arguments on the original Task in the UI (i.e. Args section, parameter epochs)?

  
  
Posted 3 years ago

Yes, and I double-checked in Python: I get the dictionary with Args/... keys.

  
  
Posted 3 years ago
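
The check described above (comparing the requested hyperparameter names against what the base task actually exposes) can be sketched in a few lines. This is only an illustration: the helper name missing_hyperparameters and the example dictionary are hypothetical, and the dictionary stands in for whatever flattened "Section/name" parameters the base task reports.

```python
from typing import Iterable, List, Mapping


def missing_hyperparameters(
    requested: Iterable[str], task_params: Mapping[str, object]
) -> List[str]:
    """Return the requested names that are absent from the task's flattened parameters."""
    return [name for name in requested if name not in task_params]


# Hypothetical stand-in for the parameters reported by the base task.
example_params = {"Args/epochs": "30", "Args/patch_size": "32"}
requested = ["Args/patch_size", "Args/nb_conv", "Args/epochs"]

# Only names missing from the base task are returned.
print(missing_hyperparameters(requested, example_params))  # ['Args/nb_conv']
```

With clearml installed and a reachable server, the dictionary would presumably come from something like Task.get_task(task_id=...).get_parameters(); that call is left out of the sketch so it runs without a server. An empty result means every requested name exists on the base task.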

Okay, so I think it doesn't find the correct Task, otherwise it wouldn't print the warning.
How do you set up the HPO class? Could you copy-paste the code?

  
  
Posted 3 years ago

an_optimizer = HyperParameterOptimizer(
    base_task_id="6f3bf2ecbb964ff3b2a6111c34cb0fa3",
    hyper_parameters=[
        DiscreteParameterRange('Args/patch_size', values=[32, 64, 128]),
        DiscreteParameterRange('Args/nb_conv', values=[2, 3, 4]),
        DiscreteParameterRange('Args/nb_fmaps', values=[30, 35, 40]),
        DiscreteParameterRange('Args/epochs', values=[30]),
    ],
    objective_metric_title='valid_average_dice_epoch',
    objective_metric_series='valid_average_dice_epoch',
    objective_metric_sign='max',
    max_number_of_concurrent_tasks=1,
    optimizer_class=GridSearch,
    execution_queue="default",
    spawn_project=None,
    save_top_k_tasks_only=None,
    pool_period_min=0.2,
    total_max_jobs=1,
    min_iteration_per_job=10,
    max_iteration_per_job=30,
)

  
  
Posted 3 years ago

I double-checked the ID and it is the right one.

  
  
Posted 3 years ago

Hmm, maybe the original Task was executed with an older version (before the section names were introduced)?
Let's try:
DiscreteParameterRange('epochs', values=[30]),
Does that give a warning?

  
  
Posted 3 years ago

BTW

/home/local/user/.clearml/venvs-builds/3.7/bin/python: can't open file 'train.py': [Errno 2] No such file or directory

This error is from the agent, correct? It seems it did not clone the correct code. Is train.py committed to the repository?

  
  
Posted 3 years ago

It was run with the exact same version. And I got the same message with "epochs" only.

  
  
Posted 3 years ago

For train.py, do I need a setup.py file in my repo for it to work correctly with the agent? For now it is just the path to train.py.

  
  
Posted 3 years ago

OK, so I installed the latest version of clearml and the hyperparameters are found now.

  
  
Posted 3 years ago

The only thing left to fix is the train.py issue.

  
  
Posted 3 years ago

Hi ConvincingSwan15

For train.py, do I need a setup.py file in my repo for it to work correctly with the agent? For now it is just the path to train.py.

I'm assuming train.py is part of the repository, no?
If it is, how come the agent cannot find it after cloning the repository?
Could it be that it was accidentally not added to the git repo?

  
  
Posted 3 years ago
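
One way to narrow down the "can't open file 'train.py'" error is to check, in a fresh clone of the repository, whether the script exists at the location the task points to. The sketch below assumes the agent runs the recorded entry point relative to a working directory inside the cloned repo; the function name and its arguments are hypothetical, not part of clearml.

```python
import tempfile
from pathlib import Path


def entry_point_exists(clone_root: str, working_dir: str, entry_point: str) -> bool:
    """Check that <clone_root>/<working_dir>/<entry_point> is a regular file."""
    return (Path(clone_root) / working_dir / entry_point).is_file()


# Demo on a throwaway directory that mimics a clone where train.py lives in src/.
with tempfile.TemporaryDirectory() as clone:
    (Path(clone) / "src").mkdir()
    (Path(clone) / "src" / "train.py").write_text("print('training')\n")

    found_in_src = entry_point_exists(clone, "src", "train.py")
    found_in_root = entry_point_exists(clone, ".", "train.py")

print(found_in_src, found_in_root)  # True False
```

If the file is present in your local checkout but missing from a fresh clone, it was probably never committed or pushed. If it is present in the clone but under a different folder than the task's working directory, the path recorded on the task is the more likely culprit.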
876 Views