Answered
Hello Guys, I Am Using Clearml Server To Run Hyperparameter Optimization. When Running It, Sometimes This Error Happens, But When Running Again The Same Code Runs Smoothly. Sometimes It Works And Sometimes Not. It Seems That Some Of The Base Tasks Of The

Hello guys,
I am using the ClearML server to run hyperparameter optimization. When running it, sometimes this error happens, but when running the same code again it runs smoothly. Sometimes it works and sometimes it doesn't. It seems that some of the base tasks of the optimization are not logging the optimization metric, but the code that generates it is always the same... Checking the base tasks, some of them are indeed missing the metric title and series. Would that be the error? How can I ensure that the metric is always being logged?
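One pattern that helps with "metric sometimes missing" is to make the base task fail loudly when the objective scalar is absent or NaN, instead of letting it complete silently without the metric. A minimal sketch in pure Python; `report_objective` and the stand-in logger are hypothetical names, not ClearML API (in practice you would pass ClearML's `Logger.report_scalar` as the callable):

```python
import math

def report_objective(report_scalar, title, series, value, iteration=0):
    """Wrap the final metric report: raise if the objective is missing or NaN.

    `report_scalar` is any callable with the shape of ClearML's
    Logger.report_scalar(title=..., series=..., value=..., iteration=...).
    """
    if value is None or (isinstance(value, float) and math.isnan(value)):
        raise ValueError(f"objective {title}/{series} is missing or NaN: {value!r}")
    report_scalar(title=title, series=series, value=value, iteration=iteration)

# Usage with a stand-in logger (replaces the real ClearML logger for illustration):
reported = {}
def fake_report_scalar(title, series, value, iteration):
    reported[(title, series)] = value

report_objective(fake_report_scalar, "HBT-KPI --- 2024-12-26 to 2025-01-12", "SR", 1.7)
assert reported[("HBT-KPI --- 2024-12-26 to 2025-01-12", "SR")] == 1.7
```

This way a run that would have produced a base task without the metric fails visibly instead of confusing the optimizer later.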

[W 2025-04-08 21:37:59,248] Trial 1 failed with parameters: {'model_params/a': 185, 'model_params/b': 190835188.187596, 'model_params/c': 1.50, 'model_params/d': 2.1, 'risk_params/e': 579, 'risk_params/f': 710, 'risk_params/g': 5} because of the following error: TypeError("'NoneType' object is not subscriptable").
Traceback (most recent call last):
  File "/root/.clearml/venvs-builds/3.12/task_repository/quant.git/.venv/lib/python3.12/site-packages/optuna/study/_optimize.py", line 197, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "/root/.clearml/venvs-builds/3.12/task_repository/quant.git/.venv/lib/python3.12/site-packages/clearml/automation/optuna/optuna.py", line 92, in objective
    objective_metric = objective_metric[0]
                       ~~~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
[W 2025-04-08 21:37:59,249] Trial 1 failed with value None.
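For context, the traceback shows the Optuna objective wrapper fetching the objective scalar from the base task and then indexing into the result; when the task never reported that title/series, the lookup comes back as None and the `[0]` subscript raises. A rough sketch of that failure mode (the `get_objective` helper is hypothetical, not the actual ClearML internals):

```python
def get_objective(last_metrics, title, series):
    """Mimic the lookup: return [value] if the scalar exists, else None."""
    entry = last_metrics.get(title, {}).get(series)
    return None if entry is None else [entry["value"]]

# Base task that did report the metric: the lookup succeeds.
metrics = {"HBT-KPI --- 2024-12-26 to 2025-01-12": {"SR": {"value": 1.7}}}
assert get_objective(metrics, "HBT-KPI --- 2024-12-26 to 2025-01-12", "SR") == [1.7]

# Base task that never reported it: the lookup yields None, and a later
# result[0] reproduces TypeError: 'NoneType' object is not subscriptable.
result = get_objective({}, "HBT-KPI --- 2024-12-26 to 2025-01-12", "SR")
assert result is None
```

So the error is a symptom, not the cause: the real question is why some base tasks finish without ever reporting the scalar.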
  
  
Posted 8 days ago

Answers 6


CostlyOstrich36 Here is the HyperParameterOptimizer setup

hpo = HyperParameterOptimizer(
        # Base experiment to optimize
        base_task_id=base_task_id,
        # Hyperparameters to tune
        hyper_parameters=param_ranges,
        # Objective metric(s)
        objective_metric_title=list(opt_conf.hpo_params.objective_metric_title),
        objective_metric_series=list(opt_conf.hpo_params.objective_metric_series),
        objective_metric_sign=list(opt_conf.hpo_params.objective_metric_sign),
        # Optimization strategy
        optimizer_class=OptimizerOptuna,
        # Execution configuration
        execution_queue=opt_conf.hpo_params.execution_queue,
        save_top_k_tasks_only=-1,
        spawn_project=f"{opt_conf.task_params.project_name}/opt",
        min_iteration_per_job=opt_conf.hpo_params.min_iteration_per_job,
        max_iteration_per_job=opt_conf.hpo_params.max_iteration_per_job,
        # pool_period_min=40,
        # time_limit_per_job=120,
        # Limit the number of concurrent experiments so we don't bombard the
        # scheduler. If an auto-scaler is connected, this will, by proxy,
        # also limit the number of machines spun up.
        max_number_of_concurrent_tasks=opt_conf.hpo_params.max_number_of_concurrent_tasks,
        # Maximum number of jobs to launch for the optimization; default (None) is unlimited.
        # If OptimizerBOHB is used, it defines the maximum budget in terms of full jobs,
        # i.e. the cumulative number of iterations will not exceed total_max_jobs * max_iteration_per_job.
        total_max_jobs=opt_conf.hpo_params.total_max_jobs,
        # optuna_pruner=pruner_dict.get(
        #     opt_conf.hpo_params.pruner
        # ),  # HyperbandPruner(min_resource=5, max_resource=80),
        # optuna_sampler=sampler_dict.get(opt_conf.hpo_params.sampler),
)
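Since the objective title, series, and sign are passed as parallel lists, a quick pre-flight check that they line up one-to-one can catch config mistakes before anything is launched. A small sketch under that assumption (`check_objectives` is a hypothetical helper, not a ClearML API):

```python
def check_objectives(titles, series, signs):
    """Validate that the multi-objective lists align and signs are sane."""
    if not (len(titles) == len(series) == len(signs)):
        raise ValueError(
            f"objective lists must align: {len(titles)} titles, "
            f"{len(series)} series, {len(signs)} signs"
        )
    for sign in signs:
        if sign not in ("min", "max"):
            raise ValueError(f"unknown objective sign {sign!r}")

# With the config shown below this passes:
check_objectives(["HBT-KPI --- 2024-12-26 to 2025-01-12"], ["SR"], ["max"])
```

Running this on `opt_conf.hpo_params` right before constructing the optimizer makes list-length or sign typos fail fast in your own code rather than deep inside a trial.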

and here are the hpo_params used

hpo_params:
  objective-metric-title: ["HBT-KPI --- 2024-12-26 to 2025-01-12"]
  objective-metric-series: ["SR"]
  objective-metric-sign: ["max"]
  time-limit: 72000.0
  execution-queue: hpo_mmd
  min-iteration-per-job: 50
  max-iteration-per_job: 10000
  max-number-of-concurrent-tasks: 100
  total-max-jobs: 2000
  pruner: none
  sampler: none #random
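Note the YAML keys use hyphens while the code reads them as underscore attributes (`opt_conf.hpo_params.objective_metric_title`), so presumably your config loader normalizes key names somewhere. A sketch of that assumption (the `to_namespace` helper is hypothetical):

```python
from types import SimpleNamespace

def to_namespace(d):
    """Recursively turn a dict with hyphenated keys into attribute access."""
    if not isinstance(d, dict):
        return d
    return SimpleNamespace(**{k.replace("-", "_"): to_namespace(v) for k, v in d.items()})

raw = {"hpo-params": {"objective-metric-series": ["SR"], "total-max-jobs": 2000}}
conf = to_namespace(raw)
assert conf.hpo_params.objective_metric_series == ["SR"]
assert conf.hpo_params.total_max_jobs == 2000
```

If the normalization is done elsewhere, a key like `min-iteration-per_job` (mixed hyphen/underscore) can silently fail to map to the attribute the code expects, which is worth double-checking.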

Do you see any reason for the optimization to finish before total-max-jobs is reached?

I am not even sure the issue is only the missing metric. If that happens, I would expect the HPO to inject new parameters in the next iteration (since the max was not reached), but instead it stops running and completes the optimization...

  
  
Posted 7 days ago

Here, for instance, we had only two cases of TypeError: 'NoneType' object is not subscriptable; one is on line 9846. But as you can see in the picture, the workers are going down.
image

  
  
Posted 7 days ago

Hi UpsetPanda50, are you running them on the same machine/agent? Can you please provide a full log of one run that worked and one that didn't on the same machine?

  
  
Posted 7 days ago

And here is how the error appears. Trying to get the metric that was not logged.

  
  
Posted 7 days ago

Still the same errors. :(

  
  
Posted 6 days ago

Hi CostlyOstrich36 , it's a pool of machines. I have attached two logs.

  
  
Posted 7 days ago