Answered
HPO with Optuna via webapp (Pro version) is not working as expected

HPO with Optuna via the webapp (Pro version) is not working as expected.
I generated a dummy task that logs a random value into a metric and closes. The code:



import hydra
import torch
from omegaconf import DictConfig, OmegaConf

# initialize_clearml_task is the helper shared in the answer below
MAIN_CONFIG_FILE = "config"  # hypothetical value; the actual config name isn't shown in the post


@hydra.main(config_path=".", config_name=MAIN_CONFIG_FILE, version_base=None)
def main(cfg: DictConfig):
    """
    Main script to set up and execute the training pipeline.

    Parameters
    ----------
    cfg: DictConfig
        A dictionary containing the configurations from main config and sub-configs from configs directory.
    """
    master_config = OmegaConf.to_container(cfg, resolve=True)
    task, clearml_logger = initialize_clearml_task(**master_config.pop("clearml"))  # returns the task object and its logger object
    for i in range(100):
        clearml_logger.report_scalar("Val/Metrics", "AUC", torch.randint(0, 100, (1,)).item(), i)
    task.flush(wait_for_uploads=True)
    task.close()

if __name__ == "__main__":
    main()

I then run HPO using Optuna via the webapp and get these errors (in multiple threads):

Exception in thread Thread-2 (_daemon):
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/site-packages/clearml/automation/optimization.py", line 1923, in _daemon
    self.optimizer.start()
  File "/usr/local/lib/python3.11/site-packages/clearml/automation/optuna/optuna.py", line 198, in start
    self._study.optimize(
  File "/usr/local/lib/python3.11/site-packages/optuna/study/study.py", line 451, in optimize
    _optimize(
  File "/usr/local/lib/python3.11/site-packages/optuna/study/_optimize.py", line 99, in _optimize
    f.result()
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/optuna/study/_optimize.py", line 159, in _optimize_sequential
    frozen_trial = _run_trial(study, func, catch)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/optuna/study/_optimize.py", line 247, in _run_trial
    raise func_err
  File "/usr/local/lib/python3.11/site-packages/optuna/study/_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
                      ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/clearml/automation/optuna/optuna.py", line 93, in objective
    iteration_value = iteration_value[0]
                      ~~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
`Study.stop` is supposed to be invoked inside an objective function or a callback.

This makes the HPO task fail/abort, but for some reason it is shown as completed successfully (I would expect a failed, or at least aborted, status here...).
The HPO manages to run a few experiments before it stops sending new ones due to the raised error, and metric values are reported and visible in the app UI.
The full HPO task log and template task log are attached. Some of the generated tasks are aborted (despite completing all reporting iterations) and some are completed (as expected).

I would love to get some help here. I noticed that many users encounter this issue, but I didn't find any solutions in this channel.
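
For reference, here is roughly what the equivalent code-driven setup looks like (a sketch only; the queue name, parameter range, and base task id are placeholders). If I read the traceback right, the objective read back None for a trial, so it seems worth double-checking that the optimizer's objective metric title/series exactly match the scalar the task reports:

# Sketch of launching the same HPO from code instead of the webapp.
# Queue name, parameter range, and the base task id below are assumptions.
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna

task = Task.init(project_name="hpo_debug", task_name="optuna_controller",
                 task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id="<template_task_id>",  # the dummy task above
    hyper_parameters=[
        UniformParameterRange("General/lr", min_value=1e-4, max_value=1e-1),
    ],
    # Must match the reported scalar exactly: report_scalar("Val/Metrics", "AUC", ...)
    objective_metric_title="Val/Metrics",
    objective_metric_series="AUC",
    objective_metric_sign="max",
    optimizer_class=OptimizerOptuna,
    execution_queue="default",
    max_number_of_concurrent_tasks=2,
    total_max_jobs=10,
)
optimizer.start()
optimizer.wait()
optimizer.stop()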

  
  
Posted 4 months ago

Answers 2


Hi, sure, there is nothing special there, even some redundancy.

from clearml import Task, Logger

def initialize_clearml_task(
        project_name: str | None = None,
        task_name: str | None = None,
        task_type: str | None = None,
        tags: list[str] | None = None,
) -> tuple[Task, Logger]:
    """
    Initialize and configure a ClearML task.

    Parameters
    ----------
    project_name : str
        Name of the ClearML project.
    task_name : str
        Name of the ClearML task.
    task_type : str
        Type of the ClearML task.
    tags : list[str]
        List of tags to be assigned to the task.

    Returns
    -------
    tuple[Task, Logger]
        A tuple containing the ClearML task, and the logger.
    """
    task = Task.current_task()
    if task is None:
        task = CADLUtils.init_clearml_task(
            project_name=project_name,
            task_name=task_name,
            task_type=task_type,
            tags=tags
        )
    logger = task.get_logger()

    return task, logger

class CADLUtils:
    @staticmethod
    def init_clearml_task(project_name: str, task_name: str, task_type: str, tags: list[str] | None = None) -> Task:
        """
        Initializes a ClearML task for the current project.

        Parameters:
        -----------
        project_name : str
            The name of the project. For nested projects, use the format 'parent_project/child_project'.
        task_name : str
            The name of the task.
        task_type : str
            The type of the task. Choose from clearml.Task.TaskTypes.
        tags : list[str], optional
            List of tags to assign to the task.

        Returns:
        --------
        Task
            The initialized ClearML task.

        Example:
        --------
        task = CADLUtils.init_clearml_task(project_name='my_project', task_name='my_task', task_type='training')
        """
        task = Task.init(
            project_name=project_name,
            task_name=task_name,
            task_type=task_type,
            tags=tags,
            auto_connect_frameworks={"pytorch": False},
            reuse_last_task_id=False,
            # output_uri="...",
        )
        return task
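
For illustration, a hypothetical call matching the clearml section popped from the Hydra config in main() (all argument values below are assumptions):

# Hypothetical usage; project/task names and tags are assumptions
task, clearml_logger = initialize_clearml_task(
    project_name="my_project/hpo_debug",
    task_name="dummy_metric_task",
    task_type="training",
    tags=["hpo", "debug"],
)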

Will try with the newest version as well.

  
  
Posted 4 months ago

Hi @DangerousBee35, can you try with the latest clearml version? Can you share the initialize_clearml_task function?
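
A quick way to check which clearml version the environment is actually running (standard Python, no assumptions beyond having clearml installed):

# Print the installed clearml version to compare against the latest release
import clearml
print(clearml.__version__)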

  
  
Posted 4 months ago