Answered
I've Noticed a Change from ClearML

I've noticed a change from ClearML 1.13.1 to 1.13.2 relating to the Hydra integration. I'm not yet convinced it's a bug; it might just require a different approach because of the Hydra changes in the update. Upgrading to 1.13.2 broke my pipelines, specifically the parameter_override option when overriding Hydra config groups. Hydra supports hierarchical configs and config groups, so instead of changing a single setting in a config file, I can swap the whole config file by updating the config group. I've been using this feature in my pipeline for a few versions now. For reference, I'm using hydra-core==1.3.2.

Example Pipeline Code:

@hydra.main(version_base="1.3", config_path="../configs", config_name="pipeline.yaml")
def clearml_pipeline(cfg: DictConfig):
    pipe = PipelineController(
        name="Quarterly LTV Model Training and Inference",
        project=cfg.clearml.project_name,
        version="0.2.0",
        add_pipeline_tags=False,
    )
    # pipeline parameters
    pipe.add_parameter(
        name="countries",
        default=[
            "Global",
            # "Netherlands",
            # "Belgium",
            # "France",
            # "UKI",
            # "Spain",
            # "CEE",
            # "DACH",
            # "Nordics",
        ],
        description="A list of countries to build models for",
    )

    pipe.add_parameter(
        name="inference_period",
        default=[
            12,
            # 24,
            # 36,
            # 60,
        ],
        description="A list of periods (in months) to run inference for",
    )

    # Sets which queue on ClearML to pull agents for processing the pipeline steps
    pipe.set_default_execution_queue(cfg.get("default_execution_queue"))

    first_stage_name = "stage_etl"

    # the first stage is the ETL step to pass the data to the child steps
    pipe.add_step(
        name=first_stage_name,
        base_task_project=cfg.clearml.project_name,
        base_task_name="ETL Process",
    )

    # we need to iterate over the countries
    # second stage is training
    for country in pipe.get_parameters()["countries"]:
        if country is None:
            second_stage_name = "stage_training_global"
            third_stage_name_base = "stage_inference_global"
        else:
            second_stage_name = f"stage_training_{country}"
            third_stage_name_base = f"stage_inference_{country}"
        pipe.add_step(
            name=second_stage_name,
            base_task_project=cfg.clearml.project_name,
            parents=[first_stage_name],
            base_task_name="Training",
            parameter_override={
                # get the country from loop over the pipeline parameters
                "Hydra/countries": country,
                # Pass the data from the ETL step to the train step
                "Hydra/datagenerator": "clearml",
                "Hydra/clearml.dataset_id": f"${{{first_stage_name}.parameters.General/dataset_id}}",
            },
        )
...

When run with 1.13.1, the pipeline is happy as a clam. When run with 1.13.2, I get the following error from Hydra when the second stage spins up:

Environment setup completed successfully
Starting Task Execution:
force-add of config groups is not supported: '++datagenerator=clearml'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
2023-12-13 16:59:09
Process failed, exit code 1

This is caused by the "Hydra/datagenerator": "clearml" line in the parameter_override of the second-stage step. I'm guessing something changed under the hood in how ClearML manages Hydra parameter overrides.
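To make the distinction in the error message concrete, here is a minimal sketch of how Hydra's command-line override syntax treats the two kinds of keys. It assumes that datagenerator is a config group and that clearml.dataset_id is an ordinary config value. The helper to_hydra_override and the dataset ID "abc123" are hypothetical and purely illustrative; this is not ClearML's actual implementation.

```python
# Hypothetical helper for illustration only; it is NOT how ClearML builds overrides.
# It shows Hydra's override grammar: config groups are selected with `group=option`,
# while ordinary config values may be force-added with the `++key=value` prefix.


def to_hydra_override(key, value, config_groups):
    """Render one 'Hydra/<key>' parameter as a Hydra command-line override string."""
    prefix = "Hydra/"
    name = key[len(prefix):] if key.startswith(prefix) else key
    if name in config_groups:
        # Hydra rejects `++group=option` with
        # "force-add of config groups is not supported".
        return f"{name}={value}"
    # Plain config values can be force-added (created or overwritten) with `++`.
    return f"++{name}={value}"


# Assumption: `datagenerator` is the only config group among these keys.
config_groups = {"datagenerator"}
parameter_override = {
    "Hydra/datagenerator": "clearml",
    "Hydra/clearml.dataset_id": "abc123",  # placeholder dataset ID
}

overrides = [
    to_hydra_override(key, value, config_groups)
    for key, value in parameter_override.items()
]
print(overrides)
```

Under these assumptions, the config group is rendered as datagenerator=clearml, whereas the failing run passed it as ++datagenerator=clearml, which is the form Hydra refuses.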

  
  
Posted 8 months ago

Answers 6


@<1523701435869433856:profile|SmugDolphin23> I spoke too soon. It does resolve the error I posted, but it introduces a new one. While this error does seem to be related to VS Code, the strange thing is that it doesn't occur if I run it with earlier versions of clearml.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 3489, in <module>
    main()
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 3482, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2510, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2517, in _exec
    globals = pydevd_runpy.run_path(file, globals, '__main__')
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 334, in run_path
    mod_name, mod_spec, code = _get_main_module_details()
  File "/home/natephysics/.vscode-server/extensions/ms-python.python-2023.22.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 270, in _get_main_module_details
    raise error("can't find %r module in %r" %
ImportError: can't find '__main__' module in ''
Launching the next 0 steps
Setting pipeline controller Task as failed (due to failed steps) !
  
  
Posted 8 months ago

Hi @<1545216070686609408:profile|EnthusiasticCow4>! This is actually very weird. Does your pipeline fail when running the first step? What if you run the pipeline via "raw" python (i.e. by doing python3 your_script.py)?

  
  
Posted 8 months ago

Hi @<1545216070686609408:profile|EnthusiasticCow4>! Can you please try with clearml==1.13.3rc0? I believe we fixed this issue.

  
  
Posted 8 months ago

There are no issues when I run the "raw" script. Also, since the pipeline is built from existing tasks, the code must have run without fault for it to be pulled in as a task.

As for when it fails, the log suggests it's on the first task, or possibly while the first task is launching, but I'd have to go back to be sure. I rolled back to 1.13.1 and that's working fine. If you want, I can help explore this bug in detail, because it would be nice to find the root of the issue. Let me know what you need.

  
  
Posted 8 months ago

This does appear to resolve the issue. I'll keep you updated if I find any other issues. Thanks @<1523701435869433856:profile|SmugDolphin23>

  
  
Posted 8 months ago

@<1523701435869433856:profile|SmugDolphin23> Yes. I'll try it in about 14 hours when I'm back at work and let you know how it goes. 😂

  
  
Posted 8 months ago