CostlyOstrich36
hello, I tried a few things, namely creating a pipeline from functions with Hydra, based on https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py .
Before the task starts remotely, I can access the Hydra config with no problem, but once the task starts remotely I get:
`Starting Task Execution: Primary config directory not found. Check that the config directory '/root/.clearml/venvs-builds/3.8/task_repository/clearml-example.git/src/config' exists and readable
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.`
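From the error it looks like the agent clones the repo but Hydra can't find the `config` folder next to the script in the remote working copy. As a sanity check (just a sketch, the layout is my assumption), I can print whether that directory exists before `@hydra.main` runs:
` import os

# @hydra.main resolves a relative config_path against the file that declares the decorated function,
# so the "config" folder has to exist next to this script in the cloned repository as well
config_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "config")
print("config dir:", config_dir, "exists:", os.path.isdir(config_dir)) `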
This is my function for the pipeline:
` import hydra
from clearml import PipelineController

# basic_training, run_hpo and train_best_model are the step functions, defined elsewhere in this script


@hydra.main(config_path="config", config_name="config")
def executing_pipeline(cfg):
    model_settings = cfg.model_settings
    dataset_settings = cfg.dataset_settings
    print(model_settings, dataset_settings)

    # create the pipeline controller
    pipe = PipelineController(
        project='clearml-example2',
        name='pipeline demo',
        version='1.1',
        add_pipeline_tags=False,
    )

    # set the default execution queue to be used (per step we can override the execution)
    pipe.set_default_execution_queue('default')

    # Use the pipeline parameters to start the pipeline and pass them to the first step
    print('launch step 1')
    pipe.add_parameter(
        name='model_settings',
        description='model_settings',
        default=model_settings,
    )
    pipe.add_parameter(
        name='dataset_settings',
        description='dataset_settings',
        default=dataset_settings,
    )

    pipe.add_function_step(
        name='basic_training',
        function=basic_training,
        function_kwargs=dict(dataset_settings='${pipeline.dataset_settings}',
                             model_settings='${pipeline.model_settings}'),
        function_return=['task_id'],
        cache_executed_step=True,
        execution_queue="default",
    )
    pipe.add_function_step(
        name='HPO',
        # parents=['basic_training'],  # the pipeline automatically detects dependencies based on the kwargs inputs
        function=run_hpo,
        function_kwargs=dict(base_task_id='${basic_training.task_id}'),
        function_return=['best_model_settings'],
        cache_executed_step=True,
        execution_queue="default",
    )
    pipe.add_function_step(
        name='train_best_model',
        function=train_best_model,
        function_kwargs=dict(data='${HPO.best_model_settings}'),
        function_return=['model', 'config_pbtxt'],
        cache_executed_step=True,
        execution_queue="default",
    )

    # For debugging purposes, run the pipeline on the current machine
    # Use run_pipeline_steps_locally=True to further execute the pipeline component Tasks as subprocesses.
    # pipe.start_locally(run_pipeline_steps_locally=False)

    # Start the pipeline on the services queue (remote machine, default on the clearml-server)
    pipe.start()
    print('pipeline completed') `
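One thing I'm not sure about: `model_settings` and `dataset_settings` above are OmegaConf nodes rather than plain dicts. If that turns out to matter for the pipeline parameters, a possible tweak (just a sketch, I haven't verified it changes anything) would be to convert them before calling `add_parameter`:
` from omegaconf import OmegaConf

# convert the Hydra/OmegaConf nodes into plain Python containers so they serialize cleanly
model_settings = OmegaConf.to_container(cfg.model_settings, resolve=True)
dataset_settings = OmegaConf.to_container(cfg.dataset_settings, resolve=True) `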
YummyLion54, Hi
What versions of clearml/clearml-agent are you using?
When running it without the agent, do the hydra configurations show up properly in the UI?
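As a quick isolation test, you could also start the controller locally first, so Hydra resolves the config on your machine instead of inside the agent's clone (a sketch, everything else unchanged):
` # run the controller (and the component Tasks as subprocesses) on the local machine for debugging;
# this keeps Hydra's config_path lookup in your local checkout rather than the agent's working copy
pipe.start_locally(run_pipeline_steps_locally=True) `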
Could you guys provide a sample for pipelines using Hydra? I think quite a few people would be interested in that, and it isn't as straightforward as I thought it would be.
Hello, I am using clearml 1.1.4 and clearml-agent 1.1.0. I've used Hydra locally before (without pipelines) and it worked well; the configurations showed up properly in the UI, yes.