Answered

Hi all!

I’m currently working on a project where I’m making use of ClearML for hyperparameter tuning.
In my workflow, I have a Python script that I usually run with the following command:

python3 train.py config object_detection.yaml 

This command uses an argparse configuration from the ‘object_detection.yaml’ file.
I’m trying to figure out how to incorporate this argparse configuration into my ClearML hyperparameter tuning process.

When I try to run the hyperparameter tuning in ClearML based on the task derived from the Python script above, I encounter the following error:

usage: train.py [-h] [--local_rank LOCAL_RANK] [--seed SEED] config
train.py: error: the following arguments are required: config

It seems the error arises because the args from the hyperparameter configuration are not being included.
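For reference, the error above can be reproduced with the standard library alone. This is a minimal sketch, assuming a parser shaped like the usage line in the error message; the argument types, defaults, and the helper names `build_parser` and `try_parse` are assumptions, not taken from the actual train.py.

```python
import argparse


def build_parser():
    # Parser shaped like the usage line in the error message.
    # Types and defaults here are assumptions for illustration only.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("config", help="path to the training config YAML")
    parser.add_argument("--local_rank", type=int, default=-1)
    parser.add_argument("--seed", type=int, default=None)
    return parser


def try_parse(argv):
    # argparse reports a missing positional by printing the usage text
    # to stderr and raising SystemExit; return the exit code in that case.
    try:
        return build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code


# No positional argument supplied: same failure mode as in the log.
missing = try_parse([])

# Positional argument supplied: parsing succeeds.
parsed = try_parse(["object_detection.yaml"])
print(missing, parsed)
```

If the process is started without the positional `config` value, argparse exits before any training code runs, which matches the message seen in the failed task.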

So, my question is: how can I ensure that the hyperparameter tuning process in ClearML executes the Python code with the argparse configuration as defined in the hyperparameter configuration?

Attached to this post, you will find my logs, configurations, and python files.
I am hopeful that someone in this community might have encountered a similar issue and could provide some guidance or suggestions on how to solve this problem.

I would greatly appreciate any guidance or suggestions on this matter.

Thank you in advance.
Best regards,

  
  
Posted 10 months ago

Answers 12


Thank you for giving me the advice.

To answer your question, here is my workflow.

First, I create the task by running the command below:

python3 train.py config object_detection.yaml

Then, in the same Docker image, I run the command below to start an agent:

clearml-agent daemon --queue default --foreground

After that, using the ID of the task created above, I run the script I shared, clearml_hyper.py.

So I think the argparse arguments are injected into the task itself before the HPO runs.

  
  
Posted 10 months ago

Thank you for the advice, @SuccessfulKoala55 and @AgitatedDove14!

  
  
Posted 10 months ago

@LazyCat94 I think this issue happens because you're calling parse_args() before calling Task.init.

  
  
Posted 10 months ago

Here is the full log of the failed task.

  
  
Posted 10 months ago

Hi @LazyCat94
So it seems the arg parser is detecting the configuration YAML.
The first thing I would suggest is changing it to a relative path (so that the YAML file can be found when the task is launched on remote machines).

Regardless, how are you launching the HPO? Are you spinning up a new agent?
(As background, argparse arguments are injected in real time by the agent or the HPO when running as subprocesses.)

  
  
Posted 10 months ago

Yes, I put Task.init in train.py.

  
  
Posted 10 months ago

@LazyCat94
I found the issue: the import of clearml should come before anything else, so that it patches argparse before the parser is used.

from clearml import Task

Move it to the first line and everything should work 🙂

  
  
Posted 10 months ago

And this is the HPO’s configuration info
image

  
  
Posted 10 months ago

And this is the task configuration info
image

  
  
Posted 10 months ago

This is odd. Can you send the full log of the failed Task and, if possible, the code?

  
  
Posted 10 months ago

And this is the modified NanoDet training code.

  
  
Posted 10 months ago

What are you seeing in the Task that was cloned (i.e. the one the HPO created, not the original training task)?
By that I mean: in the configuration section, do you have the Args there? (It looks like it from the picture you attached, but I just want to make sure.)

Also, do you have Task.init in the train.py file?

  
  
Posted 10 months ago