Not Able To Resume A Hyper-Parameter Optimization

When I try to resume a stopped or aborted parameter optimization experiment, it fails with the error --run_as_service: invalid int value: '[0, 0]' . When checking the experiment's configuration, I noticed that all arguments are now doubled (i.e. [0, 0] instead of 0), which seems to be related to the fact that I set max_number_of_concurrent_tasks=2 in the HyperParameterOptimizer . Am I doing something wrong?
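For anyone hitting the same message: the wording looks like a standard argparse type-conversion failure. Below is a minimal sketch that produces the same kind of error outside ClearML. It assumes the flag is declared with type=int (an assumption based only on the error text), and RaisingParser and parse_run_as_service are hypothetical names made up for this illustration, not part of ClearML.

```python
import argparse


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting, so the message can be inspected."""

    def error(self, message):
        raise ValueError(message)


def parse_run_as_service(argv):
    # Assumption: the flag is declared as an int, as the error text suggests.
    parser = RaisingParser()
    parser.add_argument("--run_as_service", type=int, default=0)
    return parser.parse_args(argv).run_as_service


if __name__ == "__main__":
    # A plain scalar value converts without problems.
    print(parse_run_as_service(["--run_as_service", "0"]))

    # A doubled value stored as a list and passed back as text cannot be
    # converted by int(), so the parser rejects it.
    try:
        parse_run_as_service(["--run_as_service", str([0, 0])])
    except ValueError as exc:
        print(exc)
```

If this is what happens on resume, the thing to look for is where the stored value turned from 0 into the list [0, 0] before being fed back to the parser.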

Posted one year ago
Is this reproducible with the HPO example here:

What's your clearml version? (And could you verify with the latest version?)

Posted one year ago

Thanks for the prompt reply, AgitatedDove14 . Here are some more details:

I am executing locally (i.e. I set args['run_as_service'] = False as in https://github.com/allegroai/clearml/blob/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L45 ). Everything was fine until some network issues occurred and my task was aborted. When I restart it, I see these doubled configurations in the UI.

However, I've just noticed that the same happens when I set args['run_as_service'] = True.

Posted one year ago

It isn't reproducible. I had a stupid typo in my script that parsed the arguments twice. Thanks anyway, you got me on the right track! :)

Posted one year ago

No worries 🙂

Posted one year ago

Hi GreasyLeopard35

I try to resume a stopped or aborted parameter optimization experiment,

How are you continuing the HPO? Are you running everything locally? Is this with an agent? Are you seeing the '[0, 0]' value in the configuration when launching the HPO or when continuing it?

Posted one year ago