My current training setup is a hyperparameter optimization using the TPESampler from Optuna. For configuration we use Hydra. There is a very nice plugin that lets you define the hyperparameter search space directly in the config files via the sweeper settings. The consequence is that I run my train file with the `--multirun` flag, and before each new trial starts, the sweeper checks the current status and proposes new parameters based on the results up to that point.
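For concreteness, here is a minimal sketch of what I mean. The config keys and parameter names (`model.lr`, `model.dropout`, the trial counts) are just placeholders for our real ones:

```python
# train.py -- minimal sketch; the real script builds and trains a model.
# The accompanying conf/config.yaml selects the Optuna sweeper roughly like:
#   defaults:
#     - override hydra/sweeper: optuna
#   hydra:
#     sweeper:
#       sampler:
#         _target_: optuna.samplers.TPESampler
#       direction: minimize
#       n_trials: 50
#       params:
#         model.lr: interval(1e-5, 1e-1)
#         model.dropout: interval(0.0, 0.5)
import hydra
from omegaconf import DictConfig


@hydra.main(config_path="conf", config_name="config", version_base=None)
def train(cfg: DictConfig) -> float:
    # Stand-in objective: in reality this is the validation loss after
    # training with cfg. The sweeper minimizes the returned value.
    val_loss = (cfg.model.lr - 0.01) ** 2 + cfg.model.dropout
    return val_loss


if __name__ == "__main__":
    train()
```

Launched with `python train.py --multirun`, the Optuna sweeper then proposes the parameters for each trial.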
I'm struggling a bit to integrate this setup with ClearML, since the way it handles hyperparameter optimization is a bit different. I would still prefer a proper integration, because that gives us an overview/parent task plus the option to run trials in parallel on remote agents (even if parallelism comes at a slight cost, since TPE works best when trials run serially).
I seem to have a couple of choices:
- Fix outside of ClearML. I could rewrite the configuration in code so that I can use the `HyperParameterOptimizer` from ClearML (see the first sketch after this list).
- Fix inside of ClearML. ClearML would read the Hydra config files I already have and translate them into its own way of working with hyperparameters (a hypothetical sketch of what that translation could look like follows the list). As far as I can tell this would require a change in the ClearML codebase, since I cannot find a way to do it with the current one. I am willing to contribute if it's not a lot of work (I haven't estimated the effort yet), but that only makes sense if you would accept a PR like this.
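For the first option, this is roughly the shape I have in mind. The project name, task name, queue name, metric names, and the `General/...` parameter section are all placeholders; in particular, the section prefix depends on how the base task logged its hyperparameters:

```python
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformParameterRange
from clearml.automation.optuna import OptimizerOptuna  # pip install clearml optuna

# Parent task that gives the overview of the whole optimization.
task = Task.init(
    project_name="my-project",  # placeholder
    task_name="hydra-hpo",      # placeholder
    task_type=Task.TaskTypes.optimization,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<id of a completed training task>",  # placeholder
    hyper_parameters=[
        # Search space redefined in code, duplicating the Hydra config.
        # The "General/" section prefix is an assumption; it depends on
        # how the base task logged its parameters.
        UniformParameterRange("General/model.lr", min_value=1e-5, max_value=1e-1),
        UniformParameterRange("General/model.dropout", min_value=0.0, max_value=0.5),
    ],
    objective_metric_title="validation",  # placeholder metric
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    execution_queue="default",         # remote agents pull trials from here
    max_number_of_concurrent_tasks=1,  # keep TPE close to serial
    total_max_jobs=50,
)

optimizer.start()
optimizer.wait()
optimizer.stop()
```

The downside is the duplication: the search space then lives both in the Hydra config and in this script.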
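For the second option, a purely hypothetical sketch of the translation layer I imagine ClearML providing. `hydra_sweep_to_clearml` does not exist today; this toy version only handles the `interval(...)` and `choice(...)` syntaxes:

```python
import re
from omegaconf import OmegaConf
from clearml.automation import DiscreteParameterRange, UniformParameterRange


def hydra_sweep_to_clearml(config_path: str, section: str = "General"):
    """Hypothetical helper: turn hydra.sweeper.params entries into
    ClearML parameter ranges so the search space is defined only once."""
    cfg = OmegaConf.load(config_path)
    ranges = []
    for name, spec in cfg.hydra.sweeper.params.items():
        spec = str(spec).replace(" ", "")
        if m := re.fullmatch(r"interval\(([^,]+),([^)]+)\)", spec):
            ranges.append(UniformParameterRange(
                f"{section}/{name}",
                min_value=float(m.group(1)),
                max_value=float(m.group(2)),
            ))
        elif m := re.fullmatch(r"choice\((.+)\)", spec):
            ranges.append(DiscreteParameterRange(
                f"{section}/{name}",
                values=m.group(1).split(","),
            ))
    return ranges
```

Something along these lines inside ClearML would let the Hydra config stay the single source of truth for the search space.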
I think the second option is better for the community, provided you would accept and support it. Is that something you'd consider?