DistressedGoat23
We are running hyperparameter tuning (using some CV), which might take a long time and might even be aborted unexpectedly due to machine resources.
We therefore want to see the progress.
On the HPO Task itself (not the individual experiments, but the one controlling them all) there is the global progress of the optimization metric. Is this what you are looking for? Am I missing something?
DistressedGoat23, how are you running this hyperparameter tuning? Ideally you need to have
` from clearml import Task
task = Task.init() `
in your running code; from that point onwards you should have tracking.
I am using the Task.init() approach and then running RandomizedSearchCV (not using ClearML's HPO).
Trying that now, passing verbose=3 to sklearn's class.
I can see the verbose messages in ClearML's console tab while the search runs, so this is a kind of poor man's solution to my problem, but it may be enough for now.
Regarding ClearML HPO: will it create multiple experiments in the UI, one for each tested hyperparameter set, or would I be able to see all those trials in a single experiment?
This is a crucial point for using ClearML HPO, since comparing dozens of experiments in the UI and searching for the best one is just not manageable.
DistressedGoat23 check this example:
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
aSearchStrategy = RandomSearch
It will collect everything on the main Task
This is a crucial point for using ClearML HPO, since comparing dozens of experiments in the UI and searching for the best one is just not manageable.
You can of course do that (notice you can actually order them by the scalars they report, and even do nested sorting by holding Shift and selecting additional columns to sort by).
But it might be easier to collect on a single Task as in the example, as you pointed out 🙂
Any reason not to do it this way?
Hi DistressedGoat23, can you please elaborate a bit on what you'd like to do?
We are running hyperparameter tuning (using some CV), which might take a long time and might even be aborted unexpectedly due to machine resources.
We therefore want to see the progress: which hyperparameter sets were tested and what their summary metrics were (i.e. the average and standard deviation of ROC AUC across all CV folds).
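One possible way to get that per-trial progress on a single Task, sketched here as an assumption rather than something proposed in the thread: replace RandomizedSearchCV with a manual loop over sklearn's ParameterSampler and report each candidate as soon as its cross-validation finishes, so whatever was reported before an abort stays visible. The names summarize_trial, report_trial, run_search, the "cv_roc_auc" title and the project/task names are hypothetical; report_trial only assumes a logger with report_scalar and report_text, as returned by task.get_logger().

```python
from statistics import mean, pstdev


def summarize_trial(params, fold_scores):
    """Build one summary row for a tested hyperparameter set."""
    return {
        "params": dict(params),
        "mean_roc_auc": mean(fold_scores),
        # Population std (ddof=0), the same convention as NumPy's default np.std.
        "std_roc_auc": pstdev(fold_scores),
    }


def report_trial(logger, trial_index, summary):
    """Report one trial; the trial index is used as the 'iteration' axis."""
    logger.report_scalar(title="cv_roc_auc", series="mean",
                         value=summary["mean_roc_auc"], iteration=trial_index)
    logger.report_scalar(title="cv_roc_auc", series="std",
                         value=summary["std_roc_auc"], iteration=trial_index)
    logger.report_text("trial {}: {}".format(trial_index, summary))


def run_search(estimator, param_distributions, X, y, n_iter=20, cv=5, seed=0):
    """Hypothetical driver: random search with per-trial reporting."""
    # Imported here so the helpers above stay usable without these packages.
    from clearml import Task
    from sklearn.base import clone
    from sklearn.model_selection import ParameterSampler, cross_val_score

    # Project and task names are placeholders.
    task = Task.init(project_name="examples", task_name="manual random search")
    logger = task.get_logger()

    summaries = []
    sampler = ParameterSampler(param_distributions, n_iter=n_iter, random_state=seed)
    for i, params in enumerate(sampler):
        model = clone(estimator).set_params(**params)
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        summary = summarize_trial(params, scores.tolist())
        report_trial(logger, i, summary)  # visible in the UI right away
        summaries.append(summary)

    # Best candidate by mean ROC AUC among the trials that completed.
    return max(summaries, key=lambda s: s["mean_roc_auc"])
```

Unlike RandomizedSearchCV, this sketch does not refit the best model on the full data and runs the candidates sequentially; both would have to be added by hand if needed.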