but out of curiosity, what's the point of doing a hyperparam search on the value of the loss at the last epoch of the experiment
The problem is that you might end up with a global min that is really nice, but it was 3 epochs ago, and you only have the last checkpoint ...
BTW, the global min and the last value shouldn't be very different if the model converges, wdyt?
I always save the checkpoint at the min/max loss, so that won't be a problem. We were having numerical discrepancies between the loss value for the checkpoint and the objective reported by the hyperparam optimizer, that's how we noticed.
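A minimal sketch of that best-checkpoint pattern, just for illustration (PyTorch and the made-up loss curve below are assumptions, not part of the actual setup):

```python
import torch
import torch.nn as nn

# toy stand-ins so the sketch runs end to end; swap in the real model / eval loop
model = nn.Linear(10, 1)
val_losses = [0.90, 0.41, 0.35, 0.52, 0.58]  # made-up curve: the min is 3 epochs before the end

best_val_loss = float("inf")
for epoch, val_loss in enumerate(val_losses):
    # always keep the last checkpoint
    torch.save(model.state_dict(), "last.pt")
    # also keep the weights at the min val loss, so the "best" objective
    # the optimizer sees always corresponds to a checkpoint on disk
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best.pt")
```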
In the case of overfitting (using val loss), the last and the min might not even be close, but maybe the hyperparam optimizer aborts in those cases? I am not too familiar with when the hyperparam optimizer does abort an experiment
but maybe the hyperparam optimizer aborts in those cases?
From the hyperparam optimizer's perspective, it will be trying to optimize the global minimum, basically "ignoring" the last value reported. Does that make sense?
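Roughly, the two scoring policies look like this (purely illustrative Python, not any specific optimizer's internals):

```python
def objective_last(reported_losses):
    # last-value behaviour: the experiment is scored by the last value it reported
    return reported_losses[-1]

def objective_global(reported_losses):
    # "min_global" behaviour: the experiment is scored by the best value it ever reported
    return min(reported_losses)

losses = [0.90, 0.41, 0.35, 0.52, 0.58]  # best value was 3 epochs before the end
print(objective_last(losses))    # 0.58 -> what gets compared under the last-value sign
print(objective_global(losses))  # 0.35 -> what gets compared under min_global
```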
... the one for the last epoch and not the best one for that experiment,
well
Now we realized there is an option to use
"min_global"
for the sign, is this what we need?
Yes 🙂 (or max_global)
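If this is ClearML's HyperParameterOptimizer (the min_global / max_global values match its objective_metric_sign option, though that's my assumption), the config would look roughly like this; the base task id, queue name, and parameter ranges are placeholders:

```python
from clearml import Task
from clearml.automation import (
    DiscreteParameterRange, HyperParameterOptimizer, RandomSearch, UniformParameterRange
)

# the optimizer itself runs as a controller task
Task.init(project_name="HPO", task_name="hpo controller", task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id="<id of the base training task>",  # placeholder
    hyper_parameters=[
        UniformParameterRange("General/learning_rate", min_value=1e-4, max_value=1e-1),
        DiscreteParameterRange("General/batch_size", values=[32, 64, 128]),
    ],
    objective_metric_title="validation",
    objective_metric_series="loss",
    objective_metric_sign="min_global",  # rank experiments by the best reported value, not the last one
    optimizer_class=RandomSearch,
    execution_queue="default",           # placeholder queue name
    max_number_of_concurrent_tasks=2,
    total_max_jobs=20,
)

optimizer.start()
optimizer.wait()
optimizer.stop()
```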
I think it does make sense and is what we were looking for.
Ok, we will give it a try, but out of curiosity, what's the point of doing a hyperparam search on the value of the loss at the last epoch of the experiment vs using the value of the min_loss over the full experiment?