Below is an example with one metric reported using multirun. This is taken from a single experiment result page, as all runs feed the same experiment. Unfortunately I have no idea what "1", for example, refers to. Is it possible to name each run, or to break them into several experiments?
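For context, a sketch of how Hydra numbers the jobs of a multirun sweep (the script name and swept parameter are made up for illustration):

```python
# Hedged sketch: a Hydra multirun such as
#   python train.py -m optimizer.lr=0.01,0.05,0.1
# launches one job per swept value; Hydra numbers them 0, 1, 2, ...
# and those job numbers are what end up next to the metrics in the UI.
sweep_values = [0.01, 0.05, 0.1]  # hypothetical sweep over optimizer.lr
for job_num, lr in enumerate(sweep_values):
    print(f"job {job_num}: optimizer.lr={lr}")
```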
but despite the naming it's working quite well actually
on one experiment it overlays the same metrics (not taking the run number into account)
but I have no idea what's behind 1, 2 and 3 compared to the first execution
This is why I would think multiple experiments, since they would store all the arguments (and I think these arguments are somehow being lost).
wdyt?
I am not really familiar with TB's internal mechanics. For this project we are using PyTorch Lightning
Right, I think the naming is a by-product of Hydra / TB
but when I compare experiments the run numbers are taken into account, comparing "1:loss" with "1:loss" and putting the "2:loss"es in a different graph
between Hydra, PL, TB and clearml I am not quite sure who is adding the prefix for each run
I think it would make sense to have one task per run to make the comparison on hyper-parameters easier
I agree. Could you maybe open a GitHub issue on it, I want to make sure we solve this issue 🙂
ClearML does
Thanks for doing that! :i_love_you_hand_sign:
but I have no idea what's behind 1, 2 and 3 compared to the first execution
It's a running number because PL is creating the same TB file for every run
So the naming is a by-product of the many TB files created (one per experiment); if you give the TB files different names, then that is what you'll see in the UI. Makes sense?
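To make that behavior concrete, a tiny sketch of the prefixing described above (this mimics what shows up in the UI, it is not ClearML's actual code):

```python
# Hedged sketch: when several TB event files feed the same task, the metrics
# from the N-th file appear with an "N:" prefix in the UI (illustrative only).
def prefixed_metrics(run_files):
    out = []
    for idx, metrics in enumerate(run_files, start=1):
        out.extend(f"{idx}:{name}" for name in metrics)
    return out

print(prefixed_metrics([["loss"], ["loss"], ["loss"]]))  # ['1:loss', '2:loss', '3:loss']
```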
GloriousPanda26 Are you getting multiple Tasks or is it a single Task ?
yes. As you can see this one has the hydra section reported in the config
the import order is not related to the problem
GloriousPanda26 wouldn't it make more sense that multi run would create multiple experiments ?
it's a single task which contains metrics for all 4 executions
the previous image was from the dashboard of one experiment
but to go back to your question, I think it would make sense to have one task per run to make the comparison on hyper-parameters easier
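As a sketch of the one-task-per-run idea: derive a distinct task name from the multirun job number and its overrides, and pass it to the task creation call (e.g. `clearml.Task.init`); the naming scheme below is made up for illustration:

```python
# Hedged sketch: a per-run task name built from the multirun job number and its
# overrides, so each execution becomes its own comparable task (illustrative scheme).
def run_task_name(base, job_num, overrides):
    return f"{base} #{job_num} ({', '.join(overrides)})"

print(run_task_name("train", 1, ["optimizer.lr=0.05"]))  # train #1 (optimizer.lr=0.05)
```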