So the naming is a by-product of the many TB files created (one per experiment); if you add different naming to the TB files, then this is what you'll be seeing in the UI. Make sense?
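For illustration, a minimal sketch of what that "different naming" could look like with PL's TensorBoardLogger (the experiment and run names here are assumptions, not from this thread):

```python
# Hedged sketch: give each run an explicit TB version so the files,
# and therefore the series shown in the UI, carry a readable name
# instead of an auto-incremented number.
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

run_name = "lr_0.01"  # hypothetical: derive this from the run's Hydra config

logger = TensorBoardLogger(save_dir="tb_logs", name="my_experiment", version=run_name)
trainer = Trainer(logger=logger, max_epochs=1)
```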
but I have no idea what's behind 1, 2 and 3 compared to the first execution
This is why I would think multiple experiments, since it will store all the arguments (and I think these arguments are somehow being lost).
wdyt?
yes. As you can see this one has the hydra section reported in the config
it's a single task which contains metrics for all 4 executions
the previous image was from the dashboard of one experiment
but to go back to your question, I think it would make sense to have one task per run to make the comparison on hyper-parameters easier
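A rough sketch of that idea, assuming a Hydra-decorated entry point (the project name, task name, and cfg.lr field are illustrative, not from this thread):

```python
# Hedged sketch: open a fresh ClearML Task inside the Hydra entry point,
# so every multirun job becomes its own Task with its own hyper-parameters.
import hydra
from omegaconf import DictConfig
from clearml import Task

@hydra.main(config_path="conf", config_name="config")
def train(cfg: DictConfig) -> None:
    Task.init(
        project_name="my_project",        # illustrative project name
        task_name=f"train_lr_{cfg.lr}",   # hypothetical: name each Task after its run
        reuse_last_task_id=False,         # force a new Task instead of reusing the last one
    )
    # ... training code ...

if __name__ == "__main__":
    train()
```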
I am not really familiar with TB's internal mechanics. For this project we are using PyTorch Lightning
but when I compare experiments the run numbers are taken into account: "1:loss" is compared with "1:loss", and the "2:loss" metrics are put in a different graph
between Hydra, PL, TB and ClearML I am not quite sure who is adding the prefix for each run
on one experiment it overlays the same metrics (not taking into account the run number)
Right, I think the naming is a by-product of Hydra / TB
but despite the naming it's working quite well actually
GloriousPanda26 wouldn't it make more sense for a multirun to create multiple experiments?
Below is an example with one metric reported using multirun. This is taken from a single experiment results page, as all runs feed the same experiment. Unfortunately I have no idea what 1 refers to, for example. Is it possible to name each run or to break them into several experiments?
GloriousPanda26 Are you getting multiple Tasks or is it a single Task?
It's a running number, because PL is creating a TB file with the same name for every run
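One plausible source of such a number is PL's default version auto-increment; a small sketch of that behaviour, assuming default TensorBoardLogger settings (not necessarily the exact mechanism at play here):

```python
# Hedged sketch: with no explicit version, TensorBoardLogger picks the next
# free version_N directory each time, which yields a running number per run.
from pytorch_lightning.loggers import TensorBoardLogger

for _ in range(3):
    logger = TensorBoardLogger(save_dir="tb_logs", name="my_experiment")
    _ = logger.experiment  # touching the writer creates the log directory
    print(logger.log_dir)  # tb_logs/my_experiment/version_0, then _1, then _2
```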
the import order is not related to the problem
I think it would make sense to have one task per run to make the comparison on hyper-parameters easier
I agree. Could you maybe open a GitHub issue on it? I want to make sure we solve this issue 🙂
ClearML does
Thanks for doing that! 🤟