thanks! so basically for experiments that are already finished I have no way to compare ATM, right?
right now the situation is problematic, because as I mentioned, I can't compare the training process between different batch sizes (or effective batch size, if I use a different number of GPUs)
Hmm... scaling these scalars while reporting might be a bit too much to do in the background. Don't you think you'll lose transparency, since the graphs you see in TB will be different from what you see in the system?
ShallowCat10 Thank you for the kind words 🙂
so I'll be able to compare the two experiments over time. Is this possible?
You mean like matching the loss based on "images seen"?
It doesn't really matter to me. One solution I had in mind is that this can be done by the web client on demand, meaning you can manually (or using the Task object) specify how many iterations constitute a single epoch, and instead of scaling, the plots will just be subsampled (or interpolated)
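A rough sketch of the subsampling idea, assuming the scalars are already available as (iteration, value) pairs. `subsample_per_epoch` and `iters_per_epoch` are hypothetical names for illustration, not part of any existing API:

```python
def subsample_per_epoch(points, iters_per_epoch):
    """Keep the last reported value of each epoch.

    points: list of (iteration, value) pairs, sorted by iteration.
    iters_per_epoch: user-supplied number of iterations in one epoch
                     (hypothetical parameter, e.g. dataset_size // batch_size).
    Returns a list of (epoch_index, value) pairs, sorted by epoch.
    """
    if iters_per_epoch <= 0:
        raise ValueError("iters_per_epoch must be positive")
    last_in_epoch = {}
    for iteration, value in points:
        epoch = iteration // iters_per_epoch
        last_in_epoch[epoch] = value  # later points overwrite earlier ones
    return sorted(last_in_epoch.items())
```

With this, two runs on the same dataset but with different batch sizes would each be reduced to one point per epoch, so their curves share an x-axis even though the raw iteration counts differ.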
I mean, you can get the results manually and rescale them, but not through the UI
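For the manual route, something along these lines might work once the scalars have been exported as (iteration, value) pairs. This is only a sketch: `to_samples_seen` and `interpolate` are hypothetical helpers, and it assumes one iteration consumes `batch_size * num_gpus` samples:

```python
from bisect import bisect_right


def to_samples_seen(points, batch_size, num_gpus=1):
    """Convert (iteration, value) pairs to (samples_seen, value) pairs.

    Assumption: each iteration processes batch_size * num_gpus samples.
    """
    effective_batch = batch_size * num_gpus
    return [(iteration * effective_batch, value) for iteration, value in points]


def interpolate(points, x):
    """Linearly interpolate a curve at x, clamping outside the recorded range.

    points: non-empty list of (x, y) pairs sorted by x.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # now xs[i - 1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

The idea is to convert both runs to "samples seen", then call `interpolate` on each at the same x values, so a run with batch size 32 and one with batch size 64 can be compared point by point.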
So obviously the straightforward solution is to normalize the step value when reporting to TB, i.e. int(step/batch_size), which makes sense as I suppose the batch size is known and is part of the hyper-parameters. The normalization itself can be done when comparing experiments in the UI, and the backend can do that if given the correct normalization parameter. I think this feature request should actually be posted on GitHub, as it is not as simple as one might think (the UI needs to allow you to select a parameter for comparison, then the question is whether we normalize all the scalars or just a few, etc.)
Anyhow if we have enough people interested we can definitely add it :)