Hi CheerfulGorilla72,
What are you logging? Can you provide a small snippet or a screenshot?
DeterminedCrab71 that is a good point, how does plotly adjust for nans on graphs?
CostlyOstrich36
Will wait :)
It's not nice that this logging is misleading.
If the choice is between skipping the value and logging it as NaN, that one is harder; it seems better to log than to skip, but it needs more thought.
So I "think" the issue is that plotly (the UI) doesn't like NaN, and elastic (which stores the scalar) is not a NaN fan either. We need to check whether they both agree on the representation; then it should be easy to fix...
Maybe you could open a GitHub issue, so we don't forget?
[ctrl + C] [ctrl + V]
https://github.com/allegroai/clearml/issues/604
class LitMNIST(LightningModule):
    ...
        self.log('test/test_nan', np.nan, prog_bar=False, logger=True, on_step=True, on_epoch=False)
    ...
CheerfulGorilla72, I will take a look soon 🙂
AgitatedDove14 CostlyOstrich36 CheerfulGorilla72
please note, NaN isn't part of the JSON spec, and only Python's JSON implementation supports it.
So you either have to convert NaN to 0, as we chose to do, or drop the values.
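To illustrate the point about the spec: Python's json module happily emits a non-standard NaN token by default, which a strict parser (like JSON.parse in the browser) would reject. A minimal sketch, with a hypothetical sanitize helper showing the "convert to 0" workaround:

```python
import json
import math

# Python emits the non-standard token NaN by default -- this is NOT valid JSON
print(json.dumps({"v": float("nan")}))  # → '{"v": NaN}'

# With strict spec compliance enabled, serialization fails instead of
# producing invalid output:
try:
    json.dumps({"v": float("nan")}, allow_nan=False)
except ValueError as e:
    print(e)

# One workaround (hypothetical helper): replace NaN with 0 before serializing
def sanitize(x):
    return 0 if isinstance(x, float) and math.isnan(x) else x

print(json.dumps({"v": sanitize(float("nan"))}))  # → '{"v": 0}'
```

The trade-off the thread is debating is exactly this: sanitizing keeps the payload spec-compliant, but silently turns "no data" into a real-looking value of 0.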
AgitatedDove14
If I had to choose between logging and not logging, I would choose logging.
If the choice is between logging as 0 or as NaN, I would choose NaN.
If the choice is between skipping the value and logging it as NaN, that one is harder; it seems better to log than to skip, but it needs more thought.
For the most part we are used to TensorBoard, where NaN is logged in a special way, and that behavior feels natural.
AgitatedDove14
In the browser, a NaN will make JSON.parse() throw, so we can't have them on graphs.
CostlyOstrich36
*If the agent did not perform a certain action, then its average reward per episode for that action will be NaN, not 0.
CostlyOstrich36
usability of the pytorch_lightning logger
we log the average reward of each action for the RL agent.
If the agent did not perform this action in the current episode, then its average reward will be NaN, not 0, for obvious reasons. And we would like it visualized the same way as in TensorBoard, for the sake of informativeness.
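The use case above can be sketched as follows (the data layout and helper names are hypothetical, just to show why NaN is the natural result):

```python
# Per-episode average reward per action for an RL agent.
# action -> list of rewards received for that action this episode
rewards = {0: [1.0, 0.5], 1: [], 2: [0.0]}

def mean_reward(samples):
    # A mean over zero samples is undefined → NaN.
    # This keeps "action never taken" distinct from "action taken, reward 0".
    return sum(samples) / len(samples) if samples else float("nan")

avgs = {action: mean_reward(r) for action, r in rewards.items()}
print(avgs)  # action 1 → nan, because it was never taken
```

Logging avgs[1] as 0 instead of NaN would make an untried action look identical to an action that was tried and earned zero reward, which is exactly the distinction the graphs are meant to show.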
CheerfulGorilla72
Update: I see NaN in TensorBoard, and 0 in ClearML.
I have to admit, since NaNs are actually skipped in the graph, should we log them at all?
CheerfulGorilla72, can you point me to where in the script the reported scalars are?
I think this might be happening because you can't report None to Logger.report_scalar(), so the auto-logging assigns it some value: 0.
What is your use case? If the value of the scalar is None, why log it?