Oh, that makes sense. Technically I assume so. Is this an HF logger option? Note that ClearML is already integrated with HF on the HF side; do they report that when the TB logger is used?
I mean that the HF Trainer by default reports a single grad_norm scalar for the whole model to ClearML. I wonder if I can extend this to report grad_norm per layer.
I guess they don’t. Is there an easy way to add callbacks to the HF Trainer for reporting extra info?
Hi PanickyBee11
You mean this is not automatically logged? Do you have a callback that logs it in HF?
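For reference, here is a minimal sketch of such a callback, assuming a recent transformers version that exposes on_pre_optimizer_step (gradients are still populated there; on_step_end fires after they are zeroed). The class name PerLayerGradNormCallback is just illustrative:

```python
from clearml import Task
from transformers import TrainerCallback


class PerLayerGradNormCallback(TrainerCallback):
    """Report the L2 gradient norm of each named parameter to ClearML."""

    def on_pre_optimizer_step(self, args, state, control, model=None, **kwargs):
        # Skip if no ClearML task is running or the model was not passed in.
        task = Task.current_task()
        if task is None or model is None:
            return
        logger = task.get_logger()
        for name, param in model.named_parameters():
            if param.grad is not None:
                logger.report_scalar(
                    title="grad_norm_per_layer",
                    series=name,  # one series per parameter tensor
                    value=param.grad.detach().norm(2).item(),
                    iteration=state.global_step,
                )
```

Then pass it to the Trainer, e.g. Trainer(..., callbacks=[PerLayerGradNormCallback()]). Note this logs one scalar per parameter tensor, which can be a lot of series for a large model; you may want to aggregate by module instead.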