Hey, just for your information:
I replicated pytorch_lightning_example.py and it works fine (model/artifact saving) with the old Lightning interface (I mean "import pytorch_lightning as pl", as it is right now in the example), but the issue occurs when I try to use the new Lightning interface ("import lightning.pytorch as pl"). Hope it helps somehow, and still looking forward to a fix 🙂
Hi @<1554638160548335616:profile|AverageSealion33> , are you sure the previous clearml version was 1.10.1? This version is only about a week old
Can you please check with the latest 1.10.2 SDK version whether the checkpointing issue still happens? As for the example code that couldn't be reproduced, we're already working on it and should have a fix in the next minor SDK version
It may indeed be, thanks for letting us know, we’ll try to replicate it
Hey, yes, the reason for this issue seems to be our currently limited support for Lightning 2.0. We will improve the support in the following releases. Right now, one way to circumvent this issue that I can recommend is to use torch.save if possible, because we fully support automatic model capture on torch.save calls.
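For anyone who wants to try the torch.save workaround, here is a minimal sketch of what it could look like. It is not the official example: the helper names `save_weights` and `run_with_clearml`, the project and task names, and the output path are all hypothetical, and `nn.Linear` only stands in for a real LightningModule (which is also an `nn.Module`). It assumes the automatic capture of torch.save calls described above picks up the saved file once a task has been initialized.

```python
import os
import tempfile

import torch
from torch import nn


def save_weights(model: nn.Module, path: str) -> str:
    """Save the model's state_dict with a plain torch.save call and return the path."""
    torch.save(model.state_dict(), path)
    return path


def run_with_clearml() -> None:
    """Hypothetical entry point; requires a configured ClearML installation."""
    # Imported lazily so save_weights can be used without ClearML installed.
    from clearml import Task

    # Project and task names are placeholders, not taken from the thread.
    Task.init(project_name="examples", task_name="torch.save workaround")

    # Stand-in for a LightningModule; training would happen before saving.
    model = nn.Linear(4, 2)
    save_weights(model, os.path.join(tempfile.gettempdir(), "model.pt"))
```

With Lightning, the same helper could be called from a hook such as `on_train_epoch_end`, passing the LightningModule itself as `model`.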
Good catch 🙂 my mistake. It was 1.7.2
I've upgraded from 1.7.2 to 1.10.2 and the problem still occurs with the latest version. I have a feeling that it is related to the major changes in Lightning 2+, but that's only my intuition. It would be nice if you could check it with your simple example code. Looking forward to the updates, thanks.