This is pretty weird. PL should only save from rank 0:
https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/trainer/connectors/checkpoint_connector.py#L394
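For anyone following along, here is a minimal sketch of what "only save from rank 0" means in practice. It is not Lightning's actual implementation: `get_rank`, `rank_zero_only`, and `save_checkpoint` are hypothetical names written for this example, the rank is assumed to come from the `RANK` environment variable (which launchers such as torchrun export), and `pickle` stands in for `torch.save` so the snippet runs without PyTorch.

```python
import functools
import os
import pickle


def get_rank() -> int:
    # Assumption: the launcher exports RANK; fall back to 0 for single-process runs.
    return int(os.environ.get("RANK", "0"))


def rank_zero_only(fn):
    """Run fn only on the process with global rank 0; other ranks get None."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if get_rank() == 0:
            return fn(*args, **kwargs)
        return None

    return wrapper


@rank_zero_only
def save_checkpoint(state: dict, path: str) -> str:
    # pickle is a stand-in for torch.save to keep the sketch dependency-free.
    with open(path, "wb") as f:
        pickle.dump(state, f)
    return path
```

With a guard like this, every process can call `save_checkpoint`, but only rank 0 writes the file. If a logger hooks the save call itself rather than the file write, it could still see one call per process, which may be what is showing up here.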
No duplicate experiments are running. Which repo should I report this to: clearml-agent, clearml, or clearml-server?
Hi! Looks like all the processes are calling torch.save, so it's probably reflecting what Lightning did behind the curtain. Definitely not a feature though. Do you mind reporting this to our GitHub repo? Also, are you getting duplicate experiments?
DefeatedOstrich93 can you verify that Lightning actually only stored the checkpoint once?