Hello, we use ClearML with a torch.distributed (DDP, on only 1 machine but with multiple processes) training, and we found that ClearML intercepts and changes the exit code of our process (i.e. exit(1) does not exit 1 anymore), and torch.multiprocessing.spa
VirtuousFish83 Hi 🙂
Which versions are you running (ClearML, ClearML-Agent, Torch, Lightning)? Which OS are they running on, and with which Python version?
Do you maybe have a snippet to play around with to try and reproduce the issue?
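For reference, a minimal sketch of the kind of snippet that could help narrow this down. It is not the original poster's code: it uses a plain `subprocess` child as a stand-in for the workers started by torch.multiprocessing, and the names `CHILD_SOURCE` and `child_exit_code` are made up for illustration. It only checks that the exit code a child requests is the one the parent observes; ClearML itself is deliberately left out so the sketch runs anywhere.

```python
import subprocess
import sys

# Hypothetical child script: exits with the code passed on the command line.
# To probe the reported behaviour, one would add the ClearML initialisation
# used in the real training script at the top of this source and compare
# the observed codes with and without it. That step is an assumption and is
# left out here so the sketch has no third-party dependencies.
CHILD_SOURCE = "import sys; sys.exit(int(sys.argv[1]))"


def child_exit_code(requested: int) -> int:
    """Run the child in a fresh interpreter and return the exit code the parent sees."""
    result = subprocess.run([sys.executable, "-c", CHILD_SOURCE, str(requested)])
    return result.returncode


if __name__ == "__main__":
    # Print requested vs. observed exit codes side by side.
    for requested in (0, 1, 3):
        print(f"requested={requested} observed={child_exit_code(requested)}")
```

If the observed code differs from the requested one only once ClearML is initialised in the child, that difference, together with the version details asked for above, would make the report much easier to reproduce.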