Unanswered
Hello, we use ClearML with a torch.distributed (DDP, on only 1 machine but with multiple processes) training, and we found that ClearML intercepts and changes the exit code of our process (i.e. exit(1) does not exit 1 anymore), and torch.multiprocessing.spa
VirtuousFish83, is the exit(1) called from the main process or from a subprocess? Are you running it with an agent?
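One way to narrow this down is to observe the exit code from outside the process. The snippet below is only a minimal sketch using the Python standard library: `run_and_report` and `BASELINE` are hypothetical names introduced for illustration, and the baseline child does not import ClearML at all. It only establishes what the exit code looks like without any interception, so that the same check can then be repeated with the real training entry point.

```python
import subprocess
import sys

# Hypothetical baseline: a child interpreter that simply calls sys.exit(1).
# No ClearML involved here; this is the "expected" behaviour to compare against.
BASELINE = "import sys; sys.exit(1)"


def run_and_report(code: str) -> int:
    """Run `code` in a fresh Python interpreter and return its exit code."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


if __name__ == "__main__":
    # Replace BASELINE with a snippet that initialises ClearML before exiting
    # (an assumption about how the training script is set up) and compare.
    print("exit code seen by the parent:", run_and_report(BASELINE))
```

If the baseline reports 1 but the same check against the actual script reports something else, the difference is being introduced inside that process, which is the distinction the question about main process versus subprocess is getting at.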
3 years ago
one year ago