Unanswered
Hey guys, I'm experiencing seemingly random problems with the experiments. There are 4 GPUs and 8 workers (2 workers per GPU), and sometimes experiments randomly fail (or complete) in the middle of the epoch without any additional info in the logs. What
It might be that there is not enough space on our SSD; experiments cache a lot of preprocessed data during the first epoch...
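
One way to test the disk-space theory is to log the free space on the cache drive while the first epoch runs. Below is a minimal sketch in plain Python, not tied to any particular experiment framework. The function names, the `"."` path, and the 10 GiB threshold are hypothetical placeholders; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil


def free_space_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30


def has_enough_space(path=".", min_free_gib=10.0):
    """Return True if at least `min_free_gib` GiB are free at `path`.

    The 10 GiB default is an arbitrary placeholder, not a recommendation.
    """
    return free_space_gib(path) >= min_free_gib


if __name__ == "__main__":
    # Hypothetical cache location; point this at the directory on the SSD
    # where the experiments write their preprocessed data.
    cache_dir = "."
    print(f"free space at {cache_dir!r}: {free_space_gib(cache_dir):.1f} GiB")
    if not has_enough_space(cache_dir):
        print("warning: free space is below the threshold")
```

Calling `free_space_gib` from each worker every few hundred batches and writing the value to the log would show whether the failures line up with the drive filling up.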
194 Views
0 Answers
4 years ago
one year ago