Hi, if I have ClearML agents installed on several servers, each with a single GPU, how can I train a GPT-2 model that would require multiple GPUs?
IMHO ClearML would just start the execution on multiple hosts; distributing the training itself is up to your training code. Keep in mind that the hosts need to be on the same LAN, with a very high-bandwidth connection between them.
What you are looking for is called "DistributedDataParallel" in PyTorch. Maybe this tutorial gives you a starting point: (link not preserved)
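To make the DistributedDataParallel suggestion a bit more concrete, here is a minimal sketch of what the training side could look like in PyTorch. It is not taken from the tutorial mentioned above. The toy `nn.Linear` model, the random data, and the function name `train_steps` are all assumptions for illustration; a real GPT-2 run would swap in the actual model and a dataset with a `DistributedSampler`. The environment variables (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`, `MASTER_ADDR`, `MASTER_PORT`) are the ones `torchrun` sets; the defaults below only exist so the script also runs as a single process.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def train_steps(steps: int = 3) -> float:
    """Run a few DDP training steps on a toy model and return the last loss."""
    # torchrun normally provides these; the defaults allow a one-process smoke test.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))

    use_cuda = torch.cuda.is_available()
    # NCCL is the usual backend for GPUs; gloo works on CPU-only machines.
    dist.init_process_group(
        backend="nccl" if use_cuda else "gloo", rank=rank, world_size=world_size
    )
    try:
        if use_cuda:
            device = torch.device("cuda", local_rank)
            torch.cuda.set_device(device)
        else:
            device = torch.device("cpu")

        # Toy stand-in for the real model (assumption, not GPT-2).
        model = nn.Linear(16, 1).to(device)
        # DDP copies rank 0's weights to every process at construction time.
        ddp_model = DDP(model, device_ids=[local_rank] if use_cuda else None)
        optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
        loss_fn = nn.MSELoss()

        # Different seed per rank so each process sees different (random) data,
        # standing in for what a DistributedSampler would do with a real dataset.
        torch.manual_seed(rank)
        loss = torch.tensor(0.0)
        for _ in range(steps):
            x = torch.randn(8, 16, device=device)
            y = torch.randn(8, 1, device=device)
            optimizer.zero_grad()
            loss = loss_fn(ddp_model(x), y)
            loss.backward()  # gradients are averaged across all processes here
            optimizer.step()
        return loss.item()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(f"final loss: {train_steps():.4f}")
```

With one GPU per server, each host would start one process, for example `torchrun --nnodes=2 --nproc_per_node=1 --node_rank=0 --master_addr=<master-host> --master_port=29500 train.py` on the first host and the same command with `--node_rank=1` on the second (`train.py` and `<master-host>` are placeholders). One caveat: DDP is data parallelism, so every GPU holds a full copy of the model and only the batches are split. It speeds training up, but it does not by itself help if the model does not fit on a single GPU.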