Unanswered
Hi 🙂
We have a ClearML pipeline with a step that runs multi-GPU training (with Hugging Face). The training has to be invoked with `accelerate launch`, so we call `subprocess.run` inside the step, but it hangs once the training finishes.
Is this the right way to do this? Any idea why it hangs?
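For reference, here is a minimal sketch of the kind of call in question, written with `subprocess.Popen` so that a timeout can be enforced. It does not diagnose the hang. It only guards against two general subprocess pitfalls: a child waiting on stdin, and worker processes outliving the launcher. The helper name `run_launcher`, its `timeout` and `grace` parameters, and the script name `train.py` are hypothetical; the sketch is POSIX-only and assumes the workers stay in the launcher's process group.

```python
import os
import signal
import subprocess
import sys


def run_launcher(cmd, timeout=None, grace=10):
    """Run cmd in its own process group and return its exit code (POSIX only)."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,  # the child can never block waiting for input
        start_new_session=True,    # new session, so the child leads its own process group
    )
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal the whole group so processes spawned by the launcher receive it too.
        os.killpg(proc.pid, signal.SIGTERM)
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: force-kill the group.
            os.killpg(proc.pid, signal.SIGKILL)
            return proc.wait()


# Hypothetical usage inside the pipeline step (train.py is a placeholder name):
# exit_code = run_launcher(["accelerate", "launch", "train.py"], timeout=6 * 60 * 60)
```

A negative return value means the process was terminated by a signal, which makes a timed-out run easy to tell apart from a normal exit.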
one year ago