Hi 🙂
We have a ClearML pipeline with a step that runs multi-GPU training (with Hugging Face). We need to invoke it with `accelerate launch`, so we use `subprocess.run` inside the step, but it hangs when the training finishes.
Is this the right way to do this? Any idea why it hangs?
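
For illustration, the setup described above looks roughly like the sketch below. It is an assumption about the shape of the call, not the actual pipeline code: the script name `train.py`, the process count, and the helper names `build_launch_command` and `run_command` are all hypothetical. The only addition beyond a plain `subprocess.run` call is the optional `timeout`, which makes a hang surface as a `TimeoutExpired` exception instead of blocking the step indefinitely.

```python
import subprocess
import sys


def build_launch_command(script, num_processes, script_args=()):
    """Build the argv list for `accelerate launch`.

    The script path and process count are supplied by the caller;
    the values used in the example below are placeholders.
    """
    return [
        "accelerate", "launch",
        "--num_processes", str(num_processes),
        script,
        *script_args,
    ]


def run_command(cmd, timeout=None):
    """Run cmd, wait for it to exit, and return its exit code.

    check=True raises CalledProcessError on a non-zero exit code.
    timeout (in seconds) raises TimeoutExpired rather than waiting forever.
    """
    completed = subprocess.run(cmd, check=True, timeout=timeout)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical values: "train.py" and 2 processes are illustrations only.
    cmd = build_launch_command("train.py", num_processes=2)
    print(cmd)
    # Inside the pipeline step this would be followed by something like:
    # run_command(cmd, timeout=6 * 60 * 60)
```

Passing the command as a list avoids shell quoting issues, and keeping the command construction separate from the call makes it easy to log exactly what the step launched.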