Unanswered
Is There A Way To Configure A Clearml-Agent So That It Shutdown The Server After It Has Been Idle For A Certain Time Period? We Are Using Gpu Resources From A Provider That Autoscaling Doesn'T Support (Such As Sagemaker Training Jobs).
Glad to see it works (thanks for sharing HighRaccoon77 ).
I have a question on Dynamic GPU allocation , disregarding any autoscaling considerations:
Let’s say we spin up a clearML agent on an 8 GPU instance (via a launcher script as HighRaccoon77 is doing), with --dynamic-gpus enabled, catering to 2 gpu queue and a 4 gpu queue. The agent pulls in a new task that only requires 2 GPU’s, and while that task is ongoing, a new task that requires 4 GPU’s is placed in the 4 GPU queue. Does the agent need to complete the 2 GPU first task before launching the 4 GPU task? Or can they run concurrently? CostlyOstrich36
184 Views
0
Answers
one year ago
one year ago