Hi everyone, I’m new to ClearML and server administration. We are considering tools to manage a DGX H100 server. Ideally, the tool could provide "sandboxes" that are already equipped with all the necessary tools and frameworks. This way, each team member
Hi @IdealCamel64, I think ClearML would be perfect for that. You can also let users open their own remote sessions directly on the GPUs (even inside a container). I'd check out ClearML's orchestration layer and remote sessions.
Regarding what @ManiacalLizard2 said, he's wrong, I'm afraid. ClearML can run on top of K8s if needed, and of course the agent supports running inside Docker containers as well.
I think that one of ClearML's strengths is allowing you to manage/administer not only a single H100 server but many of those under whatever requirements you might have.
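To make the "agent in Docker plus remote sessions" idea a bit more concrete, here is a rough sketch that only builds and prints the commands you might run on the server. It assumes one `clearml-agent` daemon per GPU (a DGX H100 has eight) and a `clearml-session` request sent to the same queue. The queue name `dgx_h100` and the image `your-registry/team-sandbox:latest` are placeholders I made up, and you should check the flags against the agent and session docs for your installed version before relying on them.

```python
import shlex

NUM_GPUS = 8                                  # a DGX H100 has eight GPUs
QUEUE = "dgx_h100"                            # hypothetical queue name
IMAGE = "your-registry/team-sandbox:latest"   # hypothetical "sandbox" image


def agent_command(gpu_index, queue, image):
    """Build the argv for one agent daemon pinned to a single GPU.

    The agent pulls tasks from `queue` and runs each one inside `image`.
    """
    return [
        "clearml-agent", "daemon",
        "--queue", queue,
        "--gpus", str(gpu_index),
        "--docker", image,
        "--detached",
    ]


def session_command(queue, image):
    """Build the argv a team member could run to request a remote session."""
    return ["clearml-session", "--queue", queue, "--docker", image]


# One agent per GPU, so each queued task or session gets a whole GPU.
agent_commands = [agent_command(i, QUEUE, IMAGE) for i in range(NUM_GPUS)]

for cmd in agent_commands:
    print(shlex.join(cmd))
print(shlex.join(session_command(QUEUE, IMAGE)))
```

Pinning one agent per GPU is just one possible layout; you could equally give an agent several GPUs or use separate queues per team, depending on how you want to share the machine.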