Hi everyone, I’m new to ClearML and server administration. We are considering tools to manage a DGX H100 server. Ideally, the tool could provide "sandboxes" that are already equipped with all the necessary tools and frameworks. This way, each team member

Answered

Hi everyone, I’m new to ClearML and server administration. We are considering tools to manage a DGX H100 server. Ideally, the tool would provide "sandboxes" that come equipped with all the necessary tools and frameworks. This way, each team member can work in an isolated, pre-configured environment without having to set everything up manually.
My question: Is ClearML suitable for these requirements?

  				
Posted one year ago by IdealCamel64


Answers 5

Hi @IdealCamel64, I think ClearML would be perfect for that. You can also give users their own remote sessions directly on the GPUs (even inside a container). I'd check out ClearML's orchestration layer and remote sessions.

Regarding what @ManiacalLizard2 said, he's wrong, I'm afraid. ClearML can run on top of K8s if needed, and of course the agent supports running inside Docker containers as well.

I think that one of ClearML's strengths is allowing you to manage/administer not only a single H100 server but many of those under whatever requirements you might have.
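
To make the agent-in-Docker setup a bit more concrete, here is a minimal sketch rather than a definitive recipe. It assumes a `clearml-agent` worker on the DGX started in Docker mode and a script that hands itself to that worker's queue. The queue name `dgx_queue`, the image `my-registry/team-sandbox:latest`, and the helper names `agent_command` and `submit_to_queue` are made up for illustration; check the agent flags and SDK calls against your installed ClearML version.

```python
from typing import List


def agent_command(queue: str, image: str, gpus: str) -> List[str]:
    """Build a clearml-agent command line for a Docker-mode worker.

    The queue name and image passed in are placeholders for this example.
    """
    return [
        "clearml-agent", "daemon",
        "--queue", queue,    # queue this worker pulls jobs from
        "--docker", image,   # default container image for jobs
        "--gpus", gpus,      # GPU indices exposed to this worker
    ]


def submit_to_queue(project: str, name: str, image: str, queue: str) -> None:
    """Register the current script as a task and hand it to a queue."""
    # Imported lazily so agent_command() works without ClearML installed.
    from clearml import Task

    task = Task.init(project_name=project, task_name=name)
    # Ask the agent to run this task inside the given container image.
    task.set_base_docker(docker_image=image)
    # Stops local execution and enqueues the task for an agent to run.
    task.execute_remotely(queue_name=queue, exit_process=True)
```

Joining `agent_command("dgx_queue", "my-registry/team-sandbox:latest", "0,1")` with spaces gives the command an admin would run on the server, while each team member would call something like `submit_to_queue(...)` at the top of their training script. Splitting the GPUs across several workers and queues is one way to approximate per-user "sandboxes".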

  				
Posted one year ago by CostlyOstrich36

@CostlyOstrich36 Thanks for your reply!
I have a couple of follow-up questions:

1. Are the features mentioned (orchestration layer, remote sessions, etc.) available for testing in the free version of ClearML?
2. Given the following scenario: in our team, some members prefer ClearML for experiment tracking, while others want to use MLflow. Can we use ClearML to handle server monitoring and orchestration while still letting users choose their preferred experiment tracking tool?

Thanks again for the help!

  				
Posted one year ago by IdealCamel64

@IdealCamel64, to address your questions:

1. Yes.
2. Yes, but as @ManiacalLizard2 said, let your users try it and I'm sure they'll prefer ClearML 🙂

  				
Posted one year ago by CostlyOstrich36

If you want to replace MLflow with ClearML: do it!! It's like asking "Should I wear sandals or running shoes for my next marathon ..."
Let your users try ClearML, and I am pretty sure all of them will want to switch over!!!

  				
Posted one year ago by ManiacalLizard2

Feels like Docker or Kubernetes is a better fit for that purpose ...

  				
Posted one year ago by ManiacalLizard2
