Hi All! Question Around Resource Management Using

Answered

Hi all! Question around resource management using HyperParameterOptimizer , more specifically GPU management.
Say I’m running an optimisation task on a machine that has 4 GPUs. Each time a new task is cloned, they started running the code on the default GPU. GPU memory is sufficient for running one task. However, if there’s 4 concurrent jobs running on the default GPU, they end up running into CUDA out-of-memory error.
Is there a way to manage each task within the optimizer so that they occupy the remaining free GPUs rather than running on the same one?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					DefeatedMoth52
				
					0
					 × 1

Votes Newest

Answers 8

DefeatedMoth52 how many agents do you have running on the same GPU ?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

AgitatedDove14 Since the agents are running on the server and set-up by k8s, more than 1 agent can run on the same GPU. So far I’ve tried running up to 4 concurrent tasks. This means if they all get cloned to use the same GPU, max agent would be 4.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					DefeatedMoth52
				
					0
					 × 1

Oh that makes sense, This depends on how you setup the clearml k8s glue, (becuase the resource allocation is done by k8s) a good hack to limit the number of containers per GPU is to set a RAM limitation per pod, then k8s will know to limit the number of pods on the same GPU machine,
wdty?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

Hi Martin, thanks for the explanation! I work with Maggie and help with the ClearML setup.

Just to be sure, currently, the PodTemplate contains:

resources: limits: nvidia.com/gpu: 1
you are suggesting to add also, something like:
requests: memory: "100Mi" limits: memory: "200Mi"is that correct?

On a related note, I am a bit puzzled by the fact that all the 4 GPUs are visible.
In the https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/ , it says:
Containers (and Pods) do not share GPUs. There's no overcommitting of GPUs.so I am surprised that this situation happens.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SarcasticSquirrel56
				
					0
					 × 1

Containers (and Pods) do not share GPUs. There's no overcommitting of GPUs.Actually I am as well, this is Kubernets doing the resource scheduling and actually Kubernetes decided it is okay to run two pods on the Same GPU, which is cool, but I was not aware Nvidia already added this feature (I know it was in beta for a long time)
https://developer.nvidia.com/blog/improving-gpu-utilization-in-kubernetes/
I also see thety added dynamic slicing and Memory Proteciton:
Notice you can control the number of pods per GPU

This mechanism for enabling “time-sharing” of GPUs in Kubernetes allows a system administrator to define a set of “replicas” for a GPU, each of which can be handed out independently to a pod to run workloads on. Unlike MIG, there is no memory or fault-isolation between replicas, but for some workloads this is better than not being able to share at all. Internally, GPU time-slicing is used to multiplex workloads from replicas of the same underlying GPU.

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/gpu-sharing.html#introduction
https://github.com/NVIDIA/k8s-device-plugin#shared-access-to-gpus-with-cuda-time-slicing
Lastly are you using MIG enabled devices? If so you can limit the memory per shared Pod:
https://github.com/NVIDIA/k8s-device-plugin/blob/e2b4ff39b5b4cebe702c8aa102b914b03f6eb81d/README.md#configuration-option-details

Back to the original remark SarcasticSquirrel56 limiting the Pod allocation can also be done via the general k8s "memory limit" requirement, which will take place on top of the GPU plugin, so essentially let's say we have a node with 100GB RAM we can have a pod limit of 25GB, which means no more than 4 pods will be running on the same node.
Does that make sense ?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

Hi Martin, I admit I don't know about MIG I'll have to ask some of our engineers.

As for the memory, yes the reasoning is clear, the main thing we'll have to see is hot define the limits, because we have nodes with quite different resources available, and this might get tricky, but I'll try and let's see what happens 🙂

We actually plan to create different queues for different types of workloads, we are a bit seeing what the actual usage is to define what type of workloads make sense for us.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SarcasticSquirrel56
				
					0
					 × 1

We actually plan to create different queues for different types of workloads, we are a bit seeing what the actual usage is to define what type of workloads make sense for us.

That sounds like a great path to take, it will make it very clear fro users on what they will be getting and why they should use specific queue.

As for the memory, yes the reasoning is clear, the main thing we'll have to see is hot define the limits, because we have nodes with quite different resources available, and this might get tricky, but I'll try and let's see what happens

Not sure if it helps, but I think this resource limitation (or book-keeping if you will) is one of the advanced feature of the enterprise version, But I would probably start with something simple just to get going before jumping to it.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

Thanks Martin!

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SarcasticSquirrel56
				
					0
					 × 1

Write your answer

2K Views

8 Answers

2 years ago