Happy Friday Everyone ! We Have A New Repo Release We Would Love To Get Your Feedback On

Answered

Happy Friday everyone !
We have a new repo release we would love to get your feedback on 🚀
🎉 Finally easy FRACTIONAL GPU on any NVIDIA GPU 🎊
Run our nvidia-cuda flavor containers and get driver level memory limit ! ! !
Which means multiple containers on the same GPU will not OOM each other!
Let us know what you think - None
You can test now with ✨

docker run -it --rm --gpus 0 --pid=host clearml/fractional-gpu:u22-cu12.3-2gb nvidia-smi

notice that

nvidia-smi

inside the container reports total of only 2GB instead of your maximum GPU memory
full list of containers / mem limit in the

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

Votes Newest

Answers 10

is it in the OSS version too?

  				
Posted 
	one year ago

					More
				  		
  Report
		
					PerplexedRaccoon19
				
					0
					 × 1

@<1535069219354316800:profile|PerplexedRaccoon19>

is it in the OSS version too?

Yep, free of charge ❤

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

@<1524922424720625664:profile|TartLeopard58> @<1545216070686609408:profile|EnthusiasticCow4>
Notice that when you are spinning multiple agents on the same GPU, the Tasks should request the "correct" fractional GPU container, i.e. if they pick a "regular" no mem limit.
So something like

CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --gpus 0 clearml/fractional-gpu:u22-cu12.3-2gb
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --gpus 0 clearml/fractional-gpu:u22-cu12.3-2gb

Also remeber to add --pid=host to your conf file extra_docker_arguments
None

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

I’m also curious if it’s available to bind the same GPU to multiple queues.

  				
Posted 
	one year ago

					More
				  		
  Report
		
					TartLeopard58
				
					0
					 × 1

How does it work with k8s?

You need to install the clearml-glue and them on the Task request the container, notice you need to preconfigure the clue with the correct Job YAML

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

Is there currently a way to bind the same GPU to multiple queues? I believe the agent complains last time I tried (which was a bit ago).

  				
Posted 
	one year ago

					More
				  		
  Report
		
					EnthusiasticCow4
				
					0
					 × 1

AMAZING!

  				
Posted 
	one year ago

					More
				  		
  Report
		
					PerplexedRaccoon19
				
					0
					 × 1

@<1545216070686609408:profile|EnthusiasticCow4>

Is there currently a way to bind the same GPU to multiple queues? I believe the agent complains last time I tried (which was a bit ago)

run multiple agents on the same GPU,

CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --gpus 0
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --gpus 0

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

How does it work with k8s? how can I request the two pods to sit on the same gpu?

  				
Posted 
	one year ago

					More
				  		
  Report
		
					PerplexedRaccoon19
				
					0
					 × 1

That's great! I look forward to trying this out.

  				
Posted 
	one year ago

					More
				  		
  Report
		
					EnthusiasticCow4
				
					0
					 × 1

Write your answer

2K Views

10 Answers

one year ago