Is there any way to, like, load-balance automatically? Like, on the user end can I just specify an amount of GB I think I will need, and it goes and picks a queue for me based on that? Like, let's say I want "a 15GB GPU or better" and there's 4 queues, tw

Answered

Is there any way to, like, load-balance automatically? Like, on the user end can I just specify an amount of GB I think I will need, and it goes and picks a queue for me based on that? Like, let's say I want "a 15GB GPU or better" and there's 4 queues, two of which fit the description... is there any way to set it so that ClearML will just queue it up on whichever one's available?

  				
Posted 
	3 years ago

SmallDeer34


Answers 9

We do have the paid tier, I believe. Anywhere we can go and read up some more on this stuff, btw?

  				
Posted 
	3 years ago

SmallDeer34

That answers it, I think!

  				
Posted 
	3 years ago

SmallDeer34

The easiest way would be to rename a queue to "1xgpu 16gb", then make sure only machines whose GPUs have 16GB or more listen to it.
Note that an agent can listen to multiple queues.
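As a rough sketch of how that could look for the "16GB or better" case: the layout below assumes a single machine where GPUs 0 and 1 have 16GB and GPUs 2 and 3 have 32GB, and assumes queues named "1xGPU_16gb" and "1xGPU_32gb" already exist. The GPU indices and queue names are illustrative assumptions, not something ClearML sets up for you.

```shell
# Assumed layout: GPUs 0,1 are 16GB cards, GPUs 2,3 are 32GB cards.
# Each daemon command runs in the foreground, so start each one in its own
# terminal (or under your process manager of choice).

# 16GB cards: only serve the 16GB queue.
clearml-agent daemon --queue "1xGPU_16gb" --gpus 0
clearml-agent daemon --queue "1xGPU_16gb" --gpus 1

# 32GB cards: serve the 32GB queue first, then fall back to the 16GB queue
# when the 32GB queue is empty.
clearml-agent daemon --queue "1xGPU_32gb" "1xGPU_16gb" --gpus 2
clearml-agent daemon --queue "1xGPU_32gb" "1xGPU_16gb" --gpus 3
```

With this arrangement, a Task pushed to "1xGPU_16gb" should be picked up by whichever of the four agents is free first, while a Task pushed to "1xGPU_32gb" only ever lands on a 32GB card.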

  				
Posted 
	3 years ago

AgitatedDove14

awesome! Thanks!

  				
Posted 
	3 years ago

SmallDeer34

Like, let's say I want "a 15GB GPU or better" and there's 4 queues, two of which fit the description... is there any way to set it so that ClearML will just queue it up on whichever one's available?

How do you know that? And if you do know that, what do you know about the queues?
Generally speaking, this type of granularity is what k8s gives you, but it has lots of caveats: you need to know what you need in terms of resources, you can specify resources that do not exist, and you can oversubscribe resources (i.e. starve processes).
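If the queues follow a naming convention, the "pick a queue from a GB requirement" part can be approximated on the user side. The helper below is a hypothetical sketch, not part of ClearML: `pick_queue` is an invented name, and it assumes every queue name ends in `_<N>gb` where N is the per-GPU memory. It only looks at names; it does not check whether a queue is busy or even exists on the server.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a ClearML command): print the smallest queue whose
# name advertises at least the requested GPU memory.
# Assumes queue names end in "_<N>gb", e.g. "1xGPU_16gb".
pick_queue() {
    local need_gb="$1"; shift
    local best="" best_gb=0 q gb
    for q in "$@"; do
        gb="${q##*_}"   # "1xGPU_16gb" -> "16gb"
        gb="${gb%gb}"   # "16gb" -> "16"
        # Skip names that do not follow the assumed convention.
        case "$gb" in ''|*[!0-9]*) continue ;; esac
        if [ "$gb" -ge "$need_gb" ] && { [ -z "$best" ] || [ "$gb" -lt "$best_gb" ]; }; then
            best="$q"; best_gb="$gb"
        fi
    done
    # Non-zero exit status when nothing is large enough.
    [ -n "$best" ] && printf '%s\n' "$best"
}
```

For example, `pick_queue 15 1xGPU_16gb 1xGPU_32gb` prints the 16GB queue name, which you can then pass to whatever you use to enqueue the Task. Asking for more memory than any listed queue offers prints nothing and returns a non-zero status, which covers the "resources that do not exist" caveat at least on the naming level.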

  				
Posted 
	3 years ago

AgitatedDove14

Then when I queue up a job on the 1x16gb queue it would run on one of the two GPUs?

  				
Posted 
	3 years ago

SmallDeer34

OK, so if I've got, like, 2x16GB GPUs ...

You could do:
clearml-agent daemon --queue "2xGPU_32gb" --gpus 0,1
Which will always use the two GPUs for every Task it pulls.

Or you could do:
clearml-agent daemon --queue "1xGPU_16gb" --gpus 0
clearml-agent daemon --queue "1xGPU_16gb" --gpus 1
Which will give you two agents, one per GPU (with 16GB per Task it runs).

Or
clearml-agent daemon --queue "2xGPU_32gb" "1xGPU_16gb" --gpus 0,1
Which will first pull Tasks from the "2xGPU_32gb" queue and, if that queue is empty, pull Tasks from "1xGPU_16gb". Notice that in both cases you will be using the two GPUs.

The paid tier includes dynamic-gpus support that allows the last example to actually allocate 1 or 2 gpus based on the queue the Task was pulled from.

Did that answer the question, or am I missing something?

  				
Posted 
	3 years ago

AgitatedDove14

Good question 🙂
https://clear.ml/docs/latest/docs/clearml_agent#dynamic-gpu-allocation

The latest updated help will always be here as well 🙂
clearml-agent daemon --help

  				
Posted 
	3 years ago

AgitatedDove14

OK, so if I've got, like, 2x16GB GPUs and 2x32GB I could allocate all the 16GB GPUs to one Queue? And all the 32GB ones to another?

  				
Posted 
	3 years ago

SmallDeer34
