When Running In

Answered

When running in cpu-only mode, is it possible to restrict the number of CPUs given to an agent, or does it take all the available CPUs on the machine? Similarly, is it possible to restrict RAM as well?

  				
Posted 5 years ago by PompousParrot44

Answers 14

I know it's not magic... it's all Linux subsystems underneath, just a matter of configuring them as needed 🙂 For now I think I will stick with the current setup of cpu-only mode and coordinate within the team. Later on, when the need comes, we will see whether we go for k8s or not.

  				
Posted 5 years ago by PompousParrot44

PompousParrot44, now that I think about it, you might be able to limit the CPU affinity. Would that help?

  				
Posted 5 years ago by AgitatedDove14
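
A minimal sketch of the CPU-affinity idea above, assuming a Linux host (`os.sched_setaffinity` is not available on macOS or Windows). The helper name `pin_to_cpus` is hypothetical and is not part of trains-agent; it only shows how a process can restrict itself to a subset of cores. On Linux the affinity mask is inherited by child processes, so anything launched afterwards stays on the same cores.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict process `pid` (0 means the current process) to the given CPU indices.

    Linux-only. Returns the affinity set in effect after the call.
    """
    allowed = os.sched_getaffinity(pid)
    # Only request CPUs this process is already allowed to run on.
    requested = set(cpus) & allowed
    if not requested:
        raise ValueError(f"none of {sorted(cpus)} are in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, requested)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    available = sorted(os.sched_getaffinity(0))
    # Example: keep only the first half of the CPUs currently available.
    first_half = available[: max(1, len(available) // 2)]
    print("now pinned to:", sorted(pin_to_cpus(first_half)))
```

Affinity only decides which cores a process may use; it does not limit RAM.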

Hi PompousParrot44
Well, this kind of control is tricky. If you don't mind processes "fighting over CPU", you can just spin up two trains-agents in cpu-only mode. It will work as long as they have different TRAINS_WORKER_NAME values.
The other option (might be a bit of overkill) is to use K8s, which will set the CPU % for the entire agent.
What do you think?

  				
Posted 5 years ago by AgitatedDove14
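
A rough sketch of the "two agents on one box" suggestion, combined with CPU pinning through the standard Linux `taskset` utility. Several things here are assumptions rather than anything confirmed in this thread: the helper name `agent_launch_spec`, the worker names, the `default` queue, and the exact `trains-agent daemon --queue ... --cpu-only` arguments (check `trains-agent daemon --help` for your version). The function only builds the command and environment; it does not start anything.

```python
import os
import shlex


def agent_launch_spec(worker_name, cpu_list, queue="default"):
    """Build (argv, env) for one CPU-only agent pinned to `cpu_list`.

    Nothing is executed here. The trains-agent arguments are an assumption;
    `taskset -c <list> <command>` is the standard util-linux syntax.
    """
    cpu_arg = ",".join(str(c) for c in cpu_list)
    argv = [
        "taskset", "-c", cpu_arg,            # restrict the agent (and its children) to these cores
        "trains-agent", "daemon",
        "--queue", queue,                    # hypothetical queue name
        "--cpu-only",
    ]
    # Each agent on the same machine needs its own worker name.
    env = dict(os.environ, TRAINS_WORKER_NAME=worker_name)
    return argv, env


if __name__ == "__main__":
    cpus = list(range(os.cpu_count() or 1))
    mid = max(1, len(cpus) // 2)
    halves = [cpus[:mid], cpus[mid:] or cpus[:mid]]
    for name, subset in zip(("box1:cpu-a", "box1:cpu-b"), halves):
        argv, env = agent_launch_spec(name, subset)
        print(f"TRAINS_WORKER_NAME={name} {shlex.join(argv)}")
        # To actually launch: subprocess.Popen(argv, env=env)
```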

The use case I have is to allow people on my team to run their workloads on a set of servers without stepping on each other.

So does that mean CPU-only workloads?
Also, are we worried about fairness (i.e. someone "taking" all the CPUs for themselves)?

  				
Posted 5 years ago by AgitatedDove14

Not really, the OS will almost never allow that; scheduling is actually based on fairness and priority. We can give all the agents the same low priority; then the OS will always get CPU when it needs it (most of the time it won't), and the agents will split the CPUs among themselves, so no one gets starved 🙂 With GPUs it is a different story: there is no actual context switching or fairness mechanism like there is on CPU.

  				
Posted 5 years ago by AgitatedDove14
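
One way to picture the "same low priority" idea is the Unix niceness value. This is a sketch assuming a Unix-like host; the helper name `lower_priority` is hypothetical and nothing here is a trains-agent feature. Niceness is inherited by child processes, and it only changes how CPU time is shared under contention; it is not a hard cap.

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment` (higher niceness = lower priority).

    Unix-only. Returns the new niceness. Unprivileged processes can raise
    their niceness but cannot lower it again.
    """
    return os.nice(increment)


# Example: call lower_priority() at the top of a training script, or before
# spawning workers, so that everything started afterwards runs at low priority.
```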

I guess I was not so clear, maybe. Say, e.g., you are running a LightGBM model training: by default it will take all the CPUs available on the box and run that many threads. Now another task gets scheduled on the same box, so you have 2x the threads with the same number of CPUs to schedule them on. So yes, the jobs will progress, but not at the same rate, because context switches will happen far more often than if we had allowed only half as many threads for each job.

  				
Posted 5 years ago by PompousParrot44
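
The oversubscription described above can also be reduced from inside the job by capping its thread count. A minimal sketch, assuming the team agrees on how many jobs share a box: `threads_per_job` is a hypothetical helper, while `num_threads` is LightGBM's own parameter and `OMP_NUM_THREADS` is the standard OpenMP variable. LightGBM itself is not imported, so the snippet runs without it.

```python
import os


def threads_per_job(concurrent_jobs, total_cpus=None):
    """Split the machine's CPUs evenly across `concurrent_jobs`, at least 1 thread each."""
    if concurrent_jobs < 1:
        raise ValueError("concurrent_jobs must be >= 1")
    if total_cpus is None:
        total_cpus = os.cpu_count() or 1
    return max(1, total_cpus // concurrent_jobs)


# Assume two jobs are expected to share this box.
n = threads_per_job(concurrent_jobs=2)

# Must be set before any OpenMP-based library is loaded in this process.
os.environ["OMP_NUM_THREADS"] = str(n)

# Parameters one would pass to lightgbm.train(); num_threads caps its worker threads.
lgb_params = {"objective": "regression", "num_threads": n}
```

This relies on every job cooperating; it does not stop a job that ignores the setting.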

PompousParrot44 I see what you mean; yes, frequent context switching might cause a bit of a decline in performance, not sure how much though... The alternative, of course, is to set CPU affinity... Anyhow, if you do get there we can try to come up with something that makes sense, but in the end there is no magic there 🙂

  				
Posted 5 years ago by AgitatedDove14

Not just fairness: the scheduled workloads will be starved of resources if, say, someone runs a training job that by default takes all the available CPUs.

  				
Posted 5 years ago by PompousParrot44

Thanks for your help, AgitatedDove14

  				
Posted 5 years ago by PompousParrot44

The use case I have is to allow people on my team to run their workloads on a set of servers without stepping on each other.

  				
Posted 5 years ago by PompousParrot44

Thanks... I was just wondering if I overlooked any config option for that, as cpu_set might be a possibility for CPU.

  				
Posted 5 years ago by PompousParrot44

I could use k8s, but that's a bit of overkill currently.

  				
Posted 5 years ago by PompousParrot44

PompousParrot44 with pleasure. If during your search for a solution you come across something that solves it and might integrate into the agent, do not hesitate to suggest it :)

  				
Posted 5 years ago by AgitatedDove14

I don't need this right away. I just wanted to know about the possibility of dividing the current machine into multiple workers. I guess if it's not readily available, then maybe you guys can discuss whether it makes sense to have it on the roadmap.

  				
Posted 5 years ago by PompousParrot44
