Hi, I have a small question regarding K8s clearml-serving behavior. I have in my cluster one GPU with 16 GB of RAM and another one with 24 GB of RAM. I have an LLM that fits on the 24 GB GPU but not on the 16 GB GPU. When I call the endpoint, how will I know to which GPU I

Unanswered

Hey @AgitatedDove14, thank you for your input.
Could you clarify what you mean by a clearml-serving session?

Are you referring to the servingTaskId?

  				
Posted one year ago

		
SuccessfulRaven86
				