Hi, I Have A Small Question Regarding K8S Clearml-Serving Behavior. I Have In My Cluster One Gpu Of 16Gb Ram, And Another One Of 24 Gb Ram. I Have A Llm Model Fitting The 24Gb But Not The 16Gb Gpu. When I Call The Endpoint, How Will I Know To Which Gpu I

Unanswered

Correct the serving Task ID is the clearml serving session. It is the instance that holds all the information of this specific setup and models

  				
Posted 
	one year ago

					More
				  		
  Report
		
					AgitatedDove14
				
					0
					 × 1

317 Views

0 Answers

one year ago