Unanswered
Hi,
I have a small question regarding K8s clearml-serving behavior. I have in my cluster one GPU with 16 GB of RAM and another one with 24 GB of RAM. I have an LLM model that fits the 24 GB GPU but not the 16 GB GPU. When I call the endpoint, how will I know to which GPU I
The servingtaskid is linked to the Helm chart, which means that your solution would require creating multiple Kubernetes clusters according to our requirements, no?
5 months ago