Hello Channel, I Have A Question Regarding Clearml Serving In Production. I Have Different Environments, And Different Models Each Of Them Linked To A Use Case. I Would Like To Spin Up One Kubernetes Cluster (From Triton Gpu Docker Compose) Taking Into

Unanswered

To be honest, I'm not completely sure as I've never tried hundreds of endpoints myself. In theory, yes it should be possible, Triton, FastAPI and Intel OneAPI (ClearML building blocks) all claim they can handle that kind of load, but again, I've not tested it myself.

To answer the second question, yes! You can basically use the "type" of model to decide where it should be run. You always have the custom model option if you want to run it yourself too 🙂

  				
Posted 
	one year ago

					More  		
  Report
		
					ExasperatedCrab78
				
					0
					 × 1

243 Views

0 Answers

one year ago