Hello! Question About
, but are you suggesting sending the requests to Triton frame-by-frame?
Yes! The Triton backend will do the batching automatically, and in an enterprise deployment the gRPC load balancer will split the requests across multiple GPU nodes 🙂
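For anyone wanting to try the frame-by-frame approach, here is a rough sketch using the `tritonclient` Python package over gRPC. The model name, tensor names, datatype, and URL are placeholders I made up for illustration, so swap in whatever your model's configuration actually declares. It also assumes dynamic batching is enabled for the model on the server side.

```python
import numpy as np

# Hypothetical names: replace with the values from your own model config.
MODEL_NAME = "detector"
INPUT_NAME = "input"
OUTPUT_NAME = "output"


def to_single_frame_batch(frame: np.ndarray) -> np.ndarray:
    """Add a leading batch dimension of 1 so each request carries one frame."""
    return np.expand_dims(frame.astype(np.float32), axis=0)


def infer_frame_by_frame(frames, url="localhost:8001"):
    """Send one request per frame and collect the outputs in order."""
    # Imported lazily so the helper above works without tritonclient installed.
    import tritonclient.grpc as grpcclient

    client = grpcclient.InferenceServerClient(url=url)
    results = []
    for frame in frames:
        batch = to_single_frame_batch(frame)
        inp = grpcclient.InferInput(INPUT_NAME, list(batch.shape), "FP32")
        inp.set_data_from_numpy(batch)
        response = client.infer(model_name=MODEL_NAME, inputs=[inp])
        results.append(response.as_numpy(OUTPUT_NAME))
    return results
```

One caveat: this loop is synchronous, so a single client only ever has one request in flight. The server can only group requests that arrive close together, which in practice means several clients running at once or a client that uses `async_infer` to keep multiple frames outstanding.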
one year ago