Hello! Question About
I see. What you actually want is a fully custom endpoint with three stages:
- preprocessing -> download the video
- processing -> extract frames and send them to Triton over gRPC (see below for how)
- post-processing -> return a human-readable answer
Regarding the processing itself, what you need is to take this function (copy-paste):
None
Have it as internal_process(numpy_frame),
and then have something along the lines of this pseudocode:
def process(...):
    results_batch = []
    for frame in my_video_frame_extractor(file_name_here):
        np_frame = np.array(frame)
        # submit each frame without waiting, so requests can overlap
        result = self.executor.submit(self._process, data=np_frame)
        results_batch += [result]
        if len(results_batch) == BATCH_SIZE:
            # collect all the results back
            # and clear the batch
            results_batch = []
    # after the loop, also collect whatever is left in a partial last batch
This will scale the GPU pods horizontally, as well as auto-batch the inference 🙂
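To make the submit-and-collect pattern concrete, here is a minimal runnable sketch. It assumes the executor is a standard `concurrent.futures.ThreadPoolExecutor`. The names `fake_frame_extractor` and `internal_process` are hypothetical stand-ins (tiny nested lists instead of real video frames, a pixel sum instead of a Triton gRPC request), and `BATCH_SIZE = 4` is an arbitrary value, so treat it as an illustration rather than the actual endpoint code.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 4  # assumed value; tune it for your model and GPU


def fake_frame_extractor(num_frames):
    """Hypothetical stand-in for a video frame extractor: yields tiny 2x2 'frames'."""
    for i in range(num_frames):
        yield [[i, i], [i, i]]


def internal_process(frame):
    """Hypothetical stand-in for the per-frame inference request; returns the pixel sum."""
    return sum(sum(row) for row in frame)


def process(num_frames, executor):
    results = []
    pending = []  # futures for requests that are still in flight
    for frame in fake_frame_extractor(num_frames):
        # submit() returns immediately, so up to BATCH_SIZE requests overlap
        pending.append(executor.submit(internal_process, frame))
        if len(pending) == BATCH_SIZE:
            # collect all the results back, in submission order
            results.extend(future.result() for future in pending)
            pending = []
    # flush the last, possibly partial, batch
    results.extend(future.result() for future in pending)
    return results


with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
    outputs = process(10, pool)
```

Collecting in submission order keeps the results aligned with the frame order, and the final flush matters whenever the number of frames is not a multiple of `BATCH_SIZE`.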