Hello! Question About


I see. What you should actually do is build a fully custom endpoint:

preprocessing -> download the video
processing -> extract frames and send them to Triton over gRPC (see below how)
postprocessing -> return a human-readable answer
Regarding the processing itself, what you need is to take this function (copy-paste):
None
Keep it as an internal _process(numpy_frame) method, and then write something along the lines of this pseudocode:

def process(...):
    results_batch = []
    for frame in my_video_frame_extractor(file_name_here):
        np_frame = np.array(frame)
        # submit returns immediately; the frame is processed in the background
        result = self.executor.submit(self._process, data=np_frame)
        results_batch.append(result)

        if len(results_batch) == BATCH_SIZE:
            # collect all the results back
            # and clear the batch
            results_batch = []

    # after the loop, collect whatever is left in a final partial batch

This lets the GPU pods scale horizontally, and it lets Triton auto-batch the inference requests 🙂
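
To make the batching loop concrete, here is a minimal runnable sketch of the same pattern. It assumes the executor behaves like a standard concurrent.futures.ThreadPoolExecutor, which the original does not confirm. fake_frame_extractor and the body of _process are hypothetical stand-ins (dummy frames and a plain sum instead of a real video reader and the Triton gRPC call), and BATCH_SIZE = 4 is an arbitrary value:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 4  # assumed value, tune for your model and GPU


def fake_frame_extractor(n_frames):
    """Hypothetical stand-in for a video frame extractor: yields dummy frames."""
    for i in range(n_frames):
        yield [i, i, i]


def _process(data):
    """Hypothetical stand-in for the per-frame inference call (e.g. Triton over gRPC)."""
    return sum(data)


def process(n_frames, executor):
    outputs = []
    results_batch = []
    for frame in fake_frame_extractor(n_frames):
        # submit() returns a Future right away, so frames are processed concurrently
        results_batch.append(executor.submit(_process, data=frame))

        if len(results_batch) == BATCH_SIZE:
            # collect all the results back (blocks until each is done, keeps frame order)
            outputs.extend(future.result() for future in results_batch)
            # and clear the batch
            results_batch = []

    # collect the final partial batch, if any
    outputs.extend(future.result() for future in results_batch)
    return outputs


with ThreadPoolExecutor(max_workers=BATCH_SIZE) as executor:
    results = process(10, executor)
```

Because up to BATCH_SIZE requests are in flight at the same time, the inference server has a chance to group them into one batch, while the caller only ever holds a bounded number of pending frames.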

  				
Posted one year ago

					AgitatedDove14
				