Hey ClearML community. A while back I was asking how one can perform inference on a video with clearml-serving, which includes an ensemble, preprocessing, and postprocessing.

Back then @<1523701205467926528:profile|AgitatedDove14> suggested that we override the process() function as well: copy the original process() implementation (linked here), rename it _process(), and send each frame to the model individually and asynchronously, awaiting the results once every batch_size frames.
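For context, here is roughly what that suggestion looks like. This is only a sketch of my understanding: the class/method names (Preprocess, _process) and the dummy per-frame "model" call are assumptions, not the actual clearml-serving code.

```python
import asyncio

class Preprocess:
    """Sketch of the suggested override: fan frames out asynchronously."""
    batch_size = 8

    async def _process(self, frame):
        # Stand-in for the copied per-frame inference call; in the real
        # setup this would forward a single frame to the model.
        return frame * 2

    async def process(self, frames):
        results, pending = [], []
        for frame in frames:
            # Dispatch each frame without waiting for the previous one.
            pending.append(asyncio.create_task(self._process(frame)))
            if len(pending) == self.batch_size:
                # Await once per batch_size frames, as suggested.
                results.extend(await asyncio.gather(*pending))
                pending = []
        if pending:
            results.extend(await asyncio.gather(*pending))
        return results
```

Calling asyncio.run(Preprocess().process(frames)) then yields one result per frame, awaited in batch_size chunks.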

However, we’ve come across some serious performance issues compared to setting this up on vanilla Triton.

I’m not entirely sure why, but the gRPC client setup I’ve seen in the examples is different from the one used in clearml-serving. For instance, each frame (image) takes ~2 seconds just to flatten() (link).

Overall, inference on ClearML takes ~16 seconds for a single batch (size=8) with the above approach, versus only ~0.2 s on plain Triton. GPU usage is also substantially lower and more intermittent on the ClearML side.

We’d like to keep contributing to and even improve this community; I just wanted to bring this up, brainstorm, and hear any insights others might have. Thanks!

Posted 8 months ago

Answers 2

This is basically what I follow for setting up my own Triton server:


Posted 8 months ago

This is the gist of our current setup using the recommended approach.

Posted 8 months ago