Oh I see, this seems like a Triton configuration issue; usually dim -1 means flexible. I can also mention that serving 1.1 should be released later this week with better multiple-input support for Triton. Does that make sense?
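To make the "dim -1 means flexible" point concrete, here is a minimal Python sketch. It is not Triton code; `shape_matches` is a hypothetical helper that only illustrates the idea of -1 acting as a wildcard for a dimension, so that a spec of [-1,60,1] accepts any first-dimension size:

```python
from typing import Sequence


def shape_matches(spec: Sequence[int], shape: Sequence[int]) -> bool:
    """Return True if `shape` satisfies `spec`; a -1 in `spec` matches any size."""
    if len(spec) != len(shape):
        return False
    return all(s == -1 or s == d for s, d in zip(spec, shape))


flexible_spec = [-1, 60, 1]  # first dimension left flexible
fixed_spec = [1, 60, 1]      # first dimension pinned to 1

print(shape_matches(flexible_spec, [10, 60, 1]))
print(shape_matches(flexible_spec, [23000, 60, 1]))
print(shape_matches(fixed_spec, [10, 60, 1]))
```

With the fixed spec only a first dimension of exactly 1 passes, which matches the behaviour described in this thread.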
MoodyCentipede68, is diagram 2 a batch processing workflow?
And does the Executor actually run something, or is it just IO?
Hello AgitatedDove14, based on the picture below, I think it's stream processing, not batch.
And the executor does the preprocessing and builds the data in the shape the model expects.
If this is the case, why not have the stream process call the REST API, then move forward with the result? This way it scales out of the box. The main "conceptual" difference is that the REST API is used internally, and the upside is that the event stream processing becomes part of the application layer and is not tied to the compute cost of the model. wdyt?
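A rough sketch of that idea in Python, under stated assumptions: the endpoint URL, the request schema (`{"x": ...}`), and the event fields (`id`, `window`) are all hypothetical placeholders, not taken from this thread. The stream consumer stays in the application layer and only calls the model over HTTP:

```python
import json
from typing import Callable, Iterable, Iterator
from urllib import request

# Hypothetical endpoint URL; replace with the actual serving endpoint.
ENDPOINT_URL = "http://localhost:8080/serve/my_model"


def post_json(payload: dict, url: str = ENDPOINT_URL) -> dict:
    """POST a JSON payload to the serving endpoint and return the decoded JSON reply."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def process_stream(
    events: Iterable[dict],
    infer: Callable[[dict], dict] = post_json,
) -> Iterator[dict]:
    """For each incoming event, call the model over REST and pass the result on."""
    for event in events:
        payload = {"x": event["window"]}  # assumed request schema
        result = infer(payload)           # the model's compute happens behind the REST API
        yield {"event_id": event["id"], "prediction": result}
```

Because `infer` is injectable, the stream logic can be exercised without a running server, e.g. `list(process_stream(events, infer=lambda p: {"y": 0}))`, and the serving side can be scaled independently of the stream consumers.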
Yes, that makes sense, thank you for your help AgitatedDove14
Actually AgitatedDove14, let me try to explain my problem more clearly.
When I serve my model with clearml-serving, the expected input size of my AI model is always [1,60,1]. What I need is for the model served by clearml-serving to accept the input size dynamically. Is there any way for the model to accept a dynamic input size (especially a dynamic first dimension), like [10,60,1] or [23000,60,1]?
Here are some diagrams to help me explain this.