I've finally gotten the Triton engine to run. I'll go through the NVIDIA Triton docs to find out how to make an inference request. If you have an example inference request, I'd appreciate it if you could share it with me.
I'm currently installing NVIDIA Docker on the machine where the agent resides. I was also getting an error about the GPU not being available in Docker, since the agent was running in docker mode. I'll share an update in a bit. I'm trying to re-run the whole setup.
Also, the tutorial mentions serving-engine-ip as a variable, but I have no idea what the IP of the serving engine is.
Hi Fawad, maybe this can help you get started! They're both C++ and Python examples of Triton inference. Be careful though: the pre- and postprocessing used is specific to the model (in this case YOLOv4), and you'll have to change it to your own model's needs.
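For the request itself, here is a minimal sketch of what a raw inference call can look like, assuming Triton's HTTP endpoint is reachable directly and speaks the standard v2 inference protocol (POST to /v2/models/<model>/infer). The host, model name, input name, shape, and data are all placeholders for illustration, port 8000 is Triton's default HTTP port and may be mapped differently in your setup, and the two helper functions are made up for this example. If another serving layer sits in front of Triton, the URL will likely differ.

```python
import json
import urllib.request


def build_infer_request(host, model_name, input_name, shape, datatype, data, port=8000):
    """Build the URL and JSON payload for a v2-protocol inference request.

    All arguments are placeholders: use the input name, shape, and datatype
    declared in your own model's config.
    """
    url = f"http://{host}:{port}/v2/models/{model_name}/infer"
    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                # Tensor values flattened in row-major order.
                "data": list(data),
            }
        ]
    }
    return url, json.dumps(body).encode("utf-8")


def send_infer_request(url, payload, timeout=10.0):
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values: replace with your serving engine's address and model details.
    url, payload = build_infer_request(
        host="localhost",
        model_name="my_model",
        input_name="input__0",
        shape=[1, 3],
        datatype="FP32",
        data=[0.1, 0.2, 0.3],
    )
    print(url)
    print(payload.decode("utf-8"))
    # Uncomment once the server is actually reachable:
    # print(send_infer_request(url, payload))
```

The model-specific preprocessing would go before build_infer_request (to produce the flattened data) and the postprocessing after send_infer_request (to interpret the "outputs" list in the response).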