Thanks for the pointers! Yes, we are expecting an uphill struggle trying to get it to work haha. The output of a computer vision detection model is quite monstrous (especially since I do not understand all of it!). We will probably have to rely on a lot of code-borrowing from the YOLO repositories haha.
Well, that's what open source is for 😉 Code borrowing is like 90% of a software engineer's job 😄
That's a good idea! I think the YOLO models would be a great fit for a tutorial/example like this. We can add it to our internal list of TODOs, or if you want, you could take a stab at it and we'll try to support you through it 🙂 It might take some engineering though! Serving is never drag and drop 🙂
That said, I think it should be quite easy to do, since YOLOv8 supports exporting to TensorRT format, which is natively supported by Triton, the inference server that underlies ClearML Serving. So the process should be very similar to this blog post.
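For the export step, here is a rough sketch of what it might look like, assuming the `ultralytics` package is installed and the machine has an NVIDIA GPU with TensorRT available. The helper names `expected_engine_path` and `export_to_tensorrt` are made up for illustration; only `YOLO(...)` and `.export(format="engine")` come from the library.

```python
from pathlib import Path


def expected_engine_path(weights: str) -> Path:
    """Where the engine file should land: same location as the weights, `.engine` suffix."""
    return Path(weights).with_suffix(".engine")


def export_to_tensorrt(weights: str = "yolov8n.pt", imgsz: int = 640, half: bool = False) -> Path:
    """Export YOLOv8 weights to a TensorRT engine file.

    Assumption: `ultralytics`, an NVIDIA GPU and TensorRT are all present
    on the machine that runs the export.
    """
    # Lazy import so the rest of the module loads without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # format="engine" selects TensorRT; export() returns the path of the written file.
    exported = model.export(format="engine", imgsz=imgsz, half=half)
    return Path(exported)
```

Calling `export_to_tensorrt("yolov8n.pt")` should leave a `yolov8n.engine` file next to the weights. TensorRT engines are tied to the GPU architecture and TensorRT version they were built with, so it is probably safest to run the export on hardware that matches the serving machine.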