Hi ObedientToad56 🙂
My question is about what the deployment would look like once we have verified that the endpoints are working in a local container.
Isn't the deployment just running the inference container? You just open up the endpoints towards wherever you want to serve, no?
Hi ObedientToad56
Yes, you are correct. Basically, right now you have a docker-compose that spins up everything, although, as in the example, you can also spin up a standalone container (mostly for debugging).
We are working on a k8s Helm chart to make deployment easier; it will be based on these docker-compose files:
https://github.com/allegroai/clearml-serving/blob/main/docker/docker-compose.yml
https://github.com/allegroai/clearml-serving/blob/main/docker/docker-compose-triton-gpu.yml
Hmm, I was speaking from a production point of view. I thought there would be some hooks for deployment, where the integration with k8s was also taken care of automatically.
AFAIK, I have to create a deployment of this container and add an ingress on top of it. In the architecture diagram on GitHub, this seems to be something that is already baked in, which is what caused the confusion. Curious to know your thoughts on this.
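
For illustration only, here is a rough sketch of what "a deployment of this container plus an ingress on top" could look like when done by hand. It is not an official ClearML manifest: the image name, port, and host are placeholders (assumptions), and the environment variables and credentials that the docker-compose file passes to the container are omitted. It builds plain Kubernetes Deployment, Service, and Ingress objects and prints them as JSON.

```python
import json

# All three values below are hypothetical placeholders, not taken from the thread.
IMAGE = "example/inference-container:latest"  # swap in the inference image from docker-compose
PORT = 8080                                   # assumed HTTP port of the serving container
HOST = "serving.example.com"                  # assumed external hostname


def build_manifests(name="inference", image=IMAGE, port=PORT, host=HOST):
    """Return [Deployment, Service, Ingress] as plain dicts."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }
    # The Ingress routes to a Service, which in turn selects the Deployment's pods.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": port, "targetPort": port}]},
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {"name": name, "port": {"number": port}}
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }
    return [deployment, service, ingress]


if __name__ == "__main__":
    # Wrap the objects in a v1 List so they can be applied in one go.
    bundle = {"apiVersion": "v1", "kind": "List", "items": build_manifests()}
    print(json.dumps(bundle, indent=2))
```

If saved as, say, `manifests.py` (a hypothetical filename), the output could be piped into `kubectl apply -f -`, since kubectl accepts JSON as well as YAML.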