Everything is weird about this. I put two models in the same endpoint, then only one was running. Then I started another docker container with a different port number, and the curls to the new model endpoint (with the new port) started working
I put two models in the same endpoint, then only one was running,
without providing version number, you are overriding the models (because this is the same endpoint)
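To make the override point concrete, here is a minimal toy sketch (plain Python, not the clearml-serving implementation). The `registry` dict, the `add_model` helper and the `endpoint/version` key layout are all made up for illustration; the only thing taken from this thread is that adding a second model to the same endpoint without a version replaces the first one.

```python
# Toy model of an endpoint registry; NOT the clearml-serving implementation.
from typing import Dict, Optional

registry: Dict[str, str] = {}


def add_model(endpoint: str, model: str, version: Optional[str] = None) -> str:
    """Register a model and return the key it is stored under (hypothetical helper)."""
    key = f"{endpoint}/{version}" if version else endpoint
    registry[key] = model  # same key -> the previously registered model is replaced
    return key


# Two models, same endpoint, no version: the second replaces the first.
add_model("my_endpoint", "model_a")
add_model("my_endpoint", "model_b")
print(registry)

registry.clear()

# Same endpoint, distinct versions: both entries are kept.
add_model("my_endpoint", "model_a", version="1")
add_model("my_endpoint", "model_b", version="2")
print(registry)
```

In this sketch the version is simply part of the key, which is why giving each model its own version (or its own endpoint name) keeps both around.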
I started another docker container having a different port number and then the curls with the new model endpoint (with the new port) started working
Seems like misconfiguration on the first one?
which apparently I can't specify when I establish the model endpoint, but I need to re-compose the docker container by passing an env variable to it????
When you spin the model you can tell it any additional packages you might need
also random tasks are popping up in the DevOps project in the UI
Not random 🙂 these are the service instances (basically increased visibility into what's going on inside the serving containers)
ConvolutedSealion94 Let me try to explain how it works, I hope this will help in debugging.
There are two different entities here
Clearml-server: In this context clearml-server acts as a control plane. It stores the configuration of the different endpoints, models, preprocessing code, etc. It does not perform any compute or serving.
clearml-serving-inference ( https://github.com/allegroai/clearml-serving/blob/e09e6362147da84e042b3c615f167882a58b8ac7/docker/docker-compose-triton-gpu.yml#L77 ): This is the actual container that does the serving, and it can serve multiple models from different endpoints.
The docker-compose (or helm chart) is what spins the clearml-serving-inference containers. The design supports multiple different sets of clearml-serving-inference (i.e. each one can serve different sets of models, imagine different frameworks, or HW requirements etc.), so for each copy of clearml-serving-inference you need to specify which models it needs to serve. This is the clearml-serving session ID: the UID that points to the actual Task that stores the configuration for this specific clearml-serving-inference. You can have multiple instances of clearml-serving-inference for load balancing, but I will not get into that here.
Basically the CLI (i.e. the clearml-serving command line) configures the clearml-server (i.e. the control plane). It does not, however, spin the actual serving containers (clearml-serving-inference); it only configures them. In order to configure a specific clearml-serving-inference, the CLI needs to specify the correct "clearml-serving session ID" that this container was spun up with ( https://github.com/allegroai/clearml-serving/blob/e09e6362147da84e042b3c615f167882a58b8ac7/docker/example.env#L6 )
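If it helps, the split between the control plane and the serving containers can be sketched as a toy model. This is plain Python, not the ClearML API: `ControlPlane`, `InferenceContainer`, `cli_model_add`, `sync` and the session names are hypothetical and only illustrate the relationship described above.

```python
# Toy model of the control-plane / serving-container split; NOT the ClearML API.
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ControlPlane:
    """Stands in for clearml-server: stores configuration, serves nothing."""
    sessions: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def cli_model_add(self, session_id: str, endpoint: str, model: str) -> None:
        # What the CLI does in this sketch: write config under a session ID.
        self.sessions.setdefault(session_id, {})[endpoint] = model


@dataclass
class InferenceContainer:
    """Stands in for one clearml-serving-inference container."""
    session_id: str  # fixed when the container is spun up
    endpoints: Dict[str, str] = field(default_factory=dict)

    def sync(self, server: ControlPlane) -> None:
        # Pull only the configuration stored under this container's session ID.
        self.endpoints = dict(server.sessions.get(self.session_id, {}))


server = ControlPlane()
server.cli_model_add("session-A", "endpoint_1", "model_1")
server.cli_model_add("session-A", "endpoint_2", "model_2")

good = InferenceContainer(session_id="session-A")
stale = InferenceContainer(session_id="session-B")  # an ID the CLI never configured
good.sync(server)
stale.sync(server)

print(sorted(good.endpoints))   # both endpoints are served by the one container
print(sorted(stale.endpoints))  # nothing to serve
```

In this sketch a container whose session ID does not match the one the CLI configured ends up serving zero models, which is one way the symptoms in this thread could arise.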
Does this help ?
Hi ConvolutedSealion94
You can archive / delete the SERVING-CONTROL-PLANE Task from the DevOps project in the UI.
Do notice you will need to make sure the clearml-serving is updated with a new session ID, or remove it (i.e. take down the pods / docker-compose)
Make sense ?
Were you able to interact with the service that was spun up? (How was it spun up?)
also random tasks are popping up in the DevOps project in the UI
these are the service instances (basically increased visibility into what's going on inside the serving containers)
But these have: different task ids, same endpoints (from looking through the tabs)
So I am not sure why they are here and why not somewhere else
What does spin mean in this context?
This line: docker-compose --env-file example.env -f docker-compose-triton-gpu.yml up
But these have: different task ids, same endpoints (from looking through the tabs)
So I am not sure why they are here and why not somewhere else
You can safely ignore them for the time being 🙂
but is it true that I can have multiple models on the same docker instance with different endpoints?
Yes! This is exactly the idea (and again I'm not sure what's going on with the actual containers you have spun up, it seems they are not actually picking up the configuration and that they serve zero models?!)
and immediately complained about a package missing, which apparently I can't specify when I establish the model endpoint, but I need to re-compose the docker container by passing an env variable to it????
I put two models in the same endpoint, then only one was running,
Sorry I wanted to say "service id"
Same service-id but different endpoints
When you spin the model you can tell it any additional packages you might need
What does spin mean in this context?
clearml-serving ...
?
but is it true that I can have multiple models on the same docker instance with different endpoints?
I don't understand the link between service IDs, service tasks and docker containers