Hi AgitatedDove14
For example version control, A/B testing, shadow testing, rollback, etc.
We can use conventional solutions that we already use. Like Helm and Istio.
But, for example, are there any tools that give more insight into model performance in production, or offer more knobs and settings specific to AI?
Using common microservice deployment frameworks will be limiting.
I found some solutions (KubeFlow, Seldon Core, http://run.ai , MLFlow, …) but haven’t really had a chance to look at them yet.
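For illustration, here is a rough sketch of what A/B-style traffic splitting can look like with Istio alone (a DestinationRule defining the v1/v2 subsets is assumed; all names are placeholders, not from this setup):

```
# Istio VirtualService splitting traffic 90/10 between two model versions.
# Adding a `mirror:` destination to the same rule would give shadow testing.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: model-server
spec:
  hosts:
    - model-server            # the Kubernetes Service fronting the model pods
  http:
    - route:
        - destination:
            host: model-server
            subset: v1        # stable model version (subset from a DestinationRule)
          weight: 90
        - destination:
            host: model-server
            subset: v2        # candidate model version
          weight: 10
```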
Hey Martin.
I’m deploying to k8s and want to know if PVC is necessary for Redis.
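If persistence does turn out to be needed, the claim itself is just a standard PVC; a minimal sketch (name, size, and storage class are placeholders):

```
# PersistentVolumeClaim a Redis pod could mount for its data directory.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName: standard   # uncomment to pin a specific storage class
```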
So do I have to create a template yaml to be able to use this feature?
I changed them and tried some combinations to no avail...
I assume it should change the retry pattern too, but it doesn't. I feel I'm missing something obvious here.
Yes, a tuple is not valid. Like JSON, only scalars, strings, dicts, or arrays are allowed.
No, I tried many variations. I'm not sure whether it's reading the variable at all, because the wait times don't change either.
urllib3.exceptions.LocationParseError: Failed to parse: ' https://elastic:xxxxxxxxxxxxxxxxx@clearml-elasticsearch-es-http ', label empty or too long
[2021-05-11 13:35:54,816] [8] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 1 of 4. Waiting for 30sec ...
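The quoted URL in that traceback has a leading and a trailing space, which is likely what urllib3 is rejecting. If the value is injected via an env entry, it's worth checking it for stray whitespace; a sketch with a hypothetical variable name (not the actual ClearML setting):

```
# Hypothetical env entry on the apiserver Deployment; the only point is that
# the value must not carry leading/trailing spaces.
env:
  - name: ELASTIC_CONNECTION_STRING   # placeholder name
    value: "https://elastic:xxxxxxxxxxxxxxxxx@clearml-elasticsearch-es-http"
```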
I can have much more flexibility and security using Kubernetes native approaches. I can host multiple sessions behind a single LB with different host headers etc. A lot of possibilities. 🙂
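As a sketch of that host-header approach, a single Ingress (one LB) can fan out to per-session Services (all names below are placeholders):

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: sessions
spec:
  rules:
    - host: session-1.example.com     # routed purely by Host header
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: session-1
                port:
                  number: 8080
    - host: session-2.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: session-2
                port:
                  number: 8080
```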
How is it going to access the actual pod? Is it a headless service?
yep it was unrelated.. sorry
Thanks for your help
Also, there's no authentication, it seems.
Ah I think I understand it now. 🙂
There’s a static number of pods for which services are created…
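A sketch of that per-pod Service idea, assuming each pod carries a unique label (label key/value and port are placeholders):

```
# One Service per pod, selected by a label that is unique to that pod.
apiVersion: v1
kind: Service
metadata:
  name: session-0
spec:
  selector:
    session-id: "0"       # unique label set on the corresponding pod
  ports:
    - port: 8080
      targetPort: 8080
```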
It’s probably unrelated 😬 I’ll keep you posted.
No, apparently changing these variables causes something to fail.
It is indeed parsing the file, because invalid HOCON leads to an error, but nothing changes about the retry times etc.
There's no log of connecting to Elasticsearch now 😕
Does it mean it is okay?
The API server doesn't come up though; the readiness probe fails.
That wouldn't have crossed my mind, especially when it's not in the docs.
I think the whole project could be more cloud friendly. I spent a lot of time adapting it to our k8s environment, and I'm also willing to contribute. I think a roadmap should be created for more k8s integration, and then we can start. 🙂
It's still not connecting XD
but parsing is successful 😄
SuccessfulKoala55 I’m getting 405 on api calls with the configuration you proposed. (btw I think USER_KEY is right, with a single underscore)
SuccessfulKoala55 Hi Jake
We didn’t change anything related to gunicorn. Is there any specific thing I can check for?
Also, I noticed that it's not running gunicorn as a command but loads it in the Python code; I don't think it's possible to change the threading with an env variable that way.
I’m not using templates for k8s glue. I’m using the default operation mode which uses kubectl run. Should I use templates and specify a service in there to be able to connect to the pods?
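If templates are the way to go, the main thing they would buy here is a stable label on the spawned pods that a Service can select; a minimal sketch, assuming the glue accepts a pod-spec template YAML (the label and resource values are placeholders):

```
# Pod template/override for the k8s glue: attach a known label so that a
# separately created Service can select the session pods.
# The container image is intentionally omitted; this is only a
# label/resources override sketch.
apiVersion: v1
kind: Pod
metadata:
  labels:
    clearml-session: "true"   # placeholder label for the Service selector
spec:
  containers:
    - name: clearml-agent
      resources:
        limits:
          cpu: "2"
          memory: 4Gi
```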
I’m using the cloud-ready Helm chart (with some modifications).
So for api-server deployment:
