but I can be wrong, give me 30 mins while I recreate the same local installation with the same chart so I can see if something is wrong
Hi JuicyFox94, thank you for the reply. Yes, it should be on the same cluster. My steps were:
- created cluster via kind as written in https://github.com/allegroai/clearml-helm-charts/tree/main/charts/clearml
helm install clearml-server allegroai/clearml
- Then I generated a new API key, downloaded the https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/values.yaml file and overrode these values:
# -- Reference to Api server url
apiServerUrlReference: " "
# -- Reference to File server url
fileServerUrlReference: " "
# -- Reference to Web server url
webServerUrlReference: " "
# -- Agent k8s Glue basic auth key
agentk8sglueKey: "RYKRKARI6PLDY11653OP"
# -- Agent k8s Glue basic auth secret
agentk8sglueSecret: "i7B8jaOfQawvQXM0VZdfylTKhn2n5EPogkoIscPT6aUZd0yMM7"
- install clearml agent via
helm install --values agent-values.yml clearml-agent allegroai/clearml-agent
Hi Tom, let's try to debug. Did you install all the charts in the same namespace? Did you generate a key/secret pair from the UI and then use them just in the agent and serving charts?
Sure
NAME            NAMESPACE  REVISION  UPDATED                                   STATUS    CHART                APP VERSION
clearml-agent   default    1         2022-10-24 13:18:32.846539589 +0200 CEST  deployed  clearml-agent-2.0.1  1.24
clearml-server  default    1         2022-10-24 12:57:44.160396437 +0200 CEST  deployed  clearml-4.3.0        1.7.0
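The same-namespace question can also be checked mechanically. Below is a minimal sketch, assuming the releases are listed in JSON form with `helm list -A -o json`; the function names are hypothetical, and the sample data simply mirrors the two releases in the listing above.

```python
import json


def namespaces_by_release(helm_list_json: str) -> dict:
    """Map each release name to its namespace, given `helm list -o json` output."""
    return {r["name"]: r["namespace"] for r in json.loads(helm_list_json)}


def same_namespace(helm_list_json: str, releases: list) -> bool:
    """True if all named releases live in one namespace (KeyError if one is missing)."""
    ns = namespaces_by_release(helm_list_json)
    return len({ns[name] for name in releases}) == 1


# Sample data (illustrative), mirroring the two releases listed above.
sample = json.dumps([
    {"name": "clearml-agent", "namespace": "default", "chart": "clearml-agent-2.0.1"},
    {"name": "clearml-server", "namespace": "default", "chart": "clearml-4.3.0"},
])

print(namespaces_by_release(sample))
print(same_namespace(sample, ["clearml-server", "clearml-agent"]))
```

Here both releases sit in `default`, so the check passes, which is consistent with nothing being found in the `clearml` namespace.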
Now getting: No resources found in clearml namespace.
you are right, it's default
NAME                             TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)            AGE
clearml-elastic-master           ClusterIP  10.96.188.164  <none>       9200/TCP,9300/TCP  25m
clearml-elastic-master-headless  ClusterIP  None           <none>       9200/TCP,9300/TCP  25m
clearml-server-apiserver         NodePort   10.96.73.213   <none>       8008:30008/TCP     25m
clearml-server-fileserver        NodePort   10.96.43.203   <none>       8081:30081/TCP     25m
clearml-server-mongodb           ClusterIP  10.96.230.210  <none>       27017/TCP          25m
clearml-server-redis-headless    ClusterIP  None           <none>       6379/TCP           25m
clearml-server-redis-master      ClusterIP  10.96.110.248  <none>       6379/TCP           25m
clearml-server-webserver         NodePort   10.96.43.251   <none>       80:30080/TCP       25m
kubernetes                       ClusterIP  10.96.0.1      <none>       443/TCP            25m
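For reference, the in-cluster addresses implied by this service list can be sketched as below. This is a minimal sketch, assuming the standard Kubernetes service DNS form `<service>.<namespace>.svc.cluster.local` (with the default `cluster.local` domain) and plain HTTP; the helper name `service_url` is hypothetical, while the service names and ports are taken from the listing above. Whether these exact values belong in the three `...UrlReference` fields depends on the setup, so treat them as candidates to verify rather than a confirmed fix.

```python
# Hypothetical helper: build cluster-internal URLs from Kubernetes service names.
# Service names and ports come from the service listing above; the http scheme
# and the cluster.local domain are assumptions.

def service_url(service: str, port: int, namespace: str = "default") -> str:
    """Return the cluster-internal URL for a service in the given namespace."""
    host = f"{service}.{namespace}.svc.cluster.local"
    return f"http://{host}:{port}"


urls = {
    "apiServerUrlReference": service_url("clearml-server-apiserver", 8008),
    "fileServerUrlReference": service_url("clearml-server-fileserver", 8081),
    "webServerUrlReference": service_url("clearml-server-webserver", 80),
}

# Print the candidates in the same key: "value" shape as the values file.
for key, value in urls.items():
    print(f'{key}: "{value}"')
```

The service ports (8008, 8081, 80) are used here rather than the NodePorts (30008, 30081, 30080), since the agent runs inside the cluster alongside the server.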
I guess apiServerUrlReference should be fixed
Checking the old yaml I used last week, I had the URLs correct, but I guess I misconfigured something else.
Sure, thank you very much! Working flawlessly now.
pls also fix fileServerUrlReference: and webServerUrlReference:
feel free to ping here if anything else is needed
sorry, but did you deploy in the clearml namespace?
Oh yes, fixed that and it's working! Thank you!
can you pls share the output of helm list in the clearml namespace?
Not sure if those are all necessary steps or I am missing some additional configuration