JuicyFox94 since I have you, the connection issue might be caused by the istio proxy. In order to disable the istio sidecar injection I must add an annotation to the pod.
https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/templates/agentk8sglue-configmap.yaml#L8
Unfortunately there does not seem to be any field for that in the values file.
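For reference, a minimal sketch of the annotation being discussed. Istio documents `sidecar.istio.io/inject: "false"` as the per-pod opt-out. The helper name `without_sidecar` and the sample template below are hypothetical and are not part of the chart; the snippet only shows where the annotation would sit in a pod template, assuming you can edit the template the agent uses.

```python
import copy
import json

# Istio's per-pod opt-out annotation for sidecar injection.
ISTIO_OPT_OUT = {"sidecar.istio.io/inject": "false"}


def without_sidecar(pod_template: dict) -> dict:
    """Return a copy of a pod template dict with the Istio opt-out annotation set."""
    patched = copy.deepcopy(pod_template)  # leave the caller's dict untouched
    metadata = patched.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    annotations.update(ISTIO_OPT_OUT)
    return patched


# Hypothetical, minimal pod template used only for illustration;
# the real template is the one rendered by the chart's configmap.
base_template = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"labels": {"app": "example"}},
    "spec": {"containers": []},
}

print(json.dumps(without_sidecar(base_template), indent=2))
```

Whether the chart exposes a values field for this depends on the chart version, so treat the above as the target shape of the pod metadata rather than a chart setting.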
yes that is possible but I do use istio for the clearml server components. I can move the agents to a separate namespace. I will try that
If I'm not wrong, you can simply label the namespace so that istio does not inject the sidecar there
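A rough sketch of the namespace-label approach, assuming the agents run in their own namespace. `istio-injection=disabled` is the namespace label Istio uses to skip automatic injection; the namespace name `clearml-agents` and both helper functions are made up for illustration.

```python
import json


def istio_disable_patch() -> dict:
    """Merge-patch body that marks a namespace as excluded from sidecar injection."""
    return {"metadata": {"labels": {"istio-injection": "disabled"}}}


def kubectl_label_command(namespace: str) -> str:
    """Equivalent kubectl command for labelling an existing namespace."""
    return f"kubectl label namespace {namespace} istio-injection=disabled --overwrite"


# "clearml-agents" is a hypothetical namespace name.
print(kubectl_label_command("clearml-agents"))
print(json.dumps(istio_disable_patch()))
```

The label only affects pods created after it is applied, so pods that already have a sidecar would need to be recreated.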
I will try to fix that. But what is the purpose of the 'k8s_scheduler' queue?
it's a queue used by the agent just for internal scheduling purposes
when a task starts, do you see a clearml-id-* pod starting?
So it seems it starts on the queue I specify and then it gets moved to the k8s_scheduler queue.
So the experiment starts with the status "Running" and then once moved to the k8s_scheduler queue it stays in "Pending"
but I think this behaviour will change in future releases
actually it does not, because the pod logs show .
at task completion, do you get the Completed state in the UI?
ok, I'll try to fix the connection issue. Thank you for the help