and then use them in agents if they are external
In this case I suggest giving k8s-glue a try; it is there by default in the latest chart version
not urgent after we used the workaround
Hey ApprehensiveSeahorse83, I didn’t forget about you, it’s just a busy time for me; I will answer during the day after a couple more tests on my testing env.
regardless of this, I probably need to add some more detailed explanations of the credentials configs
still need time because I have two very busy days
for now we used a fixed number of cpu agents, but it would be better if it were dynamic with the glue agent
but I will try to find something good for you
Not sure I understand. Are you saying I should not create user credentials and add them in values.yaml at secret.credentials.apiserver and secret.credentials.tests?
I’m going to investigate this specific use case and will get back to you
Hi ApprehensiveSeahorse83, today we released the clearml-agent chart, which just installs the glue agent. My suggestion is to disable k8s glue and any other agent in the clearml chart and install more than one clearml-agent chart in different namespaces. This way you will be able to have a k8s glue for every queue (cpu and gpu).
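A rough sketch of that multi-install idea, in case it helps: it only builds the `helm install` command lines, one release and one namespace per queue. The repo alias `allegroai`, the values key `agentk8sglue.queue`, and the release/namespace naming are assumptions, not something confirmed in this thread, so please check them against the chart's values.yaml before running anything.

```python
from typing import List

# Assumptions (not confirmed in this thread): the Helm repo alias "allegroai"
# and the values key "agentk8sglue.queue". Verify both in the chart's values.yaml.
CHART = "allegroai/clearml-agent"
QUEUE_KEY = "agentk8sglue.queue"


def helm_install_cmd(queue: str) -> List[str]:
    """Build a `helm install` command for one glue agent serving one queue."""
    # Hypothetical naming scheme: one release and one namespace per queue.
    release = f"clearml-agent-{queue}"
    namespace = f"clearml-agent-{queue}"
    return [
        "helm", "install", release, CHART,
        "--namespace", namespace, "--create-namespace",
        "--set", f"{QUEUE_KEY}={queue}",
    ]


if __name__ == "__main__":
    # One glue agent per queue, as suggested above (cpu and gpu).
    for queue in ("cpu", "gpu"):
        print(" ".join(helm_install_cmd(queue)))
```

The agent credentials are left out on purpose here; they would go in a values file or a secret rather than on the command line.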
Interesting use case, maybe we can create multiple k8s agents for different queues
OK, I’d like to test it more with you. The credentials exposed in the chart values are system ones and it’s better not to change them; let’s forget about them for now. If you create a new accesskey/secretkey pair in the UI, you should use that pair in your agents and it should not get overwritten in any way. Can you confirm it works without touching the credentials section?
if they are in kubernetes you can simply use k8s glue
but the system account key and secret can’t be the same for every installation, no? I need to generate a specific one for my installation, no?
since the gpu is expensive we want the glue to manage the pods
we are already using glue to manage our gpu pods. The agents we use for the pipelines are simple cpu agents.
can we use multiple k8s-glue agents, one for cpu pods and one for gpu pods?
For now we used a workaround: we forked the helm charts repo and changed the agents' deployment.yaml so that, instead of taking the key and secret from the clearml-conf secret, it takes them from another secret we created. This way the server does not “know” about the new key and secret and does not reset them
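For anyone trying the same workaround, here is a minimal sketch of the env entries such a patched deployment could carry, built as plain Python dicts. `secretKeyRef` is the standard Kubernetes way to read a container env var from a Secret, and `CLEARML_API_ACCESS_KEY` / `CLEARML_API_SECRET_KEY` are the env vars the agent reads for credentials; the secret name `my-agent-credentials` and the data keys `access_key` / `secret_key` are hypothetical, and this is not the actual diff from our fork.

```python
import json
from typing import Any, Dict, List


def agent_credential_env(secret_name: str) -> List[Dict[str, Any]]:
    """Return container env entries that read the agent key/secret from a Secret."""

    def from_secret(env_name: str, secret_key: str) -> Dict[str, Any]:
        # Standard Kubernetes env-from-secret structure.
        return {
            "name": env_name,
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": secret_key}},
        }

    return [
        # "access_key" and "secret_key" are hypothetical data keys inside the Secret.
        from_secret("CLEARML_API_ACCESS_KEY", "access_key"),
        from_secret("CLEARML_API_SECRET_KEY", "secret_key"),
    ]


if __name__ == "__main__":
    # "my-agent-credentials" is a hypothetical Secret created outside the chart.
    print(json.dumps(agent_credential_env("my-agent-credentials"), indent=2))
```

Because the Secret is created outside the chart, the chart's own credentials section never sees the pair.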