yea let me unwind some changes so I can pinpoint the issue
Figured this out, the value is parsed from my local clearml.conf file
err maybe not, I don't know where it's being fetched from
It will then parse the above information from my local workstation?
AgitatedDove14 note the missing brackets https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/templates/agentk8sglue-deployment.yaml#L22
Can you fix this, or should I open a PR? I'm blocked by this.
You guys are the maintainers of this repo
Seems like it's just missing the brackets
I don't know how to do that
I made the PR here JuicyFox94 AgitatedDove14 https://github.com/allegroai/clearml-helm-charts/pull/106
Yea that is a similar bug, needs the same fix
Yes! Thanks so much for the quick turnaround
I think the quotes don't affect the YAML
Do you want me to PR that fix?
For instance, quotes are used
No, I'm not tracking. I'm pretty new to k8s so this might be beyond my current knowledge. Maybe if I rephrase my goals it will make more sense. Essentially I want to enqueue an experiment, pick a queue (gpu), and have a GPU EC2 node provisioned in response. The experiment is then initialized and executed on that new GPU EC2 node. When the work is completed, I want the GPU EC2 node to terminate after x amount of time.
Also, how do I give the k8s glue agent permissions to spin EC2 nodes up and down?
Would I copy and paste this block to produce another queue and k8s glue agent?
Are you able to do a screenshare to discuss this? I'm not sure I understand the purpose of the k8s glue agent.
I got everything working using the default queue. I can submit an experiment, and a new GPU node is provisioned, all good
yes, I see in the UI how to create a new queue. How do I associate that queue with a nodeSelector though?