Nice! TrickySheep9 any chance you can share them?
but I'd prefer to have a new instance deployed for each new experiment and that it also terminates when no new experiments are queued
I'm not objecting, just wondering about the rationale behind the decision 🙂
Back to the AWS autoscaler:
Basically, if you have the services-agent running on your cluster, it will just run the aws-autoscaler for you 🙂
The idea of the services-agent is to run logic/monitoring Tasks such as the aws autoscaler. Notice that services-mode means multiple jobs per agent, as opposed to the default of one task per agent at any given time.
If you want, you can package the aws-autoscaler example inside a docker image and just spin it up; the clearml-agent dockerfile is a good starting point. wdyt?
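For reference, this is roughly what the "enqueue it on the services agent" route looks like from the SDK side (the project/task names below are placeholders, not taken from the example):
```python
# Sketch only: clone an existing AWS autoscaler Task and push it to the
# "services" queue, where a clearml-agent running with --services-mode
# picks it up alongside its other service jobs.
from clearml import Task

# Placeholder names - point these at whatever your autoscaler Task is called.
template = Task.get_task(project_name="DevOps", task_name="AWS Auto-Scaler")
copy = Task.clone(source_task=template, name="AWS Auto-Scaler service")
Task.enqueue(copy, queue_name="services")
```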
Hi AgitatedDove14, do you mean the k8s glue autoscaler here https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py ? If yes, I understood that this service deploys pods on the nodes in the cluster, but I'd prefer to have a new instance deployed for each new experiment, and that it also terminates when no new experiments are queued
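(For what it's worth, that per-experiment behaviour is what the aws-autoscaler example is for: it maps ClearML queues to EC2 instance types, spins an instance up when something is queued, and terminates it after an idle timeout. Very schematically it is configured along these lines - the field names here are illustrative only, the real schema is in the example script:)
```python
# Schematic illustration only - not the exact configuration schema of the
# aws_autoscaler.py example; the real script builds this from its wizard/args.
autoscaler_settings = {
    # one EC2 resource definition per worker "flavour" you want
    "resource_configurations": {
        "gpu_worker": {"instance_type": "g4dn.xlarge", "ami_id": "ami-xxxxxxxx"},
    },
    # map each ClearML queue to a resource and a maximum instance count
    "queues": {
        "default": [("gpu_worker", 2)],
    },
    # terminate instances after this many minutes with no queued tasks
    "max_idle_time_min": 15,
    "polling_interval_time_min": 5,
}
```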
This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Correct 🙂
AgitatedDove14 that seems like the best option. Once the aws autoscaler is inside a docker container I can deploy it inside a kube pod or a job. This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
I just run the k8s daemon with a simple helm chart and deploy it with Terraform via the helm provider. Nothing much to share as it's just a basic chart 🙂
I use a custom helm chart and the Terraform helm provider for these things
My goal is to automatically run the AWS Autoscaler task on a clearml-agent pod when I deploy
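Something along these lines could do that at deploy time - a minimal sketch, assuming the autoscaler Task already exists on the server and the pod has ClearML credentials (project/task names are placeholders):
```python
# Sketch: on pod start-up, enqueue the AWS Autoscaler Task into the "services"
# queue unless a copy of it is already queued or running.
from clearml import Task

PROJECT = "DevOps"              # placeholder project name
TASK_NAME = "AWS Auto-Scaler"   # placeholder task name

# Look for an autoscaler that is already queued/running so we don't start two.
active = Task.get_tasks(
    project_name=PROJECT,
    task_name=TASK_NAME,
    task_filter={"status": ["queued", "in_progress"]},
)

if not active:
    template = Task.get_task(project_name=PROJECT, task_name=TASK_NAME)
    clone = Task.clone(source_task=template, name=TASK_NAME)
    Task.enqueue(clone, queue_name="services")
```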
LovelyHamster1 this is very cool!
Quick question: if you are running on EKS, why not use the EKS autoscaling instead of the ClearML AWS EC2 autoscaling?