Thank you @<1523701083040387072:profile|UnevenDolphin73> , regarding the clearml.conf: I don’t think I have access to that config for the agent in the autoscaler, as I use a standard docker image. I tried to make changes in my local clearml.conf on my laptop, but that doesn’t seem to affect the “runner” in the autoscaler?
It seems that it's possible to provide some configuration parameters when creating the instance through the web UI, which seems to be a reasonable solution for secrets that you want to keep for a longer time.
I tried the following AMIs: ami-0a4f5a73cdd47fd59 and ami-0dacd2425b81201fb
Error: An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0a4f5a73cdd47fd59]' does not exist
Error: An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0dacd2425b81201fb]' does not exist
![image](https://clearml-web-assets.s3.amazonaws.com/scoold/images/TT9ATQXJ5-F04QMMDEDAB/image.png...
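One thing that may be worth checking for the InvalidAMIID.NotFound errors above: AMI IDs are region-specific, so an ID that exists in one region is reported as not existing in another. Below is a minimal sketch for checking an ID against a given region, assuming boto3 is installed and AWS credentials are configured. `ami_exists` is a hypothetical helper name (not part of ClearML or boto3), and the region in the usage comment is only an example.

```python
def ami_exists(ec2_client, image_id):
    """Return True if image_id is visible in the region ec2_client points at."""
    try:
        response = ec2_client.describe_images(ImageIds=[image_id])
    except Exception as exc:
        # botocore's ClientError carries the AWS error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code", "")
        if code in ("InvalidAMIID.NotFound", "InvalidAMIID.Malformed"):
            return False
        raise  # anything else (credentials, networking, ...) is a real failure
    return len(response.get("Images", [])) > 0


# Usage (assumption: the autoscaler is configured for us-east-1):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(ami_exists(ec2, "ami-0a4f5a73cdd47fd59"))
```

If this returns False for the region the autoscaler is configured with, picking an AMI from that same region would be the first thing to try.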
My next issue is creating the autoscaler via this script. The script runs through and I see a task which finishes successfully.
But I can't find the autoscaler anywhere in the web UI.
Also, this message suggests that I can change the configuration, but as said, I can't find it anywhere and wouldn't know how to change the configuration.
print("AWS Autoscaler setup wizard\n"
...
Not quite sure how to proceed. Any suggestion of what a working combination would look like would be appreciated. Also, those NVIDIA AMIs seem to be mainly for large GPU instances; I'm not sure whether it's possible to run them on a CPU instance as well? @<1523701205467926528:profile|AgitatedDove14>
Thanks @<1523701083040387072:profile|UnevenDolphin73> , I realized that with a specified AMI it works a bit better. I tried with this one: ami-0735c191cf914754d, which seems to be one of the standard AMIs. But in that case, too, the instance just freezes.
I also tried ami-0f1a5f5ada0e7da53, which is [ Amazon Linux 2 AMI (HVM) - Kernel 5.10, SSD Volume Type ]:
2023-02-23 22:10:16,862 - clearml.Auto-Scaler - INFO - Autoscaler started
2023-02-23 22:10:16,862 - clearml.Auto-Scaler ...
Okay, I see. Is the picture below what you are referring to? That experiment says it's completed; does that mean the autoscaler is running or not? To me it sounds like the startup of the service has completed, but I can't really tell whether the autoscaler is actually running. Also, I don't see any output in the autoscaler's console.
Okay great, I think I got it to work now with this AMI: ami-0c17f9e857dbd4c40 and with the python:3.10.10-bullseye docker image.
Unfortunately it does not support changing the configuration "live"
That's okay, that's not so important to me. I'm mainly interested in seeing how many autoscalers I currently have active and which ones they are. But in the application tab I only see the ones that I created online:
I don't seem to be able to track the ones that I created with that script. Am I misunderstanding something?
Okay, can I somehow query how many manually created or scripted autoscalers I have, and how would I delete them again? Is there a way to query the status, and potentially some console output, of those autoscalers?
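For reference, here is a rough sketch of how that kind of query might look with the ClearML Python SDK, on the assumption that a script-created autoscaler is just a regular task. The default project and task names ("DevOps", "AWS Auto-Scaler") are assumptions taken from the example script's defaults and may differ in a given setup; the helper names are hypothetical.

```python
def find_autoscaler_tasks(project_name="DevOps", task_name="AWS Auto-Scaler"):
    """Fetch candidate autoscaler tasks (names are assumptions, adjust as needed)."""
    from clearml import Task  # imported lazily so the helpers below work without it
    return Task.get_tasks(project_name=project_name, task_name=task_name)


def summarize(tasks):
    """Return (id, name, status) for each task, e.g. status 'in_progress'."""
    return [(t.id, t.name, t.get_status()) for t in tasks]


def last_console_lines(task, number_of_reports=1):
    """Return the most recent console output reported by the task."""
    return task.get_reported_console_output(number_of_reports=number_of_reports)


def stop_running(tasks):
    """Mark every still-running task as stopped and return the affected ids."""
    stopped = []
    for t in tasks:
        if t.get_status() == "in_progress":
            t.mark_stopped()
            stopped.append(t.id)
    return stopped


# Usage (requires a configured clearml.conf):
#   tasks = find_autoscaler_tasks()
#   for row in summarize(tasks):
#       print(row)
```

Counting the entries whose status is "in_progress" should give the number of active scripted autoscalers; whether stopping the task is enough to shut down the instances it launched is something I have not verified.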