I was trying to test the autoscaler feature, but I am getting the following error:
2022-10-21 02:06:43,599 - clearml.Auto-Scaler - INFO - Failed to start new instance (resource 'DefaultGPUInstance'), Error: Parameter validation failed: Invalid type for parameter ImageId, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
No idea what the ImageId actually is.

Solved by removing the parts of clearml.conf that only repeat default values.

Now I am seeing strange behavior: with 2 tasks in the queue, the autoscaler starts two EC2 instances and then shuts them down without running the tasks, then starts two new instances, and this repeats in a loop.

I used the autogenerated clearml.conf; I will try erasing the unnecessary parts.

Hi SkinnyPanda43

No idea what the ImageId actually is.

That's the AMI image string that the new EC2 instance will be started with. Does that make sense?
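
To illustrate the point about ImageId, here is a minimal sketch of a check you could run on the value before the autoscaler tries to launch anything. The function name `image_id_problem` is hypothetical, and the pattern assumes AMI IDs look like `ami-` followed by 8 or 17 hex characters; it is not part of ClearML or boto3.

```python
import re
from typing import Optional

# Assumption: AMI IDs are "ami-" plus 8 or 17 lowercase hex characters.
AMI_ID_PATTERN = re.compile(r"^ami-(?:[0-9a-f]{8}|[0-9a-f]{17})$")


def image_id_problem(image_id: object) -> Optional[str]:
    """Return a description of what is wrong with image_id, or None if it looks fine."""
    if image_id is None:
        # This is the case from the log above: no AMI was configured.
        return "ImageId is None: no AMI was configured for this resource"
    if not isinstance(image_id, str):
        return f"ImageId must be a str, got {type(image_id).__name__}"
    if not AMI_ID_PATTERN.match(image_id):
        return f"ImageId {image_id!r} does not look like an AMI ID"
    return None
```

A value of `None` here corresponds to the "Invalid type for parameter ImageId" error in the log, so catching it early gives a clearer message than the parameter validation failure.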

Yes, I suspect it is too large 😞
Notice that most parts have default values, so there is no need to specify them.

SkinnyPanda43 could it be that the clearml.conf is too large? How come it exceeds 16 KB?
Any hint on how you start the AWS autoscaler?

Can you share the log?

Thank you. I have defined the AMI manually instead of using the default, and now I am getting the following error:

Error: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes
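
A quick way to see whether a payload would hit this limit is to measure its size in bytes. The sketch below assumes the autoscaler passes the clearml.conf contents along in the instance user data (which is what the discussion in this thread suggests) and that the 16384-byte limit applies to the raw text; the function names are hypothetical helpers, not ClearML or AWS APIs.

```python
# Limit quoted in the RunInstances error message above.
USER_DATA_LIMIT_BYTES = 16384


def user_data_size(text: str) -> int:
    """Return the size of the payload in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_user_data_limit(text: str, limit: int = USER_DATA_LIMIT_BYTES) -> bool:
    """Return True if the payload is at or below the given byte limit."""
    return user_data_size(text) <= limit


# Example usage (the path is an assumption; adjust to where your config lives):
# from pathlib import Path
# conf_text = Path("~/clearml.conf").expanduser().read_text(encoding="utf-8")
# print(user_data_size(conf_text), fits_user_data_limit(conf_text))
```

If the config alone is close to or above the limit, trimming the sections that only restate default values is the likely fix, since the startup script has to fit in the same budget.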

