If the configurations and hyperparameters still appear properly in the task, there's no need to rerun the wizard. Just make sure you're using the updated trains repo.
Can you tell me exactly which API call you are using to spin up instances? I would like to debug this by spinning up an instance myself with boto3, so I can understand where the problem is coming from.
So what's our next move FriendlySquid61 ? O_O
I double-checked the credentials in the configurations, and they have full EC2 access.
Now I get this error in my Auto Scaler task:
Warning! exception occurred: An error occurred (AuthFailure) when calling the RunInstances operation: AWS was not able to validate the provided access credentials Retry in 15 seconds
Let me remind you that, using exactly the same credentials, the auto scaler task was able to launch instances before.
Update - New AWS credentials solved this issue.
Searching for this error, it seems it could be many things.
Either wrong credentials or a wrong region (different from the one for your key-pair).
It could also be that your computer clock is wrong (see example https://github.com/mitchellh/vagrant-aws/issues/372#issuecomment-87429450 ).
I suggest you search for it online and see if any of the fixes solve the issue. I think it requires some debugging on your end.
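To rule out the clock, here is a minimal sketch that compares the local time with the Date header of any HTTPS response. The function names are made up for illustration, and which URL you query is up to you. AWS rejects signed requests whose timestamp is too far from its own clock, so a skew of more than a few minutes is worth fixing.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.error import HTTPError
from urllib.request import urlopen


def clock_skew_seconds(date_header, local_now=None):
    """Seconds by which the local clock is ahead of the server's Date header."""
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


def fetch_date_header(url):
    """Return the Date header of an HTTP(S) response from `url`."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.headers["Date"]
    except HTTPError as err:
        # Error responses still carry a Date header, which is all we need.
        return err.headers["Date"]
```

Run it on the machine the autoscaler runs on, for example `clock_skew_seconds(fetch_date_header("https://aws.amazon.com"))` (the URL is only an example).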
FriendlySquid61
Just updating: I still haven't touched this. I did not account for the time it would take me to set up the auto scaling, so I must attend to other issues now. I hope to get back to this soon and make it work.
Make sure you're testing it on the same computer the autoscaler is running on
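One quick way to test the AWS key and secret on that machine is to ask STS who they belong to. This is only a sketch, assuming boto3 is installed; `who_am_i` and `describe_identity` are hypothetical helper names, not part of trains.

```python
def describe_identity(identity):
    """Format the parts of a GetCallerIdentity response worth eyeballing."""
    return "account={Account} arn={Arn}".format(**identity)


def who_am_i(key, secret, region):
    """Ask AWS STS which identity the given key/secret pair belongs to."""
    import boto3  # lazy import: only needed when actually calling AWS

    sts = boto3.client(
        "sts",
        aws_access_key_id=key,
        aws_secret_access_key=secret,
        region_name=region,
    )
    # Raises botocore.exceptions.ClientError if AWS rejects the credentials.
    return describe_identity(sts.get_caller_identity())
```

If this call fails too, the problem is with the credentials themselves (or the clock) rather than with EC2 permissions.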
and also in the extra_vm_bash_script variables, where I have them under export TRAINS_API_ACCESS_KEY and export TRAINS_API_SECRET_KEY
Hi FriendlySquid61, I made all the changes you suggested
No need to do it again, I have all the settings in place. I'm sure it's not a settings thing.
So just to correct myself and sum up, the credentials for AWS are only in the cloud_credentials_*
Actually I removed the key pair, as you said it wasn't a must in the newer versions
It isn't a must, but if you are using one, it should be in the same region
Sure, we're using RunInstances, you can see the call itself here: https://github.com/allegroai/trains/blob/master/trains/automation/aws_auto_scaler.py#L163
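If you want to try the same operation by hand, here is a rough sketch with boto3. It is not the auto scaler's code: the helper names are hypothetical, and the image ID and instance type are whatever you pass in. With DryRun=True, EC2 only checks whether the request would be allowed and never launches anything.

```python
def build_run_instances_kwargs(image_id, instance_type, dry_run=True):
    """Arguments for a single-instance EC2 RunInstances request."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "DryRun": dry_run,
    }


def try_run_instances(key, secret, region, image_id, instance_type):
    """Send a dry-run RunInstances request and return the AWS error code."""
    import boto3  # lazy imports: only needed when actually calling AWS
    from botocore.exceptions import ClientError

    ec2 = boto3.client(
        "ec2",
        aws_access_key_id=key,
        aws_secret_access_key=secret,
        region_name=region,
    )
    try:
        ec2.run_instances(**build_run_instances_kwargs(image_id, instance_type))
    except ClientError as err:
        # "DryRunOperation" means the request would have succeeded;
        # "AuthFailure" or "UnauthorizedOperation" point at credentials or permissions.
        return err.response["Error"]["Code"]
    return None  # not reached for a dry run, which always reports via an error code
```

Getting AuthFailure here as well would suggest the problem is in the credentials or region, not in the autoscaler task.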
Those are different credentials.
You should have the AWS info under: cloud_credentials_key, cloud_credentials_secret and cloud_credentials_region
And the stuff added to the extra_vm_bash_script are the trains key and secret from your profile page in the UI.
I suggest you use the wizard again to run the task, this will make sure all the data is where it should be.
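To sum up which value goes where, here is a small sanity check you could run against your own settings. It is only a sketch: `find_missing_settings` is a hypothetical helper, and it assumes you pass the hyperparameters in as a plain dict and the bash script as a string.

```python
# AWS credentials live in the task's hyperparameters.
AWS_FIELDS = (
    "cloud_credentials_key",
    "cloud_credentials_secret",
    "cloud_credentials_region",
)
# The trains key and secret are exported inside extra_vm_bash_script.
TRAINS_EXPORTS = ("TRAINS_API_ACCESS_KEY", "TRAINS_API_SECRET_KEY")


def find_missing_settings(hyper_params, extra_vm_bash_script):
    """Return the names of AWS fields and trains exports that are not set."""
    missing = [name for name in AWS_FIELDS if not hyper_params.get(name)]
    for var in TRAINS_EXPORTS:
        if f"export {var}=" not in extra_vm_bash_script:
            missing.append(var)
    return missing
```

An empty list means both sets of credentials are at least present; it says nothing about whether they are valid.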
I have them in two different places, once under Hyperparameters -> General
Hey WackyRabbit7 ,
Is this the only error you have there?
Can you verify that the credentials in the task look OK and that they didn't disappear as before?
Also, I understand that the Failed parsing task parameter ... warnings no longer appear, correct?
and when looking at the running task, I still see the credentials