If the configurations and hyper params still appear properly in the task there's no need to rerun the wizard. just make sure you're using the updated trains repo
Hi FriendlySquid61 I did all the changes you said
now I get this error in my Auto Scaler taskWarning! exception occurred: An error occurred (AuthFailure) when calling the RunInstances operation: AWS was not able to validate the provided access credentials Retry in 15 seconds
Now I remind you that using the same credentials exactly, the auto scaler task could launch instances before
I doubled checked the credentials in the configurations, and they have full EC2 access
Hey WackyRabbit7 ,
Is this the only error you have there?
Can you verify the credentials in the task seem ok and that it didn't disappear as before?
Also, I understand that the Failed parsing task parameter ...
warnings no longer appear, correct?
and when looking at the running task, I still see the credentials
I have them in two different places, once under Hyperparameters -> General
and also in the extra_vm_bash_script
variables, I ahve them under export TRAINS_API_ACCESS_KEY
and export TRAINS_API_SECRET_KEY
Those are different credentials.
You should have the aws info under:cloud_credentials_key
, cloud_credentials_secret
and cloud_credentials_region
And the stuff added to the extra_vm_bash_script
are the trains key and secret from your profile page in the UI.
I suggest you use the wizard again to run the task, this will make sure all the data is where it should be.
no need to do it again, I ahve all the settings in place, I'm sure it's not a settings thing
So just to correct myself and sum up, the credentials for AWS are only in the cloud_credentials_*
So what's our next move FriendlySquid61 ? O_O
Searching this error it seems it could be many things.
Either wrong credentials or a wrong region (different than the one for your key-pair).
It could also be that your computer clock is wrong (see example https://github.com/mitchellh/vagrant-aws/issues/372#issuecomment-87429450 ).
I suggest you search it online and see if it solves the issue, I think it requires some debugging on your end.
Actually I removed the key pair, as you said it wasn't a must in the newer versions
can you tell me which API call exactly are you using for spinning up? I would like to debug and try to use boto3
myself in order to spin up an instance, so I can understand where the problem is coming from
Sure, we're using RunInstances
, you can see the call itself https://github.com/allegroai/trains/blob/master/trains/automation/aws_auto_scaler.py#L163
Actually I removed the key pair, as you said it wasn't a must in the newer versions
It isn't a must, but if you are using one, it should be in the same region