The second seems like a botocore issue:
https://github.com/boto/botocore/issues/2187
AgitatedDove14 This seems to be consistent even if I specify the absolute path to /home/user/trains.conf
region is empty, I never entered it and it worked
JitteryCoyote63 what am I missing?
What are the errors you are getting (with / without the envs)?
but I also make sure to write the trains.conf to the root directory in this bash script:
echo "
sdk.aws.s3.key = ***
sdk.aws.s3.secret = ***
" > ~/trains.conf
...
python3 -m trains_agent --config-file "~/trains.conf" ...
I'll try to pass these values using the env vars
JitteryCoyote63 when the agent is running a job, it prints its configuration at the beginning. Do you see the correct credentials there? (You will not see the secret, but you will see the access key.)
without the envs, I had the error: ValueError: Could not get access credentials for 's3://my-bucket' , check configuration file ~/trains.conf
After using envs, I got error: ImportError: cannot import name 'IPV6_ADDRZ_RE' from 'urllib3.util.url'
AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION
(btw, yes, I adapted it to use Task.init(..., output_uri=...))
File "devops/valid.py", line 80, in <module>
  valid(parse_args)
File "devops/valid.py", line 41, in valid
  valid_task.output_uri = args.artifacts
File "/data/.trains/venvs-builds/3.6/lib/python3.6/site-packages/trains/task.py", line 695, in output_uri
  ", check configuration file ~/trains.conf".format(value))
ValueError: Could not get access credentials for 's3://ml-artefacts' , check configuration file ~/trains.conf
I'm so glad you mentioned the cron job, it would have taken us hours to figure it out
Import Error sounds so out of place it should not be a problem :)
AgitatedDove14 That's a good point: The experiment failing with this error does show the correct aws key:... sdk.aws.s3.key = ***** sdk.aws.s3.region = ...
So most likely trains was masking the original error, it might be worth investigating to help other users in the future
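To illustrate what "masking the original error" can look like, here is a minimal sketch. It is not the trains implementation: `CredentialsError`, `load_s3_driver`, and both `check_access_*` functions are hypothetical names made up for this example. It only shows how a broad `except` can hide an unrelated ImportError behind a credentials message, and how exception chaining keeps the root cause visible.

```python
class CredentialsError(ValueError):
    """Hypothetical error type for a failed credentials check."""


def load_s3_driver():
    # Stand-in for loading the storage backend; simulates the broken import
    # reported in the thread.
    raise ImportError("cannot import name 'IPV6_ADDRZ_RE' from 'urllib3.util.url'")


def check_access_masked(uri):
    try:
        load_s3_driver()
    except Exception:
        # The root cause is dropped: the user only sees a credentials message.
        raise CredentialsError(
            "Could not get access credentials for '{}'".format(uri)
        ) from None


def check_access_chained(uri):
    try:
        load_s3_driver()
    except Exception as exc:
        # "raise ... from exc" keeps the original error attached as __cause__,
        # so the traceback shows both the ImportError and the credentials error.
        raise CredentialsError(
            "Could not get access credentials for '{}'".format(uri)
        ) from exc
```

With the chained variant, the traceback would show the ImportError first, which is the kind of hint that would have pointed at urllib3 rather than at trains.conf.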
What's the exact error you are getting ?
(Maybe this is a privilege error on the cache folder. Which folders is it using? You can see them in the configuration as well.)
JitteryCoyote63 see if upgrading the packages as they suggest somehow fixes it.
I have the feeling this is the same problem (the first error might be trains masking the original error)
Yes, hopefully they have a different exception type so we could differentiate ... :) I'll check
So the problem comes when I do my_task.output_uri = "s3://my-bucket": trains, in the background, checks whether it has access to this bucket, and it is not able to find/read the creds
JitteryCoyote63 are you calling my_task.output_uri = "s3://my-bucket" in the code itself?
Why not with Task.init(output_uri=...)?
Also, since this is running remotely, there is no need for that: use Execution -> Output -> Destination and put it there, it will do everything for you 🙂
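For reference, passing the destination at init time could look roughly like the sketch below. The wrapper function `init_task_with_output` and its argument names are only for illustration; the project name, task name, and bucket are placeholders, and it assumes the `trains` package is installed with working S3 credentials.

```python
def init_task_with_output(project_name, task_name, bucket_uri):
    # Imported lazily so this sketch can be loaded without trains installed.
    from trains import Task

    # Passing output_uri here replaces a later
    # my_task.output_uri = "s3://..." assignment in the script.
    return Task.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=bucket_uri,
    )
```

A call would then be something like init_task_with_output("my project", "validation", "s3://my-bucket"), with all three values being placeholders.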
AgitatedDove14 Yes exactly, I tried the fix suggested in the github issue urllib3>=1.25.4
and the ImportError disappeared 🙂
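A small check like the following could flag an old urllib3 before a job starts. It is only a sketch: `version_tuple` and `urllib3_is_recent_enough` are helper names invented here, and the minimum version is the one quoted from the GitHub issue above.

```python
import re

MIN_URLLIB3 = (1, 25, 4)  # minimum version mentioned in the thread


def version_tuple(version):
    """Turn '1.25.4' into (1, 25, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def urllib3_is_recent_enough(version, minimum=MIN_URLLIB3):
    """Return True if the given version string is at least the minimum."""
    return version_tuple(version) >= minimum


if __name__ == "__main__":
    try:
        import urllib3
        print(urllib3.__version__, urllib3_is_recent_enough(urllib3.__version__))
    except ImportError:
        print("urllib3 is not installed")
```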
I will probably just use an absolute path everywhere, to be robust against different machine user accounts: /home/user/trains.conf
That sounds like good practice
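The reason an absolute path is more robust: "~" resolves to the home directory of whichever user runs the process, so the same string can point at different files under root and under a regular user. Note also that bash does not expand a tilde inside double quotes, so a quoted "~/trains.conf" reaches the program as a literal string, and whether it gets resolved depends on the program receiving it. The sketch below is a hypothetical helper (`resolve_config_path` is not part of trains) that makes the resolution explicit; it assumes POSIX-style paths.

```python
import os


def resolve_config_path(path, home=None):
    """Return an absolute path, expanding a leading '~' explicitly.

    If `home` is given, it is used instead of the current user's home
    directory, which makes the result independent of the account that
    runs the script.
    """
    if home is not None and (path == "~" or path.startswith("~/")):
        path = os.path.join(home, path[2:])
    else:
        path = os.path.expanduser(path)
    return os.path.abspath(path)
```

For example, resolve_config_path("~/trains.conf", home="/home/user") gives /home/user/trains.conf regardless of which account runs it.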
Other than the wrong trains.conf, I can't think of anything else... Well, maybe if you have AWS environment variables with credentials? They will override the conf file
After some investigation, I think it could come from the way you catch errors when checking the creds in trains.conf: when I passed the AWS creds using env vars, another error popped up: https://github.com/boto/botocore/issues/2187 , linked to boto3
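A quick way to see whether such variables are set in the environment the agent runs in is sketched below. `aws_env_overrides` is a hypothetical helper, and the variable names are the three listed earlier in the thread. It reports names only and never prints the values.

```python
import os

AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def aws_env_overrides(environ=None):
    """Return the names of the AWS variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return [name for name in AWS_ENV_VARS if environ.get(name)]


if __name__ == "__main__":
    found = aws_env_overrides()
    if found:
        print("AWS variables set in this environment:", ", ".join(found))
    else:
        print("No AWS credential variables set")
```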
And I can verify that ~/trains.conf exists in the su home folder