Could you try adding region under credentials as well?
@<1523701435869433856:profile|SmugDolphin23> @<1523701087100473344:profile|SuccessfulKoala55>
2023-02-03 20:38:14,515 - clearml.metrics - WARNING - Failed uploading to <my-endpoint> (HTTPSConnectionPool(host='<my-endpoint>', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)'))))
2023-02-03 20:38:14,517 - clearml.metrics - ERROR - Not uploading 1/2 events because the data upload failed
@<1523701435869433856:profile|SmugDolphin23> I actually don't know where to get my region for the creds to S3 I am using. From what I figured, I have to plug in my sk, ak and bucket into credentials in agent and output URI must be my S3 endpoint — complete URI with protocol. Is it correct?
Oh, it's configured on the agent machine, got you
OK. By the way, you can find the region from the AWS dashboard
@<1523701087100473344:profile|SuccessfulKoala55> Could you provide a sample of how to properly fill all the necessary config values to make S3 work, please?
My endpoint starts with https://
and I don't know what my region is, endpoint URL doesn't contain it.
Right now I fill it like this:
aws.s3.key = <access-key>
aws.s3.secret = <secret-key>
aws.s3.region = <blank>
aws.s3.credentials.0.bucket = <just_bucket_name>
aws.s3.credentials.0.key = <access-key>
aws.s3.credentials.0.secret = <secret-key>
sdk.development.default_output_uri = <
>
@<1523701304709353472:profile|OddShrimp85> I fixed my SSL error by putting REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt in my .bashrc file
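For anyone hitting the same CERTIFICATE_VERIFY_FAILED warning: here is a minimal sketch of applying the same setting from Python instead of .bashrc, assuming the bundle path quoted above exists on the machine. The helper name `ensure_ca_bundle` is made up for illustration and is not part of ClearML.

```python
import os


def ensure_ca_bundle(path="/etc/ssl/certs/ca-certificates.crt", env=None):
    """Point REQUESTS_CA_BUNDLE at `path` unless it is already set.

    Hypothetical helper, not part of ClearML. Returns the value in effect,
    or None when nothing is set and `path` does not exist.
    """
    env = os.environ if env is None else env
    current = env.get("REQUESTS_CA_BUNDLE")
    if current:
        return current  # respect an existing setting
    if os.path.isfile(path):
        env["REQUESTS_CA_BUNDLE"] = path
        return path
    return None
```

Calling `ensure_ca_bundle()` at the top of the script, before the first upload, should have roughly the same effect as the .bashrc line, but only for that one process.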
The only exception is the models, if I'm not mistaken, which are stored locally by default.
```
from random import random
from clearml import Task, TaskTypes

args = {}
task: Task = Task.init(
    project_name="My Proj",
    task_name='Sample task',
    task_type=TaskTypes.inference,
    auto_connect_frameworks=False
)
task.connect(args)
task.execute_remotely(queue_name="default")
value = random()
task.get_logger().report_single_value(name="sample_value", value=value)
with open("some_artifact.txt", "w") as f:
    f.write(f"Some random value: {value}\n")
task.upload_artifact(name="test_artifact", artifact_object="some_artifact.txt")
```
@<1526734383564722176:profile|BoredBat47> Just to check: do you need to run update-ca-certificates or an equivalent?
@<1523701087100473344:profile|SuccessfulKoala55> Hey, Jake, getting back to you. I wasn't able to resolve my issue. I can access my bucket by any other means just fine, e.g. with the S3 CLI client. All the tools I use require 4 params: AK, SK, endpoint, bucket. I wonder why ClearML doesn't have an explicit endpoint parameter, so you have to use output_uri for it, and why there is a region when other tools don't require it.
```
s3 {
    # S3 credentials, used for read/write access by various SDK elements
    # default, used for any bucket not specified below
    key: "mykey"
    secret: "mysecret"
    region: ""
    credentials: [
        {
            bucket: "mybucket"
            key: "mykey"
            secret: "mysecret"
            region: ""
        },
```
SmugDolphin23 Got it. Now I am a bit confused about the region parameter in the s3 section. Amazon docs say that region could be a regular URL with protocol, like https://etc.etc, which is what my endpoint actually is. I plugged it into the s3 section in clearml.conf. Should it stay that way?
@<1523701435869433856:profile|SmugDolphin23> Thanks a lot, that actually worked! It was very difficult to figure out that you have to plug in those exact values when you have an https endpoint:
- Using the s3 protocol instead of https, together with the bucket name, in the output URI
- Not providing a bucket name in the credentials section, where it is by default
- Providing the default secure port for both host and output URI
- Disabling the credentials chain
I think a common use case is that people get S3 storage through Amazon's integrated solution, where they are provided with a region and a bucket name. Together with the access key, that's sufficient to connect to their cloud. But a lot of people, especially in enterprise, have a case like mine, where they have an https endpoint to a company-hosted S3 solution. So I think it would be great to reflect that case in the documentation, so other people have an easier time configuring https endpoints for clearml-agent.
Another nice thing to have would be support for an endpoint parameter under the S3 section of clearml.conf which, if provided as is (with https and no port), is sufficient to connect to an S3 bucket. That would require some coding and rewriting of the URL-constructing methods and maybe the boto3 calls (I peeked inside the code and would say some places related to this issue were questionable, e.g. the init method in the _Container class in helper.py). I would try to fix it myself and make a pull request if my work schedule lets me, but I can't promise that.
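The four points in this message can be turned into a quick sanity check. This is only a sketch of what worked in this thread, not a rule taken from the ClearML docs; `check_s3_credentials` is a hypothetical helper, and `entry` stands for one item of the `credentials` list written as a plain dict.

```python
def check_s3_credentials(entry, use_credentials_chain):
    """Return a list of problems found in one `credentials` entry (a dict).

    Hypothetical helper: it encodes only the points reported in this
    thread for an https endpoint, nothing more.
    """
    problems = []
    host = entry.get("host", "")
    if "://" in host:
        problems.append("host must not carry an http(s):// or s3:// prefix")
    bare = host.split("://", 1)[-1]      # host with any scheme removed
    if "/" in bare:
        problems.append("host must not contain a bucket name or path")
    netloc = bare.split("/", 1)[0]
    if not netloc.rpartition(":")[2].isdigit():
        problems.append("host should spell out the port, e.g. :443")
    if "bucket" in entry:
        problems.append("bucket name goes in the output URI, not here")
    if use_credentials_chain:
        problems.append("use_credentials_chain should be false")
    return problems
```

An empty list means the entry matches the shape that worked here; anything else names the point to revisit.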
@<1523701070390366208:profile|CostlyOstrich36> @<1523701087100473344:profile|SuccessfulKoala55> Hi. Thank you too for helping! Would be great if you'd try to look at the issue I discussed in this message.
Good luck, guys!
You need to specify it. Or you could specify this in your config: https://github.com/allegroai/clearml/blob/54c601eea2f9981bb8e360a8203bc36696a55cfd/clearml/config/default/sdk.conf#L164
@<1523701304709353472:profile|OddShrimp85> I haven't done it, for me it worked as-is
SmugDolphin23 Thank you very much!
That's clearml.conf for ClearML end users right?
Yeah, that's always the case with complex systems 😕
Hi again, @<1526734383564722176:profile|BoredBat47> ! I actually took a closer look at this. The config file should look like this:
s3 {
    key: "KEY"
    secret: "SECRET"
    use_credentials_chain: false
    credentials: [
        {
            host: "myendpoint:443"  # no http(s):// and no s3:// prefix, also no bucket name
            key: "KEY"
            secret: "SECRET"
            secure: true  # if https
        },
    ]
}
default_output_uri: "
" # notice the s3:// prefix (not http(s))
The region should be optional, but try setting it as well if it doesn't work
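To make the mapping from an https endpoint to these config values concrete, here is a minimal sketch. The function name and the exact `s3://host:port/bucket` shape of the output URI are assumptions based on the comments in the config above; `somebucket` is a placeholder.

```python
from urllib.parse import urlsplit


def s3_settings_from_endpoint(endpoint_url, bucket):
    """Derive `host`, `secure` and an output URI from an endpoint URL.

    Hypothetical helper for illustration; none of these names are ClearML API.
    """
    parts = urlsplit(endpoint_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("expected an http(s):// endpoint URL")
    secure = parts.scheme == "https"
    # Spell out the port, falling back to the scheme default.
    port = parts.port or (443 if secure else 80)
    host = f"{parts.hostname}:{port}"
    return {
        "host": host,                           # no scheme, no bucket name
        "secure": secure,
        "output_uri": f"s3://{host}/{bucket}",  # s3:// prefix, not http(s)://
    }
```

For example, `s3_settings_from_endpoint("https://myendpoint", "somebucket")` yields the host `myendpoint:443`, `secure` set to true, and an output URI that starts with `s3://`.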
@<1526734383564722176:profile|BoredBat47> How would you connect with boto3? ClearML uses boto3 as well. What it basically does is get the key/secret/region from the conf file. After that it opens a Session with the credentials. Have you tried deleting the region altogether from the conf file?
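For comparison, a plain boto3 connection to a custom endpoint might look roughly like the sketch below. This is not ClearML's actual code; the helper names are hypothetical and the values passed in are placeholders. The region is left out entirely when it is blank, which mirrors the suggestion to delete it from the conf file.

```python
def boto3_s3_kwargs(key, secret, endpoint_url, region=None):
    """Split the settings into Session kwargs and client kwargs."""
    session_kwargs = {
        "aws_access_key_id": key,
        "aws_secret_access_key": secret,
    }
    if region:  # omit the region entirely when it is blank
        session_kwargs["region_name"] = region
    client_kwargs = {"endpoint_url": endpoint_url}
    return session_kwargs, client_kwargs


def make_s3_client(key, secret, endpoint_url, region=None):
    """Open a boto3 Session and return an S3 client for the endpoint."""
    import boto3  # imported lazily so the helper above works without boto3

    session_kwargs, client_kwargs = boto3_s3_kwargs(
        key, secret, endpoint_url, region
    )
    session = boto3.Session(**session_kwargs)
    return session.client("s3", **client_kwargs)
```

If `make_s3_client("<access-key>", "<secret-key>", "https://<my-endpoint>")` followed by `client.list_objects_v2(Bucket="somebucket", MaxKeys=1)` works from the agent machine, the credentials and certificates are probably fine and the problem is more likely in how clearml.conf is filled in.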
The code is run from another machine where clearml.conf is configured to connect to the ClearML server; no other configuration is provided
A bit overwhelmed by the configuration, since it has an agent, a server and a bunch of configuration files, so it's easy to mess up
SmugDolphin23 Sorry to bother again, should output_uri be a URI to the S3 endpoint or to the ClearML fileserver? If it's not provided, artifacts are stored locally, right?
@<1526734383564722176:profile|BoredBat47> the bucket name in your case should just be somebucket (and should not start with s3://)
@<1523701087100473344:profile|SuccessfulKoala55> Fixed it by setting env var with path to certificates. I was sure that wouldn't help since I can curl and python get request to my endpoint from shell just fine. Now it says I am missing security headers, seems it's something on my side. Will try to fix this
I think that will work, but I'm not sure actually. I know for sure that something like us-east-2 is supported