@<1523701304709353472:profile|OddShrimp85> I haven't done it, for me it worked as-is
SmugDolphin23 I added a region and ran the experiment again. Didn't work
@<1523701435869433856:profile|SmugDolphin23> Hello again! I tried to fill in the values following your example. Still no luck. I noticed the console log on my task says that I have a certificate error. I disabled verification in the api section of clearml.conf like this: verify_certificate = false
and I still have an SSL error. Any clues why that would be?
` s3 {
    # S3 credentials, used for read/write access by various SDK elements
    # default, used for any bucket not specified below
    key: "mykey"
    secret: "mysecret"
    region: ""
    credentials: [
        {
            bucket: "mybucket"
            key: "mykey"
            secret: "mysecret"
            region: ""
        }, `
@<1523701087100473344:profile|SuccessfulKoala55> Could you provide a sample of how to properly fill all the necessary config values to make S3 work, please?
My endpoint starts with https://
and I don't know what my region is; the endpoint URL doesn't contain it.
Right now I fill it like this:
aws.s3.key = <access-key>
aws.s3.secret = <secret-key>
aws.s3.region = <blank>
aws.s3.credentials.0.bucket = <just_bucket_name>
aws.s3.credentials.0.key = <access-key>
aws.s3.credentials.0.secret ...
@<1523701087100473344:profile|SuccessfulKoala55> Hey, Jake, getting back to you. I wasn't able to resolve my issue. I can access my bucket by any other means just fine, e.g. with the S3 CLI client. All the tools I use require 4 params: AK, SK, endpoint, bucket. I wonder why ClearML doesn't have an explicit endpoint parameter, so that you have to use output_uri for it, and why there is a region when other tools don't require it.
Oh, it's configured on the agent machine, got it
482e96243041 allegroai/clearml:latest "python3 -m jobs.asy…" 18 months ago Up 7 weeks 8008/tcp, 8080-8081/tcp async_delete
26c677f2b70f allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 16 months 8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp clearml-webserver
7e2cf4462f44 allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 7 months 0.0.0.0:8008->8008/tcp, :::8008->8008/tcp, 8080-8081/tcp clearml-apiserv...
Thanks a lot. I see that the ClearML apiserver has been up for 7 months; could it be that it runs a version that was recent 7 months ago?
So, right now I have an old deployment. It's working well and it's not corrupted. The service versions are the ones I shared above (output of docker ps). My goal is to move everything to another machine. Yes, I want a new deployment with all the previous data. Basically, it's a backup and restore task. The problem was that the old docker compose file doesn't work as-is, maybe because when I run it on a new machine clearml:1 pulls the latest version while the Elastic version is set to one that is no longer supported.
@<1722061389024989184:profile|ResponsiveKoala38> Thanks a lot! I am going to upgrade ClearML using this link: None
It seems that only the async_delete container is using the latest version
Hi @<1722061389024989184:profile|ResponsiveKoala38> , I am using those specific versions because my previous ClearML installation runs them; they are in the docker compose file. The version of the ClearML image is 1. Afaik the latest is 1.16.2. My goal is to move ClearML to a different machine, so I need to stick to those versions
@<1523701435869433856:profile|SmugDolphin23> Thanks a lot, that actually worked! It was very difficult to figure out that you have to plug in those exact values when you have an https endpoint:
- Using s3 protocol instead of https together with bucket name in output URI
- Not providing a bucket name in credentials section where it is by default
- Providing default secure port for both host and output URI
- Disabling credentials chain
I think a common use case for many people is that they get S3 storage wi...
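For anyone landing here with the same problem, the four points in the list above can be sketched as a small helper that turns an https endpoint into the values to put in clearml.conf. This is only a sketch under assumptions: `build_s3_settings` is a hypothetical helper name, the endpoint, bucket, and keys in the usage note are placeholders, and `secure: True` is my guess for an https endpoint (the thread does not confirm it). The dotted key names follow the `sdk.aws.s3...` notation used in this thread.

```python
from urllib.parse import urlparse


def build_s3_settings(endpoint: str, bucket: str, key: str, secret: str,
                      default_port: int = 443) -> dict:
    """Hypothetical helper: derive clearml.conf values from an https endpoint."""
    parsed = urlparse(endpoint)
    if parsed.hostname is None:
        raise ValueError(f"endpoint must include a scheme and host: {endpoint!r}")

    # Always carry an explicit port; fall back to the default secure port.
    host_port = f"{parsed.hostname}:{parsed.port or default_port}"

    return {
        # s3:// protocol (not https://) plus the bucket name in the output URI
        "sdk.development.default_output_uri": f"s3://{host_port}/{bucket}",
        # credentials chain disabled
        "sdk.aws.s3.use_credentials_chain": False,
        # host-based credentials entry, deliberately without a bucket name
        "sdk.aws.s3.credentials": [
            {
                "host": host_port,
                "key": key,
                "secret": secret,
                "secure": True,  # assumption: endpoint is served over https
            }
        ],
    }
```

For example, `build_s3_settings("https://storage.example.com", "mybucket", "<access-key>", "<secret-key>")` yields an output URI of `s3://storage.example.com:443/mybucket` and a credentials host of `storage.example.com:443`, so the host and the output URI carry the same port.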
Yeah, I mean a fresh installation using the old docker compose file, just without backups (/clearml/data). So it seems to me the solution should be:
- Migrate to the latest version of elastic on old installation
- Make a backup
- Deploy latest ClearML installation with that backup
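The "make a backup" step above could look roughly like this. It is a minimal sketch, assuming the server containers are stopped first and that the data lives in a single directory (something like /opt/clearml/data on a default install; check your own docker compose volumes). `backup_directory` is a hypothetical helper, not part of ClearML.

```python
import tarfile
from pathlib import Path


def backup_directory(source_dir, archive_path) -> Path:
    """Pack source_dir into a gzip-compressed tar archive and return its path."""
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"not a directory: {source}")

    archive = Path(archive_path)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to the data directory, so the
        # archive can be unpacked into a different location on the new machine.
        tar.add(source, arcname=".")
    return archive
```

Usage would be something like `backup_directory("/opt/clearml/data", "/tmp/clearml_backup_data.tgz")`, with both paths being assumptions to adapt. Keep the archive outside the directory being backed up.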
@<1523701435869433856:profile|SmugDolphin23> @<1523701087100473344:profile|SuccessfulKoala55>
2023-02-03 20:38:14,515 - clearml.metrics - WARNING - Failed uploading to <my-endpoint> (HTTPSConnectionPool(host='endpoint', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)'))))
2023-02-03 20:38:14,517 - clearml.metrics - ERROR - Not uploa...
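The "unable to get local issuer certificate" part of the log above generally means the client could not find the CA that issued the server certificate in its trust store. A quick way to check this outside of ClearML is a plain TLS handshake with Python's standard library. This is a diagnostic sketch only: `check_tls` is a hypothetical helper, and the host and CA bundle path you pass in are your own.

```python
import socket
import ssl
from typing import Optional, Tuple


def check_tls(host: str, port: int = 443, cafile: Optional[str] = None,
              timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a verified TLS handshake and report whether the certificate checks out."""
    # With cafile=None the system trust store is used; pass a CA bundle path
    # to test whether a specific bundle fixes the verification.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate verified"
    except ssl.SSLCertVerificationError as err:
        return False, f"verification failed: {err.verify_message}"
    except OSError as err:
        # Covers refused connections, timeouts, DNS failures, other SSL errors.
        return False, f"connection failed: {err}"
```

If `check_tls("<my-endpoint-host>")` fails but `check_tls("<my-endpoint-host>", cafile="<path-to-ca-bundle>")` succeeds, the endpoint is signed by a CA the machine does not trust by default, which would match the error in the log.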
He tried to help me in another thread but I still couldn't make things work
@<1523701087100473344:profile|SuccessfulKoala55> Right
My current setup is:
sdk.development.default_output_uri=< None > # no port, no bucket
sdk.aws.s3.key=<my-access-key>
sdk.aws.s3.secret=<my-secret-key>
sdk.aws.s3.region=<my-region> # I think it can be skipped but somewhere in the clearml code it says that it must be specified if it's not default like us-east-1 or something
sdk.aws.s3.credentials.bucket=<my-bucket> # just a bucket name
sdk.aws.s3.credentials.host=< None : 443> # the same as output...
Thanks for the reply! We have a custom S3 server with an endpoint URL like https://<some-domain>.<sub-domain>. I've read in the docs that when you provide credentials.host, a port must be specified. @<1523701070390366208:profile|CostlyOstrich36>
@<1523701070390366208:profile|CostlyOstrich36>
@<1523701087100473344:profile|SuccessfulKoala55> So I have to provide a host for it to work, and there's no other way around it?
CostlyOstrich36
The error appears regardless of the --foreground flag. This is not the full stack trace, I will provide it in the next message.
clearml 1.9.0
clearml-agent 1.5.1
Ubuntu 18.04.6 LTS
@<1523701435869433856:profile|SmugDolphin23> I actually don't know where to get the region for the S3 creds I am using. From what I figured, I have to plug my sk, ak and bucket into the credentials on the agent, and the output URI must be my S3 endpoint, i.e. the complete URI with protocol. Is that correct?
CostlyOstrich36 Yep, it seems that was the case. I had not provided credentials for the API in docker compose. I did that, but now agent-services just keeps restarting. I looked into the container logs and it seems to be a proxy error. Why is this container trying to connect somewhere?