
@<1722061389024989184:profile|ResponsiveKoala38> It fixed the issue!
SmugDolphin23 Thank you very much!
That's clearml.conf for ClearML end users right?
SmugDolphin23 Got it. Now I am a bit confused about the region parameter in the s3 section. The Amazon docs say that the region could be a regular URL with a protocol, like https://etc.etc, which is what my endpoint actually is. I plugged it into the s3 section in clearml.conf. Should it stay that way?
Thanks a lot. I see that the ClearML apiserver has been up for 7 months. Could it be running a version that was recent 7 months ago?
@<1523701070390366208:profile|CostlyOstrich36> You mean using a port in credentials.host?
Versions in compose are:
image: allegroai/clearml:1
image: elasticsearch:7.6.2
image: mongo:4.4.9
I am not quite sure that the backups were made on those versions. Is there a way to see the service versions from a backup?
@<1523701070390366208:profile|CostlyOstrich36> I understand, but the error description seems to point to the apiserver's connectivity to Elasticsearch, not to database conflicts. I couldn't find any info about this on the internet. I think I have ruled out inconsistent image versions. Are there any more suggestions? Thanks.
@<1722061389024989184:profile|ResponsiveKoala38> Hello. It seems that it didn't work for me. I made a backup, moved it to another machine and tried to run clearml service (latest docker compose). Now, I have async-delete, apiserver, mongo, fileserver, elastic constantly restarting
I am a bit overwhelmed by the configuration: there is an agent, a server and a bunch of configuration files, so it is easy to mess up
My current setup is:
sdk.development.default_output_uri=< None > # no port, no bucket
sdk.aws.s3.key=<my-access-key>
sdk.aws.s3.secret=<my-secret-key>
sdk.aws.s3.region=<my-region> # I think it can be skipped but somewhere in the clearml code it says that it must be specified if it's not default like us-east-1 or something
sdk.aws.s3.credentials.bucket=<my-bucket> # just a bucket name
sdk.aws.s3.credentials.host=< None : 443> # the same as output...
session = boto3.Session(
    aws_access_key_id=self.access_key,
    aws_secret_access_key=self.secret_key)
He tried to help me in another thread but I still couldn't make things work
My question could be this: what gets plugged into endpoint_url in the boto3 client inside ClearML?
@<1523701435869433856:profile|SmugDolphin23> @<1523701087100473344:profile|SuccessfulKoala55>
2023-02-03 20:38:14,515 - clearml.metrics - WARNING - Failed uploading to <my-endpoint> (HTTPSConnectionPool(host='<my-endpoint>', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)'))))
2023-02-03 20:38:14,517 - clearml.metrics - ERROR - Not uploa...
@<1523701087100473344:profile|SuccessfulKoala55> Could you provide a sample of how to properly fill all the necessary config values to make S3 work, please?
My endpoint starts with https:// and I don't know what my region is; the endpoint URL doesn't contain it.
Right now I fill it like this:
aws.s3.key = <access-key>
aws.s3.secret = <secret-key>
aws.s3.region = <blank>
aws.s3.credentials.0.bucket = <just_bucket_name>
aws.s3.credentials.0.key = <access-key>
aws.s3.credentials.0.secret ...
from random import random
from clearml import Task, TaskTypes
args = {}
task: Task = Task.init(
    project_name="My Proj",
    task_name='Sample task',
    task_type=TaskTypes.inference,
    auto_connect_frameworks=False
)
task.connect(args)
task.execute_remotely(queue_name="default")
value = random()
task.get_logger().report_single_value(name="sample_value", value=value)
with open("some_artifact.txt", "w") as f:
    f.write(f"Some random value: {value}\n")
task.upload_artifact(name="test...
@<1523701070390366208:profile|CostlyOstrich36>
Should I leave it as is or fill in the values in docker-compose for agent-services? I set it to localhost since agent-services is running together with the other clearml containers on one machine. Not sure why you have to fill in those values.
CLEARML_HOST_IP: "<my_clearml_server_ip>"
CLEARML_WEB_HOST: " None "
CLEARML_API_HOST: " None "
CLEARML_FILES_HOST: " [None](http://127.0.0.1...
Could the problem be that the backups were made while ClearML was running, not stopped as the docs suggest? @<1523701070390366208:profile|CostlyOstrich36>
It's the same request you provided just without "case_sensitive" option and with my endpoints @<1722061389024989184:profile|ResponsiveKoala38>
@<1523701435869433856:profile|SmugDolphin23> Thanks a lot, that actually worked! It was very difficult to figure out that you have to plug in those exact values when you have an https endpoint:
- Using s3 protocol instead of https together with bucket name in output URI
- Not providing a bucket name in credentials section where it is by default
- Providing default secure port for both host and output URI
- Disabling credentials chain
I think a common use case for many people is that they get S3 storage wi...
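The four points listed in this post can be sketched in Python as follows. This is only an illustration under stated assumptions, not ClearML's own code: `build_s3_settings` is a hypothetical helper, the endpoint, bucket and keys are placeholders, and the dictionary keys are meant to mirror the `sdk.aws.s3` section of clearml.conf (in particular `use_credentials_chain` and a host-only `credentials` entry), which should be checked against the ClearML version in use.

```python
from urllib.parse import urlparse


def build_s3_settings(endpoint_url, bucket, access_key, secret_key):
    """Derive ClearML-style S3 settings from an https:// endpoint (hypothetical helper)."""
    parsed = urlparse(endpoint_url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError("expected an https:// endpoint URL")
    # Spell out the default secure port when the URL does not carry one.
    host = f"{parsed.hostname}:{parsed.port or 443}"
    return {
        # s3:// protocol instead of https://, with the bucket name in the URI.
        "output_uri": f"s3://{host}/{bucket}",
        # Credentials entry identified by host only, with no bucket name.
        "credentials": [{"host": host, "key": access_key, "secret": secret_key}],
        # Do not fall back to the boto3 credentials chain.
        "use_credentials_chain": False,
    }


settings = build_s3_settings("https://storage.example.com", "my-bucket", "AK", "SK")
print(settings["output_uri"])  # s3://storage.example.com:443/my-bucket
```

The same host string (with port 443) appears in both the output URI and the credentials entry, which matches the "default secure port for both host and output URI" point above.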
@<1523701087100473344:profile|SuccessfulKoala55> Fixed it by setting an env var with the path to the certificates. I was sure that wouldn't help, since both curl and a Python GET request to my endpoint work just fine from the shell. Now it says I am missing security headers, which seems to be something on my side. Will try to fix this
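The post does not say which environment variable was set. As a rough sketch, assuming it was one of the two commonly used for this, `REQUESTS_CA_BUNDLE` (read by requests) or `AWS_CA_BUNDLE` (read by botocore/boto3), the environment could be prepared like this. `ca_bundle_env` and the certificate path are hypothetical.

```python
import os


def ca_bundle_env(ca_path, base_env=None):
    """Return a copy of the environment with CA bundle variables pointing at ca_path."""
    env = dict(os.environ if base_env is None else base_env)
    # REQUESTS_CA_BUNDLE is honored by requests; AWS_CA_BUNDLE by botocore/boto3.
    env["REQUESTS_CA_BUNDLE"] = ca_path
    env["AWS_CA_BUNDLE"] = ca_path
    return env


# Placeholder path; point it at the CA bundle that signs the S3 endpoint.
env = ca_bundle_env("/path/to/ca-bundle.crt", base_env={})
print(sorted(env))  # ['AWS_CA_BUNDLE', 'REQUESTS_CA_BUNDLE']
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the training script, or the same variables can be exported in the shell before starting it.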
@<1523701087100473344:profile|SuccessfulKoala55> Right
@<1722061389024989184:profile|ResponsiveKoala38> Hello. What if my old fileserver address was not matching the None scheme? It was http and didn't have a domain, only ip address. Should I put my old address as it was in the replace method?
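For illustration only, a plain prefix rewrite does not care whether the old address used http, https, a domain or a bare IP, as long as the old prefix is given exactly as it was stored. The sketch below is generic Python, not the replace method discussed in the thread; `rewrite_prefix` and both addresses are hypothetical.

```python
def rewrite_prefix(url, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix when url starts with it; otherwise return url unchanged."""
    if url.startswith(old_prefix):
        return new_prefix + url[len(old_prefix):]
    return url


old = "http://10.0.0.5:8081"       # hypothetical old fileserver address (http, bare IP)
new = "https://files.example.com"  # hypothetical new address
print(rewrite_prefix("http://10.0.0.5:8081/proj/task/artifact.txt", old, new))
# https://files.example.com/proj/task/artifact.txt
```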
@<1523701435869433856:profile|SmugDolphin23> I didn't use a region at first and that was not working. Now I use a region and it still doesn't work.
Using boto3 directly from Python, I can create a session where I specify the access key and secret key, and then create a client from that session where I pass service_name and endpoint_url. It works just fine
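The direct boto3 check described here might look roughly like the sketch below. The helper names, endpoint and keys are placeholders; only `boto3.Session(aws_access_key_id=..., aws_secret_access_key=...)` and `session.client(service_name=..., endpoint_url=...)` are actual boto3 calls. boto3 is imported lazily so the argument-building part can be inspected without it installed.

```python
def build_boto3_kwargs(endpoint_url, access_key, secret_key):
    """Split the arguments between boto3.Session and Session.client."""
    session_kwargs = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    client_kwargs = {"service_name": "s3", "endpoint_url": endpoint_url}
    return session_kwargs, client_kwargs


def make_s3_client(endpoint_url, access_key, secret_key):
    """Create an S3 client bound to a custom endpoint."""
    import boto3  # lazy import: only needed when a client is actually created

    session_kwargs, client_kwargs = build_boto3_kwargs(endpoint_url, access_key, secret_key)
    session = boto3.Session(**session_kwargs)
    return session.client(**client_kwargs)


# Usage (needs boto3 and network access to the endpoint):
# client = make_s3_client("https://storage.example.com", "<access-key>", "<secret-key>")
# print(client.list_buckets())
```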
SmugDolphin23 That fixed the issue, thank you very much!
` % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 100k 100 100k 0 0 10236 0 0:00:10 0:00:10 --:--:-- 21354
Warning: Transient problem: HTTP error Will retry in 10 seconds. 10 retries
Warning: left.
100 100k 100 100k 0 0 10237 0 0:00:10 0:00:10 --:--:-- 21345
Warning: Transient problem: HTTP error Will retry in 10 seconds. 9 retries
Warning: left...
clearml 1.9.0
clearml-agent 1.5.1
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"