We had a similar problem. Clearml doesnt support data migration (not that I know of)
So you have two ways to fix this:
- Recreate the dataset when its already in Azure
- Edit each elasticsearch database file entry to point to new destination (we did this)
Adding bucket in clearml.conf causes the same error: clearml.storage - ERROR - Failed uploading: Could not connect to the endpoint URL: " None "
![image](https://clearml-web-assets.s3.amazonaws.com/scoold/images/TT...
I know these keys work, url and everything else works because I use these creds daily
we use Ceph Storage Cluster, interface to it is the same as S3
I dont get what I have misconfigured.
The only thing I have not added is "region" field in clearml.conf because we literally dont have, its a self hosted cluster.
You can try and replicate this s3 config I have posted earlier.
good morning, I tried the script you provided and Im getting somewhere
@<1523701070390366208:profile|CostlyOstrich36> Hello, im still unable to understand how to fix this
ok, I found it.
Are S3 links supported?
So from our IT guys i now know that
"s3" part of url is subdomain, we use it in all other libs like boto3 and cloudpathlib, never had any problems
This is where the crash happens inside the clearml Task
@<1523701070390366208:profile|CostlyOstrich36> Still unable to understand what im doing wrong.
We have self hosted S3 Ceph storage server
Setting my config like this breaks task.init
@<1523701070390366208:profile|CostlyOstrich36> Hello John, we are still unable to use clearml with our self hosted s3 CEPH instances, is there any update on the hotfix for 1.14?
also, when uploading artifacts, I see where they are stored on the s3 bucket, but I cant find where the debug images are stored at
Sounds similar to our issue? We have self hosted S3
None
I cant get the conf credentials to work
Specifying it like this gives me:
Exception has occurred: ValueError
Could not get access credentials for ' None ' , check configuration file ~/clearml.conf
Yes, but does add_external_files makes chunked zips as add_files do?
I need the zipping, chunking to manage millions of files
hi, thanks for reaching out. Getting desperate here.
Yes, its self hosted
No, only currently running experiments are deleted (task itself is gone, but debug images and models are present in fileserver folder)
What I do see is some random elastisearch errors popping up from time to time
[2024-01-05 09:16:47,707] [9] [WARNING] [elasticsearch] POST
None ` [status:N/A requ...
@<1523703436166565888:profile|DeterminedCrab71> Thanks for responding
It was unclear to me that I need to set 443 also everywhere in clearml.conf
Setting s3 host urls with 443 in clearml.conf and also in web UI made it work
Im now almost at the finish line. The last thing that would be great is to fix archived task deletion.
For some reason i have error of missing S3 keys in clearml docker compose logs, the folder / files are not deleted in S3 bucket.
You can see how storage_credentials.co...
The incident happened last friday (5 january)
Im giving you logs from around that time