The problem is that the clearml.conf s3 config doesn't support an empty region field; even an empty string crashes it.
We use a Ceph Storage Cluster; its interface is the same as S3.
I don't get what I have misconfigured.
The only thing I have not added is the "region" field in clearml.conf, because we literally don't have one; it's a self-hosted cluster.
You can try to replicate the s3 config I posted earlier.
Maybe someone on your end can try to parse such a config and see if they hit the same problem (a rough sketch of the shape of it is below).
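Not the exact config from before, just a minimal sketch of the shape I mean; the endpoint, bucket, and keys are placeholders for our self-hosted Ceph cluster, and the "region" field is the part that causes the crash:

```
sdk {
    aws {
        s3 {
            # no global key / secret / region set at this level
            credentials: [
                {
                    # placeholder endpoint for our S3-compatible Ceph cluster
                    host: "s3.our-ceph.example.com"
                    bucket: "our-bucket"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    secure: true
                    # "region" omitted entirely -- setting it to "" crashes as well
                }
            ]
        }
    }
}
```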
Will it be appended in ClearML?
"s3" is part of domain to the host
- Is 50GB of Elasticsearch normal? Have you seen it elsewhere, or are we doing something wrong? One thing I suspect is that we are probably logging too frequently.
- Is it possible to somehow clean this up? (A sketch for checking per-index sizes is below.)
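For reference, a sketch of how the per-index sizes can be checked, assuming the Elasticsearch container's port 9200 is reachable from the docker host (the address is an assumption):

```
# list Elasticsearch indices sorted by on-disk size, largest first
curl -s "http://localhost:9200/_cat/indices?v&s=store.size:desc"
```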
I have also noticed that this incident usually happens in the morning, around 6-7 AM.
Are there maybe some cleanup tasks or backups running on the ClearML server at those times?
What do you mean by reusing the task for a ClearML Dataset? Do you have a code example?
We have multiple projects, with several people working on each one.
This is the code we use most often for dataset uploading; roughly, it follows the pattern sketched below.
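Not the exact script, just a minimal sketch of the usual ClearML Dataset upload flow we follow; the project, dataset name, local path, and bucket are placeholders:

```python
from clearml import Dataset

# create a new dataset version under the project (names are placeholders)
dataset = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="my_project",
)

# stage local files and push them to our S3-compatible storage
dataset.add_files(path="/path/to/local/data")
dataset.upload(output_url="s3://s3.our-ceph.example.com/our-bucket")
dataset.finalize()
```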
Can I do it while I have multiple ongoing trainings?
Is that supposed to be so? How do I fix it?
From docker inspect I can see that allegroai/clearml uses:
"CLEARML_SERVER_VERSION=1.11.0",
"CLEARML_SERVER_BUILD=373"
Image hash: ed05631045c4237f59ad48f477e06dd72274ab67e70d2f9adc489431d1ce75d7
But there are still some weird issues; I cannot see the uploaded files in the bucket.
We don't need a port.
"s3" is part of url that is configured on our routers, without it we cannot connect
Is it even known whether the bug is fixed in that version?
Specifying it like this gets me a different error:
```
Exception has occurred: ValueError
- Insufficient permissions (delete failed) for None

botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the DeleteObject operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.

During handling of the above exception, another exception occurred:

  File "/home/ma...
```
We fixed the issue, thanks; we had to update everything to the latest version.
@<1523703436166565888:profile|DeterminedCrab71> Thanks for responding
It was unclear to me that I also needed to set port 443 everywhere in clearml.conf.
Setting the S3 host URLs with :443 in clearml.conf and also in the web UI made it work (roughly as sketched below).
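For anyone hitting the same thing, roughly what the working setup looks like; the hostname and bucket are placeholders for our Ceph endpoint:

```
sdk.aws.s3.credentials: [
    {
        # note the explicit :443 on the host
        host: "s3.our-ceph.example.com:443"
        bucket: "our-bucket"
        key: "ACCESS_KEY"
        secret: "SECRET_KEY"
        secure: true
    }
]
```

and in the web UI the output destination uses the same host with the port, e.g. s3://s3.our-ceph.example.com:443/our-bucket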
I'm now almost at the finish line. The last thing that would be great to fix is archived task deletion.
For some reason I get errors about missing S3 keys in the ClearML docker compose logs, and the folders/files are not deleted from the S3 bucket.
You can see how storage_credentials.co...
- Here is what the client-side clearml.conf looks like, together with the script I'm using to create the tasks. Uploads seem to work and are fixed thanks to you guys 🙌
I know these keys work; the URL and everything else works, because I use these credentials daily.
I do notice another strange thing:
Agent-services is down because it has no API key for ClearML (see the sketch below).
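As far as I understand, agent-services reads its credentials from environment variables in the ClearML server docker-compose file, roughly like this (the values are placeholders for credentials generated in the web UI):

```yaml
  agent-services:
    environment:
      # app credentials generated in the ClearML web UI (placeholders)
      CLEARML_API_ACCESS_KEY: "GENERATED_ACCESS_KEY"
      CLEARML_API_SECRET_KEY: "GENERATED_SECRET_KEY"
```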
I see the debug images in the fileserver folder.