So from our IT guys I now know that
"s3" part of url is subdomain, we use it in all other libs like boto3 and cloudpathlib, never had any problems
This is where the crash happens inside the clearml Task
@<1523701435869433856:profile|SmugDolphin23> Setting it without http is not possible, as it auto-fills them back in
py file:
task: clearml.Task = clearml.Task.init(
    project_name="project",
    task_name="task",
    output_uri=" None ",
)
clearml.conf:
{
    # This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
    host: "our-host.com"
    key: "xxx"
    secret: "xxx"
    multipart: false
    ...
But there are still some weird issues; I cannot see the files uploaded in the bucket
We don't need a port
"s3" is part of the URL that is configured on our routers; without it we cannot connect
Is it even known whether the bug is fixed in that version?
Specifying it like this gets me a different error:
Exception has occurred: ValueError
- Insufficient permissions (delete failed) for None
botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the DeleteObject operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.
During handling of the above exception, another exception occurred:
File "/home/ma...
The problem is that the clearml.conf s3 config doesn't support an empty region field; even an empty string crashes it
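To illustrate with placeholders only: an entry like

{ host: "our-host.com", key: "xxx", secret: "xxx", region: "" }

crashes on delete, whereas presumably something like

{ host: "our-host.com", key: "xxx", secret: "xxx", region: "me-south-1" }

would satisfy the region check, but we would prefer to be able to leave the field out entirely.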
We fixed the issue, thanks; we had to update everything to the latest versions.
@<1523703436166565888:profile|DeterminedCrab71> Thanks for responding
It was unclear to me that I also need to set 443 everywhere in clearml.conf
Setting the s3 host URLs with 443 in clearml.conf and also in the web UI made it work
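For anyone hitting the same thing, this is roughly the shape that works for me now (host and keys are placeholders), with the port spelled out explicitly here and the same host:443 value entered in the WebApp S3 credentials page:

sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "s3.our-host.com:443"
                    key: "xxx"
                    secret: "xxx"
                    secure: true
                    multipart: false
                }
            ]
        }
    }
}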
I'm now almost at the finish line. The last thing that would be great is to fix archived task deletion.
For some reason I get errors about missing S3 keys in the ClearML docker compose logs, and the folders/files are not deleted in the S3 bucket.
You can see how storage_credentials.co...
- Here is what the client-side clearml.conf looks like, together with the script I'm using to create the tasks. Uploads seem to work and are fixed thanks to you guys 🙌
I know these keys work; the URL and everything else works because I use these creds daily
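For completeness, my (unverified) understanding is that the server-side storage_credentials.conf mounted into the async_delete container just carries the same credentials section, e.g.:

sdk.aws.s3.credentials: [
    {
        host: "s3.our-host.com:443"  # same placeholder host as above
        key: "xxx"
        secret: "xxx"
        secure: true
    }
]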
I do notice another strange thing
Agent-services is down because it has no API key for ClearML
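If I read the default docker-compose right, the keys are expected to come in through these env vars on the agent-services container (values elided on purpose):

agent-services:
    environment:
        CLEARML_API_HOST: http://apiserver:8008
        CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}
        CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}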
I see the debug images in the fileserver folder
Hi, thanks for reaching out. Getting desperate here.
Yes, it's self-hosted
No, only currently running experiments are deleted (the task itself is gone, but the debug images and models are still present in the fileserver folder)
What I do see are some random Elasticsearch errors popping up from time to time
[2024-01-05 09:16:47,707] [9] [WARNING] [elasticsearch] POST
None [status:N/A requ...
- Is 50GB of Elasticsearch data normal? Have you seen this elsewhere, or are we doing something wrong? One thing I suspect is that we are probably logging too frequently
- Is it possible to somehow clean this up?
Elasticsearch also takes around 15GB of RAM
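This is roughly how I am checking what takes the space (plain Elasticsearch cat API; host and port assume the ES container is reachable on the compose defaults):

import requests

# list indices sorted by on-disk size, biggest first
resp = requests.get(
    "http://localhost:9200/_cat/indices",
    params={"v": "true", "s": "store.size:desc"},
)
print(resp.text)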
Not really, but I think I will figure out the uv caching
I have another question @<1523701070390366208:profile|CostlyOstrich36>
How can I make the ClearML agent just run the image with only uv
and not install any packages, nothing
I found docker_init_bash_script in clearml.conf
I know there are some envs to pass in Task.init, but that does not fully do what I want: just simply run the image, since I already have all the dependencies
I'm basically trying to force the agent to use the uv-defined Python
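What I am experimenting with now, based on the agent env vars I found (so take this as a sketch, not a confirmed recipe): telling the agent to skip the Python env and package installation entirely and use the interpreter already baked into the image:

import clearml

task = clearml.Task.init(project_name="project", task_name="task")
# run inside our prebuilt image; CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL tells the
# agent not to create a venv or install anything and to use the container's python
task.set_base_docker(
    docker_image="our-registry/our-image:latest",
    docker_arguments="--env CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1",
)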
@<1523701087100473344:profile|SuccessfulKoala55> Anything on this?