Hi @<1590514584836378624:profile|AmiableSeaturtle81> ! To help us debug this: are you able to simply use the boto3
python package to interact with your cluster?
If so, how does that code look like? This would give us some insight on how the config should actually look like or what changes need to be made.
Adding bucket in clearml.conf causes the same error: clearml.storage - ERROR - Failed uploading: Could not connect to the endpoint URL: " None "
- Here is how client side clearml.conf looks like together with the script im using to create the tasks. Uploads seems to work and is fixed thanks to you guys 🙌
It looks like im moving forward
Setting url in clearml.conf without "s3" as suggested works (But I dont add port ther, not sure if it breaks something, we dont have a port)
host: " our-host.com "
Then in test_task.py
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri=" None ",
)
I think connection is created
What im getting now is bucket error, i suppose I have to specify it somewhere?
This is unrelated to your routers. There are two things at play here. The configuration of WHERE the data will go - output_uri
and the other is the clearml.conf
that you need to setup with credentials. I am telling you, you are setting it wrong. Please follow documentation.
Hey, i see that 1.14.2 dropped
I tried it but the issue is still there, maybe the hotfix is in next patch?
Here is the setup so you can reproduce it (we dont have region field)
clearml.conf:s3 {
use_credentials_chain: false
credentials: [
{
host: "
s3.somehost.com "
key: "XXXXXXXXXXXXXXXXXXXX"
secret: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
bucket: "rnd-dev"
},
]
}
test.py
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri="
None ",
)
2024-02-08 11:23:52,150 - clearml.storage - ERROR - Failed creating storage object
None Reason: Missing key and secret for S3 storage access (
None )
I dont have a region. I guess I will wait till tomarrow then?
also, when uploading artifacts, I see where they are stored on the s3 bucket, but I cant find where the debug images are stored at
btw @<1590514584836378624:profile|AmiableSeaturtle81> , can you try to specify the host without http*
and try to set the port to 443? like s3.my _host:443
(or even without the port)
Setting these urls in SETTINGS/Configuration/ WEB APP CLOUD ACCESS in web ui
None doesnt work
None doesnt work
None doesnt work
None doesnt work
None gets replaced to None ://s3.host-our.com:8080 doesnt work
None doesnt work
None doesnt work
In all of these instances The S3 CREDENTIALS popup never dissapears, it will still popup always asking for creds no matter how I try to set the creds
@<1523703436166565888:profile|DeterminedCrab71> Thanks for responding
It was unclear to me that I need to set 443 also everywhere in clearml.conf
Setting s3 host urls with 443 in clearml.conf and also in web UI made it work
Im now almost at the finish line. The last thing that would be great is to fix archived task deletion.
For some reason i have error of missing S3 keys in clearml docker compose logs, the folder / files are not deleted in S3 bucket.
You can see how storage_credentials.conf looks like for me (first image). The same as for client clearml.conf (with port as you suggested)
I have the storage_credentials.conf mounted inside of async_delete as a volume
I have also confirmed that mounting works and i have the storage_credentials.conf inside of async_delete container config folder.
Maybe im misconfiguring someting?
@<1590514584836378624:profile|AmiableSeaturtle81> if you wish for you debug samples to be uploaded to s3 you have 2 options: you either use this function: None
or you can change the api.files_server
entry to your s3 bucket in clearml.conf
. This way you wouldn't need to call set_default_upload_destination
every time you run a new script.
Also, in clearml.conf
, you can change sdk.development.default_output_uri
such that you don't need to set output_uri="s3://...
every time in Task.init
But there are stil some wierd issues, i cannot see the files uploaded in bucket
@<1523701435869433856:profile|SmugDolphin23> Any news?
we use Ceph Storage Cluster, interface to it is the same as S3
I dont get what I have misconfigured.
The only thing I have not added is "region" field in clearml.conf because we literally dont have, its a self hosted cluster.
You can try and replicate this s3 config I have posted earlier.
Specifying it like this, gets me different error:
Exception has occurred: ValueError
- Insufficient permissions (delete failed) for None
botocore.exceptions.ClientError: An error occurred (IllegalLocationConstraintException) when calling the DeleteObject operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.
During handling of the above exception, another exception occurred:
File "/home/ma/src/clearml-server/task_test.py", line 10, in <module>
task: clearml.Task = clearml.Task.init(
ValueError: Insufficient permissions (delete failed) for None
host: "my-minio-host:9000"
The port should be whatever port that is used by your S3 solution
Hi @<1590514584836378624:profile|AmiableSeaturtle81> ! We have someone investigating the UI issue (I mainly work on the sdk). They will get back to you once they find something...
there is a typing in clearm.conf i sent you on like 87, there should be "key" not "ey" im aware of it
@<1523701070390366208:profile|CostlyOstrich36> Hello John, we are still unable to use clearml with our self hosted s3 CEPH instances, is there any update on the hotfix for 1.14?
Yes, credetials seems to work
Im trying to figure out not why I dont see the uploaded files / folders
- I checked maybe clearml task uses fileserver instead but i dont see any files in fileserver folder
- Nothing is uploaded in bucket (i will ask IT guy to check if im uploading any files in logs)
you might want to prefix both the host
in the configuration file and the uri in Task.init
/ StorageHelper.get
with s3.
if the script above works if you do that
@<1590514584836378624:profile|AmiableSeaturtle81> ok, I think that your credentials from clearml.conf are actually working now. let's not change them.
Now let's try this simple code:
from clearml import Task
import numpy as np
if __name__ == "__main__":
task = Task.init(task_name="test4", project_name="test4", output_uri="
")
image = np.random.randint(0, 256, size=(500, 1000, 3), dtype=np.uint8)
task.upload_artifact("image", image)
You should change the task_name
and project_name
from test
just in case some object has been created previously