Sorry to keep this up - what about support for MinIO using the environment variable? Do I set CLEARML_FILES_HOST to the endpoint instead of an S3 bucket?
Yes, you are right, this is not straightforward:
CLEARML_FILES_HOST="s3://minio_ip:9001"
Notice you must specify the port; this is how it knows this is not AWS. I would avoid using an IP and instead register the MinIO server as a host on your local DNS / firewall. This way, if you change the IP, the links will not break 🙂
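A minimal sketch of the URL shape described above, using only the Python standard library. It is not ClearML's own parsing code; `split_storage_url`, `minio.local`, and `my-bucket` are hypothetical names used for illustration. It shows that an explicit port can be read from the URL and that a URL with no path yields an empty bucket name.

```python
from urllib.parse import urlparse


def split_storage_url(url: str):
    """Split an s3:// style URL into (host, port, bucket).

    Hypothetical helper: port is None when the URL carries no explicit
    port, and bucket is "" when the URL has no path component.
    """
    parsed = urlparse(url)
    # The first path segment is treated as the bucket name.
    bucket = parsed.path.lstrip("/").split("/")[0]
    return parsed.hostname, parsed.port, bucket


# Endpoint only: explicit port, but no bucket in the URL.
print(split_storage_url("s3://minio.local:9000"))
# Endpoint plus bucket and a folder inside it.
print(split_storage_url("s3://minio.local:9000/my-bucket/experiments"))
```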
Thanks! That's what I thought, but then I get
2021-12-21 22:08:35,376 - clearml.storage - ERROR - Failed uploading: Parameter validation failed: Invalid bucket name "": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
If I add the bucket to that (so CLEARML_FILES_HOST=s3://minio_ip:9000/minio/bucket), I then get the following error instead --
2021-12-21 22:14:55,518 - clearml.storage - ERROR - Failed uploading: SSL validation failed for
... [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)
Oh no... you should also turn SSL off for the connection, but I think this can only be set in clearml.conf:
https://github.com/allegroai/clearml/blob/fd2d6c6f5d46cad3e406e88eeb4d805455b5b3d8/docs/clearml.conf#L101
Maybe we should add an ENV setting for it? (I'm not sure we should disable SSL for all S3 connections, so we would need some way to specify that the MinIO endpoint should use http.)
AgitatedDove14 another option I thought would be nice is to actually self-sign the internal MinIO bucket, but then I get [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1076)
Are you aware of any other way then (other than the secure: false flag)?
Actually, self-signing and providing a certificate file is already supported by boto (and thus clearml) through AWS_CA_BUNDLE:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
That's what I found as well, but it did not like it after all (boto is fine with it, but the underlying urllib and requests were not?)
It's fine -- I see the added benefit in making sure the users set up their clearml.conf, and I've made a script to edit it to our needs as part of the installation process 🙂 Thanks Martin!
My pleasure, btw: there is no actual need to configure all the clearml.conf values. It will actually take the defaults from the clearml package itself. This means you only need something like:
` api {
server config here
}
sdk.aws.s3{
minio config here
} `
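A hedged sketch of an installation script that writes such a partial config. The key names follow the `api` and `sdk.aws.s3` sections of the linked clearml.conf, including the `secure: false` flag mentioned earlier. The function name `render_minimal_conf`, the host names, and the placeholder key/secret are assumptions for illustration only, and the per-user API credentials are deliberately left out.

```python
from string import Template

# Partial clearml.conf: anything not listed here falls back to the defaults
# shipped with the clearml package, as described above.
_MINIMAL_CONF = Template("""\
api {
    api_server: "$api_server"
    web_server: "$web_server"
    files_server: "$files_server"
}
sdk.aws.s3 {
    credentials: [
        {
            host: "$minio_host"
            key: "$key"
            secret: "$secret"
            multipart: false
            secure: false
        }
    ]
}
""")


def render_minimal_conf(**values: str) -> str:
    """Fill the template; raises KeyError if a placeholder is missing."""
    return _MINIMAL_CONF.substitute(**values)


conf_text = render_minimal_conf(
    api_server="http://clearml.local:8008",        # placeholder host
    web_server="http://clearml.local:8080",        # placeholder host
    files_server="s3://minio.local:9000/my-bucket",  # placeholder bucket
    minio_host="minio.local:9000",
    key="MINIO_ACCESS_KEY",                        # placeholder value
    secret="MINIO_SECRET_KEY",                     # placeholder value
)
print(conf_text)
```

The rendered text can then be written to whichever path the team's config file lives at.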
Yes, exactly! I've added instructions for the users on creating their account and running clearml-init, and then they run the snippet that updates the api and sdk sections.
Or did you mean I can couple a short "mini config" with the package and redirect clearml to use this local one (instead of the one at ~/clearml.conf)?
Actually yes, you can ship a "fixed" config, point to it with an ENV variable, and then set up just the access/secret per user.
wdyt?
(I was also pointing out that you do not have to use clearml-init; you can create a simple partial config template and let the user just fill in the missing "key"/"secret")
The key/secret is also shared internally so that sounds like a nice mitigation actually!
Which environment variable am I looking for? I couldn't spot anything specifically in that environment variables page
Is it CLEARML_CONFIG_FILE? (I had to dig this from the GH code 😅)
Yes it is !
https://clear.ml/docs/latest/docs/faq#clearml-configuration
(I will make sure we add it to https://clear.ml/docs/latest/docs/configs/env_vars#server-connection as well 🙂 )
I will TIAS, but maybe worthwhile to also mention if it has to be the absolute path or if relative path is fine too!
Good point! (absolute but you can use ~, and I "think" also $ENV )
~ is a bit weird since it's not part of the package (might as well let the user go through clearml-init), but using ${PWD} works! 👍 👍
(Though I still had to add the CLEARML_API_HOST and CLEARML_WEB_HOST ofc, or define them in the clearml.conf)
This will be resolved every call to Task.init (so I would recommend against it), how about "$HOME/" ?
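One way to sidestep the question of when the variable gets resolved is to expand it once, up front, and hand ClearML an absolute path. This is only a sketch: `resolve_config_path` and the file name `clearml_team.conf` are hypothetical, and it assumes the variable is set before `clearml` is imported.

```python
import os


def resolve_config_path(raw: str) -> str:
    """Expand ~ and $VARS once and return an absolute path."""
    expanded = os.path.expandvars(os.path.expanduser(raw))
    return os.path.abspath(expanded)


# Set the variable before importing clearml so the SDK sees a fixed path.
os.environ["CLEARML_CONFIG_FILE"] = resolve_config_path("$HOME/clearml_team.conf")
print(os.environ["CLEARML_CONFIG_FILE"])
```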
That's fine for the current use-case I believe.
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
One last MinIO-related question (sorry for the long thread!)
While I do have the access and secret defined in clearml.conf, and even in the WebUI, I still get similar warnings as David does here - https://clearml.slack.com/archives/CTK20V944/p1640135359125200
🎉
And you have your credentials in the browser when deleting a Task?
In the Profile section, yes, they are well defined (bucket, secret, key, and endpoint)
The odd thing is that it was already defined, and then when I clicked an S3 link, it asked me to fill it in again, adding a duplicate credentials row
UnevenDolphin73 it seems this is a UI browser limit, this means we will need to move it into the server ...
See here: https://clearml.slack.com/archives/CTK20V944/p1640247879153700?thread_ts=1640135359.125200&cid=CTK20V944
Always great to find a bug! I'll make relevant SDK updates then.