To set the dicts that will be passed to Elasticsearch, you'll need to provide such a dict for each "connection" used by the server. These are events and workers; you can see them in the hosts.conf file: https://github.com/allegroai/clearml-server/blob/master/apiserver/config/default/hosts.conf
Oh, this is using our cloud-ready Helm chart?
I am already changing the chart, it would be no problem. But which env vars?
Okayyy. I'll try this. Thanks for your help Jake!
Yeah, but they're using http_auth there, aren't they?
In a helm chart, I assume this would translate to something like:
- name: CLEARML__HOSTS__ELASTIC__EVENTS
  value: '{hosts: [{host: "127.0.0.1", port: 9200}], args { timeout: 60, dead_timeout: 10, max_retries: 3, retry_on_timeout: true}, index_version: "1"}'
standard URI according to: https://elasticsearch-py.readthedocs.io/en/v7.12.1/#tls-ssl-and-authentication
In any case, you can do that "manually", but that would require changing the chart since you need to inject several new env-vars
I'm not sure user:password is supported...
It's concatenating them together:
[WARNING] [elasticsearch] GET http://[ https://elasti ....
│ urllib3.exceptions.LocationParseError: Failed to parse: 'https://elastic:xxxxxxxxxxxxxxxxx@clearml-elasticsearch-es-http', label empty or too long │
│ [2021-05-11 13:35:54,816] [8] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 1 of 4. Waiting for 30sec │
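The parse error above suggests the whole URL, credentials included, ended up in the host field. Here is a minimal sketch of splitting such a URL into separate per-host properties instead. The helper name `url_to_host_entry`, the password `secret` and the port `9200` are placeholders I made up for illustration. The property names (`use_ssl`, `http_auth`) follow my understanding of what the 7.x elasticsearch-py driver accepts per host; whether the server passes them through unchanged is an assumption.

```python
from urllib.parse import urlsplit, unquote


def url_to_host_entry(url):
    """Split an Elasticsearch URL into a per-host dict (hypothetical helper)."""
    parts = urlsplit(url)
    entry = {"host": parts.hostname}
    if parts.scheme == "https":
        # An https address implies SSL; fall back to 443 if no port is given.
        entry["use_ssl"] = True
        entry["port"] = parts.port or 443
    elif parts.port:
        entry["port"] = parts.port
    if parts.username or parts.password:
        # Credentials go into their own property instead of the host name.
        entry["http_auth"] = "{}:{}".format(
            unquote(parts.username or ""), unquote(parts.password or "")
        )
    return entry


# Placeholder credentials and port, for illustration only.
entry = url_to_host_entry("https://elastic:secret@clearml-elasticsearch-es-http:9200")
print(entry)
```

The resulting dict keeps the host name free of the scheme and credentials, which is the part urllib3 was failing to parse.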
Well, the ES driver used by the server should handle a value such as https://....
Exactly 😄 - however, since the server uses pyhocon to parse those vars (that's why you can provide an entire structure there, not just text), make sure to provide a value that can actually be parsed as such. I.e. the string the server should receive for each "connection" should be something like
{hosts: [{host: "127.0.0.1", port: 9200}], args { timeout: 60, dead_timeout: 10, max_retries: 3, retry_on_timeout: true}, index_version: "1"}
and NOT
"{hosts: [{host: "127.0.0.1", port: 9200}], args { timeout: 60, dead_timeout: 10, max_retries: 3, retry_on_timeout: true}, index_version: "1"}"
(notice the problematic " wrapping the second example)
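One way to avoid the stray wrapping quote is to generate the value rather than hand-write it. This is only a sketch: it relies on JSON being a valid subset of HOCON, so `json.dumps` output should be parseable as a structure. The `CLEARML__HOSTS__ELASTIC__WORKERS` name is my assumption by analogy with the events variable, and `to_env_value` is a made-up helper.

```python
import json

# Connection settings mirroring the structure quoted in this thread.
connection = {
    "hosts": [{"host": "127.0.0.1", "port": 9200}],
    "args": {
        "timeout": 60,
        "dead_timeout": 10,
        "max_retries": 3,
        "retry_on_timeout": True,
    },
    "index_version": "1",
}


def to_env_value(conn):
    """Serialize a connection dict to a single-line string (hypothetical helper)."""
    # JSON is a subset of HOCON, so no extra quoting is needed around it.
    return json.dumps(conn)


env = {
    "CLEARML__HOSTS__ELASTIC__EVENTS": to_env_value(connection),
    # Assumed name, by analogy with the events variable above.
    "CLEARML__HOSTS__ELASTIC__WORKERS": to_env_value(connection),
}

for name, value in env.items():
    # The value must begin with "{", not with a wrapping double quote.
    assert value.startswith("{") and value.endswith("}")
    print(f"{name}={value}")
```

When pasting the printed value into a chart, single quotes around it (as in the helm snippet earlier) keep YAML from touching the inner double quotes.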
I'll try that. How can I make it to use authentication too?
Hmm maybe this will work
https://elasticsearch-py.readthedocs.io/en/v7.12.1/#tls-ssl-and-authentication
I'll try that and report back
We can always add that explicitly if it doesn't work 🙂
As far as I know, if the host param contains https, the driver infers use_ssl by itself
You'll have to set each one (the format there is the one expected by Elasticsearch). You can do that either using a mounted config file (called hosts.conf) or using env vars - which do you prefer?
it's handled implicitly by the ES driver - you just need to use an https address
See here: https://elasticsearch-py.readthedocs.io/en/v7.12.1/api.html#elasticsearch
```
from elasticsearch import Elasticsearch

# connect to localhost directly and another node using SSL on port 443
# and an url_prefix. Note that port needs to be an int.
es = Elasticsearch([
    {'host': 'localhost'},
    {'host': 'othernode', 'port': 443, 'url_prefix': 'es', 'use_ssl': True},
])
```
aha got it..
I assume env vars would be configured according to this doc:
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_config.html#dynamic-environment-variables
See here: https://github.com/allegroai/clearml-server/blob/2216bfe8758f0095b929ec6b35eee3647d1c387c/apiserver/es_factory.py#L69
hosts is simply a dict which eventually contains configuration properties per host for the Elasticsearch instance. Of these properties, the override you use sets the host and port values