Sure. I'll give it a few minor releases and then try again 🙂 Thanks for the responses @<1722061389024989184:profile|ResponsiveKoala38> !
It's running v7.17.18 @<1722061389024989184:profile|ResponsiveKoala38>
Sorry for the late reply @<1722061389024989184:profile|ResponsiveKoala38> . So here is the diff against my local version (everything is hosted together on a single server with docker-compose). Does anything spring to mind?
I think it's possible there was an upgrade in Elastic. I'd suggest going over the release notes to see if this happened with the server.
diff --git a/docker-compose.yml b/docker-compose.diff.yml
index c6b49e1..07f7f43 100644
--- a/docker-compose.yml
+++ b/docker-compose.diff.yml
@@ -5,7 +5,7 @@ services:
command:
- apiserver
container_name: clearml-apiserver
- image: allegroai/clearml:1.15.0
+ image: allegroai/clearml:latest
restart: unless-stopped
volumes:
- /opt/clearml/logs:/var/log/clearml
@@ -19,17 +19,18 @@ services:
environment:
CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
CLEARML_ELASTIC_SERVICE_PORT: 9200
- CLEARML_ELASTIC_SERVICE_PASSWORD: ${ELASTIC_PASSWORD}
CLEARML_MONGODB_SERVICE_HOST: mongo
CLEARML_MONGODB_SERVICE_PORT: 27017
CLEARML_REDIS_SERVICE_HOST: redis
CLEARML_REDIS_SERVICE_PORT: 6379
- CLEARML_SERVER_DEPLOYMENT_TYPE: ${CLEARML_SERVER_DEPLOYMENT_TYPE:-linux}
+ CLEARML_SERVER_DEPLOYMENT_TYPE: linux
CLEARML__apiserver__pre_populate__enabled: "true"
CLEARML__apiserver__pre_populate__zip_files: "/opt/clearml/db-pre-populate"
CLEARML__apiserver__pre_populate__artifacts_path: "/mnt/fileserver"
CLEARML__services__async_urls_delete__enabled: "true"
CLEARML__services__async_urls_delete__fileserver__url_prefixes: "[${CLEARML_FILES_HOST:-}]"
+ CLEARML__secure__credentials__services_agent__user_key: ${CLEARML_AGENT_ACCESS_KEY:-}
+ CLEARML__secure__credentials__services_agent__user_secret: ${CLEARML_AGENT_SECRET_KEY:-}
ports:
- "8008:8008"
networks:
@@ -41,8 +42,6 @@ services:
- backend
container_name: clearml-elastic
environment:
- ES_JAVA_OPTS: -Xms2g -Xmx2g -Dlog4j2.formatMsgNoLookups=true
- ELASTIC_PASSWORD: ${ELASTIC_PASSWORD}
bootstrap.memory_lock: "true"
cluster.name: clearml
cluster.routing.allocation.node_initial_primaries_recoveries: "500"
@@ -74,7 +73,7 @@ services:
command:
- fileserver
container_name: clearml-fileserver
- image: allegroai/clearml:1.15.0
+ image: allegroai/clearml:latest
environment:
CLEARML__fileserver__delete__allow_batch: "true"
restart: unless-stopped
@@ -111,12 +110,12 @@ services:
container_name: clearml-webserver
# environment:
# CLEARML_SERVER_SUB_PATH : clearml-web # Allow Clearml to be served with a URL path prefix.
- image: allegroai/clearml:1.15.0
+ image: allegroai/clearml:latest
restart: unless-stopped
depends_on:
- apiserver
ports:
- - "80:80"
+ - "8080:80"
networks:
- backend
- frontend
@@ -129,14 +128,13 @@ services:
- elasticsearch
- fileserver
container_name: async_delete
- image: allegroai/clearml:1.15.0
+ image: allegroai/clearml:latest
networks:
- backend
restart: unless-stopped
environment:
CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
CLEARML_ELASTIC_SERVICE_PORT: 9200
- CLEARML_ELASTIC_SERVICE_PASSWORD: ${ELASTIC_PASSWORD}
CLEARML_MONGODB_SERVICE_HOST: mongo
CLEARML_MONGODB_SERVICE_PORT: 27017
CLEARML_REDIS_SERVICE_HOST: redis
@@ -157,7 +155,7 @@ services:
networks:
- backend
container_name: clearml-agent-services
- image: allegroai/clearml-agent-services:services-1.3.0-77
+ image: allegroai/clearml-agent-services:latest
deploy:
restart_policy:
condition: on-failure
@@ -167,8 +165,8 @@ services:
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST:
CLEARML_FILES_HOST: ${CLEARML_FILES_HOST:-}
- CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}
- CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}
+ CLEARML_API_ACCESS_KEY: ${CLEARML_AGENT_ACCESS_KEY:-$CLEARML_API_ACCESS_KEY}
+ CLEARML_API_SECRET_KEY: ${CLEARML_AGENT_SECRET_KEY:-$CLEARML_API_SECRET_KEY}
CLEARML_AGENT_GIT_USER: ${CLEARML_AGENT_GIT_USER}
CLEARML_AGENT_GIT_PASS: ${CLEARML_AGENT_GIT_PASS}
CLEARML_AGENT_UPDATE_VERSION: ${CLEARML_AGENT_UPDATE_VERSION:->=0.17.0}
Hi @<1523701601770934272:profile|GiganticMole91> , I do not see any difference that could prevent Elasticsearch in v1.16 from starting with the data that was stored in v1.15. More information can probably be retrieved from the ES logs right after the upgrade and services restart. If there is something preventing ES from loading the existing data, it should be listed in the logs.
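For example, something along these lines should surface startup errors (just a sketch, assuming the container is named clearml-elastic as in the compose file above):

sudo docker logs clearml-elastic --since 30m 2>&1 | grep -iE "error|exception|fatal"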
@<1523701601770934272:profile|GiganticMole91> Is your ES deployment a single node or a cluster? If you compare the elasticsearch section of the docker compose of your currently working version (1.15.0) and the one that you tried to install (v1.16), do you see any difference?
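One quick way to compare just that section (a sketch; the two file names are assumptions, adjust to wherever your old and new compose files live):

diff -u <(awk '/^  elasticsearch:/,/^  [a-z].*:$/' docker-compose-1.15.yml) \
        <(awk '/^  elasticsearch:/,/^  [a-z].*:$/' docker-compose-1.16.yml)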
Hi @<1523701601770934272:profile|GiganticMole91> , what is the exact version of Elasticsearch that is running now in your 1.15.0 installation? You can see it in the output of 'sudo docker ps'
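For example (just a sketch; the container name clearml-elastic is taken from the default compose file, adjust if yours differs):

sudo docker ps --filter name=clearml-elastic --format '{{.Image}}'

That should print the Elasticsearch image and tag the server is currently running.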