Hi @<1668065560107159552:profile|VivaciousPenguin20>
I think you are looking at the wrong experiment - this is a 3-year-old experiment? It does not seem to be the experiment you are currently executing, right?
@<1668065560107159552:profile|VivaciousPenguin20>, I think this might indicate there's some issue with the Elasticsearch component of the server - can you see the experiment console log?
Thank you for your replies @<1523701205467926528:profile|AgitatedDove14> and @<1523701087100473344:profile|SuccessfulKoala55>. These were created as sample experiments from Allegro, which is why they are 3 years old. I agree it seems like Elasticsearch is the culprit. The console logs for the experiment are missing as well:
Can you please run sudo docker ps
and share the result?
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
698a92273675 c027a58fa0bb "/storage-provisione…" 22 hours ago Up 22 hours k8s_storage-provisioner_storage-provisioner_kube-system_89536357-464b-4b68-86c8-f4488a816dea_11
c55b66a2f99d 3750dfec169f "/kube-vpnkit-forwar…" 22 hours ago Up 22 hours k8s_vpnkit-controller_vpnkit-controller_kube-system_9e14b907-dfb9-4dbf-920e-4fbc70e9da88_6
3337c6862569 2437cf762177 "/coredns -conf /etc…" 22 hours ago Up 22 hours k8s_coredns_coredns-76f75df574-5vc6m_kube-system_a8d77e25-6443-4499-8b59-72d4ecf11a2f_6
8e41261214e2 2437cf762177 "/coredns -conf /etc…" 22 hours ago Up 22 hours k8s_coredns_coredns-76f75df574-s45qr_kube-system_30a79510-aed8-424e-986a-63b334fb4f46_6
bcc0217a160b b125ba1d1878 "/usr/local/bin/kube…" 22 hours ago Up 22 hours k8s_kube-proxy_kube-proxy-jjscn_kube-system_ea5b4d35-6f75-4dac-bf33-5bf0fdad783a_6
75d160151019 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_vpnkit-controller_kube-system_9e14b907-dfb9-4dbf-920e-4fbc70e9da88_6
6524ce20e1b7 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_coredns-76f75df574-5vc6m_kube-system_a8d77e25-6443-4499-8b59-72d4ecf11a2f_6
7f3eee6103a5 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_coredns-76f75df574-s45qr_kube-system_30a79510-aed8-424e-986a-63b334fb4f46_6
daa8a97bc72b registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_storage-provisioner_kube-system_89536357-464b-4b68-86c8-f4488a816dea_6
28286159735f registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_kube-proxy-jjscn_kube-system_ea5b4d35-6f75-4dac-bf33-5bf0fdad783a_6
1c29f3641196 79f8d13ae8b8 "etcd --advertise-cl…" 22 hours ago Up 22 hours k8s_etcd_etcd-docker-desktop_kube-system_a7259c8a6f480a66649ce97631b20e6f_6
805632e957ee registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_etcd-docker-desktop_kube-system_a7259c8a6f480a66649ce97631b20e6f_6
dc69e5571635 f3b81ff188c6 "kube-apiserver --ad…" 22 hours ago Up 22 hours k8s_kube-apiserver_kube-apiserver-docker-desktop_kube-system_0ebf02f01020bac6394d8c559802bcc8_6
5903d907ccb7 140ecfd0789f "kube-scheduler --au…" 22 hours ago Up 22 hours k8s_kube-scheduler_kube-scheduler-docker-desktop_kube-system_8dc7392ffeee7cf9ac30dda5e5775176_6
4d1851742bbe 8715bb0e3bc2 "kube-controller-man…" 22 hours ago Up 22 hours k8s_kube-controller-manager_kube-controller-manager-docker-desktop_kube-system_af7b12e5509cb13b2c1d769bc20867d1_6
5984e4b84b04 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_kube-apiserver-docker-desktop_kube-system_0ebf02f01020bac6394d8c559802bcc8_6
58a8472d1265 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_kube-scheduler-docker-desktop_kube-system_8dc7392ffeee7cf9ac30dda5e5775176_6
17e0992e90f8 registry.k8s.io/pause:3.9 "/pause" 22 hours ago Up 22 hours k8s_POD_kube-controller-manager-docker-desktop_kube-system_af7b12e5509cb13b2c1d769bc20867d1_6
5aff4cfc8f5d allegroai/clearml:latest "python3 -m jobs.asy…" 22 hours ago Up 22 hours 8008/tcp, 8080-8081/tcp async_delete
da1740eab329 allegroai/clearml:latest "/opt/clearml/wrappe…" 22 hours ago Up 22 hours 8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp clearml-webserver
76794de9d840 allegroai/clearml:latest "/opt/clearml/wrappe…" 22 hours ago Up 22 hours 0.0.0.0:8008->8008/tcp, 8080-8081/tcp clearml-apiserver
8c39b79f6645 docker.elastic.co/elasticsearch/elasticsearch:7.17.18 "/bin/tini -- /usr/l…" 22 hours ago Up 22 hours 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp clearml-elastic
eca52b0381fc allegroai/clearml:latest "/opt/clearml/wrappe…" 29 hours ago Up 22 hours 8008/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp clearml-fileserver
0f150d9a9bb2 mongo:4.4.9 "docker-entrypoint.s…" 29 hours ago Up 22 hours 27017/tcp clearml-mongo
fde4d437e5f2 redis:5.0 "docker-entrypoint.s…" 29 hours ago Up 22 hours 6379/tcp clearml-redis
Please ignore the k8s containers, since I have Kubernetes running in my Docker Desktop install. Elasticsearch seems to be running, but the sample projects from Allegro are missing the plots, console, and other visualizations. I can create the visualizations myself through the APIs, though.
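One way to double-check that the event data actually made it into Elasticsearch is to query it directly on the mapped port 9200 (a minimal sketch; the assumption that the ClearML event indices have "events" in their names may not hold exactly for your version):
# Overall cluster health - the status should be green or yellow
curl -s "http://localhost:9200/_cluster/health?pretty"
# List indices with their document counts; the event indices
# (console log, scalars, plots) should show non-zero doc counts
curl -s "http://localhost:9200/_cat/indices?v" | grep -i events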
And when you create them yourself you can see them?
Also, what's the server version you have deployed?
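(It is also shown at the bottom of the WebApp settings/profile page. Alternatively, something along these lines against the running container may reveal it - a sketch only, the exact startup log wording is an assumption:)
# Inspect the apiserver startup log; the version line wording is an assumption
sudo docker logs clearml-apiserver 2>&1 | head -n 50
sudo docker logs clearml-apiserver 2>&1 | grep -i version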
@<1523701087100473344:profile|SuccessfulKoala55> correct - when I create visualizations myself via the APIs, I see them. Here's my docker compose; I ended up upgrading to a newer Elasticsearch image to get the setup running on a MacBook M1:
version: "3.6"
services:
apiserver:
command:
- apiserver
container_name: clearml-apiserver
image: allegroai/clearml:latest
restart: unless-stopped
volumes:
- /opt/clearml/logs:/var/log/clearml
- /opt/clearml/config:/opt/clearml/config
- /opt/clearml/data/fileserver:/mnt/fileserver
depends_on:
- redis
- mongo
- elasticsearch
- fileserver
environment:
CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
CLEARML_ELASTIC_SERVICE_PORT: 9200
CLEARML_ELASTIC_SERVICE_PASSWORD: ${ELASTIC_PASSWORD}
CLEARML_MONGODB_SERVICE_HOST: mongo
CLEARML_MONGODB_SERVICE_PORT: 27017
CLEARML_REDIS_SERVICE_HOST: redis
CLEARML_REDIS_SERVICE_PORT: 6379
CLEARML_SERVER_DEPLOYMENT_TYPE: ${CLEARML_SERVER_DEPLOYMENT_TYPE:-linux}
CLEARML__apiserver__pre_populate__enabled: "true"
CLEARML__apiserver__pre_populate__zip_files: "/opt/clearml/db-pre-populate"
CLEARML__apiserver__pre_populate__artifacts_path: "/mnt/fileserver"
CLEARML__services__async_urls_delete__enabled: "true"
CLEARML__services__async_urls_delete__fileserver__url_prefixes: "[${CLEARML_FILES_HOST:-}]"
ports:
- "8008:8008"
networks:
- backend
- frontend
elasticsearch:
networks:
- backend
container_name: clearml-elastic
ports:
- "9200:9200"
- "9300:9300"
environment:
ES_JAVA_OPTS: -Xms2g -Xmx2g -Dlog4j2.formatMsgNoLookups=true
ELASTIC_PASSWORD: ${ELASTIC_PASSWORD}
bootstrap.memory_lock: "true"
bootstrap.system_call_filter: false
cluster.name: clearml
cluster.routing.allocation.node_initial_primaries_recoveries: "500"
cluster.routing.allocation.disk.watermark.low: 500mb
cluster.routing.allocation.disk.watermark.high: 500mb
cluster.routing.allocation.disk.watermark.flood_stage: 500mb
discovery.zen.minimum_master_nodes: "1"
discovery.type: single-node
http.compression_level: "7"
node.ingest: "true"
node.name: clearml
reindex.remote.whitelist: '*.*'
xpack.monitoring.enabled: "false"
xpack.security.enabled: "false"
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 262144
hard: 262144
# image: docker.elastic.co/elasticsearch/elasticsearch:7.17.7-arm64
image: docker.elastic.co/elasticsearch/elasticsearch:7.17.18
restart: unless-stopped
volumes:
- /opt/clearml/data/elastic_7:/usr/share/elasticsearch/data:rw
- /usr/share/elasticsearch/logs
fileserver:
networks:
- backend
- frontend
command:
- fileserver
container_name: clearml-fileserver
image: allegroai/clearml:latest
environment:
CLEARML__fileserver__delete__allow_batch: "true"
restart: unless-stopped
volumes:
- /opt/clearml/logs:/var/log/clearml
- /opt/clearml/data/fileserver:/mnt/fileserver
- /opt/clearml/config:/opt/clearml/config
ports:
- "8081:8081"
mongo:
networks:
- backend
container_name: clearml-mongo
image: mongo:4.4.9
restart: unless-stopped
command: --setParameter internalQueryMaxBlockingSortMemoryUsageBytes=196100200
volumes:
- /opt/clearml/data/mongo_4/db:/data/db
- /opt/clearml/data/mongo_4/configdb:/data/configdb
redis:
networks:
- backend
container_name: clearml-redis
image: redis:5.0
restart: unless-stopped
volumes:
- /opt/clearml/data/redis:/data
webserver:
command:
- webserver
container_name: clearml-webserver
# environment:
# CLEARML_SERVER_SUB_PATH : clearml-web # Allow Clearml to be served with a URL path prefix.
image: allegroai/clearml:latest
restart: unless-stopped
depends_on:
- apiserver
ports:
- "8080:80"
networks:
- backend
- frontend
async_delete:
depends_on:
- apiserver
- redis
- mongo
- elasticsearch
- fileserver
container_name: async_delete
image: allegroai/clearml:latest
networks:
- backend
restart: unless-stopped
environment:
CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
CLEARML_ELASTIC_SERVICE_PORT: 9200
CLEARML_ELASTIC_SERVICE_PASSWORD: ${ELASTIC_PASSWORD}
CLEARML_MONGODB_SERVICE_HOST: mongo
CLEARML_MONGODB_SERVICE_PORT: 27017
CLEARML_REDIS_SERVICE_HOST: redis
CLEARML_REDIS_SERVICE_PORT: 6379
PYTHONPATH: /opt/clearml/apiserver
CLEARML__services__async_urls_delete__fileserver__url_prefixes: "[${CLEARML_FILES_HOST:-}]"
entrypoint:
- python3
- -m
- jobs.async_urls_delete
- --fileserver-host
-
volumes:
- /opt/clearml/logs:/var/log/clearml
- /opt/clearml/config:/opt/clearml/config
agent-services:
networks:
- backend
container_name: clearml-agent-services
image: allegroai/clearml-agent-services:latest
deploy:
restart_policy:
condition: on-failure
privileged: true
environment:
CLEARML_HOST_IP: ${CLEARML_HOST_IP}
CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
CLEARML_API_HOST:
CLEARML_FILES_HOST: ${CLEARML_FILES_HOST:-}
CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}
CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}
CLEARML_AGENT_GIT_USER: ${CLEARML_AGENT_GIT_USER}
CLEARML_AGENT_GIT_PASS: ${CLEARML_AGENT_GIT_PASS}
CLEARML_AGENT_UPDATE_VERSION: ${CLEARML_AGENT_UPDATE_VERSION:->=0.17.0}
CLEARML_AGENT_DEFAULT_BASE_DOCKER: "ubuntu:18.04"
AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-}
AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-}
AWS_DEFAULT_REGION: ${AWS_DEFAULT_REGION:-}
AZURE_STORAGE_ACCOUNT: ${AZURE_STORAGE_ACCOUNT:-}
AZURE_STORAGE_KEY: ${AZURE_STORAGE_KEY:-}
GOOGLE_APPLICATION_CREDENTIALS: ${GOOGLE_APPLICATION_CREDENTIALS:-}
CLEARML_WORKER_ID: "clearml-services"
CLEARML_AGENT_DOCKER_HOST_MOUNT: "/opt/clearml/agent:/root/.clearml"
SHUTDOWN_IF_NO_ACCESS_KEY: 1
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- /opt/clearml/agent:/root/.clearml
depends_on:
- apiserver
entrypoint: >
bash -c "curl --retry 10 --retry-delay 10 --retry-connrefused '
' && /usr/agent/entrypoint.sh"
networks:
backend:
driver: bridge
frontend:
driver: bridge
Hi @<1668065560107159552:profile|VivaciousPenguin20>, what version of the apiserver are you running? Can you please try switching to the latest v1.14.1 version that was released last week? One of the issues fixed was the inability to import events for the published example tasks.
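With the docker-compose setup you pasted, the upgrade is roughly the following (a sketch; it assumes the compose file is saved as /opt/clearml/docker-compose.yml, and that you either pin the ClearML images to the 1.14.1 tag or keep :latest and pull):
cd /opt/clearml
# Either pin the ClearML images in docker-compose.yml, e.g.
#   image: allegroai/clearml:1.14.1
# or keep :latest and simply pull the newest images
sudo docker compose down
sudo docker compose pull
sudo docker compose up -d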
@<1523701994743664640:profile|AppetizingMouse58> I just tried the 1.14.1 Docker image for ClearML and the situation with the example projects is the same, per the attached screenshot.
@<1668065560107159552:profile|VivaciousPenguin20> Did you re-import the example projects after upgrading to v1.14.1? The problem was in the import procedure itself; tasks that were imported with the previous versions will not have task results.