@<1722061389024989184:profile|ResponsiveKoala38> There were errors everywhere, in almost every container. I decided to manually move the working clearml image from the old installation and run from it. It worked. It's not the latest ClearML but it'll do for now. Thanks for your help! 💪
It seems that only the async_delete container is using the latest version
@<1722061389024989184:profile|ResponsiveKoala38> Thanks a lot! I am gonna upgrade ClearML using this link: None
Hi @<1722061389024989184:profile|ResponsiveKoala38> , I am using those specific versions because my previous ClearML installation runs with them; they are pinned in the docker compose file. The ClearML image tag is 1. Afaik the latest is 1.16.2. My goal is to move ClearML to a different machine, so I need to stick to those versions
@<1722061389024989184:profile|ResponsiveKoala38> Hello. It seems that it didn't work for me. I made a backup, moved it to another machine and tried to run clearml service (latest docker compose). Now, I have async-delete, apiserver, mongo, fileserver, elastic constantly restarting
So, right now I have the old deployment. It's working well, it's not corrupted. The service versions are the ones I shared above (output of docker ps). My goal is to move everything to another machine. Yes, I want to have a new deployment with all the previous data. Basically, it's a backup and restore task. The problem was that the old docker compose file doesn't work as is. Maybe because when I run it on a new machine clearml:1 pulls the latest version, while the elastic version is set to one that is no longer supported.
Hi @<1526734383564722176:profile|BoredBat47> , it seems that your Elasticsearch version is out of sync with what the latest version of the apiserver requires (7.17.18). Can you please follow the instructions here to make sure that you use the latest images for the ClearML Server?
None
clearml:1 as well as clearml:latest point to the latest version which is 1.16.2:
None
This version is not able to work with the version of Elasticsearch that you use. I suggest using the docker compose file from the latest version, which uses the updated versions of all the required infrastructure
Yeah, I mean a fresh installation using the old docker compose file. Just without backups (/clearml/data). So it seems to me the solution should be:
- Migrate to the latest version of elastic on old installation
- Make a backup
- Deploy latest ClearML installation with that backup
Thanks a lot. I see that the ClearML apiserver has been up for 7 months, could it be that it runs on a version that was recent 7 months ago?
Yes, it seems so. I thought we were talking about the fresh installation that you did.
482e96243041 allegroai/clearml:latest "python3 -m jobs.asy…" 18 months ago Up 7 weeks 8008/tcp, 8080-8081/tcp async_delete
26c677f2b70f allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 16 months 8008/tcp, 8080-8081/tcp, 0.0.0.0:8080->80/tcp, :::8080->80/tcp clearml-webserver
7e2cf4462f44 allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 7 months 0.0.0.0:8008->8008/tcp, :::8008->8008/tcp, 8080-8081/tcp clearml-apiserver
f9fb8d59ed31 mongo:4.4.9 "docker-entrypoint.s…" 18 months ago Up 7 weeks 27017/tcp clearml-mongo
0d806b28816f allegroai/clearml:1 "/opt/clearml/wrappe…" 18 months ago Up 7 months 8008/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp, :::8081->8081/tcp clearml-fileserver
8010ea7e981e elasticsearch:7.6.2 "/usr/local/bin/dock…" 18 months ago Up 16 months 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp, 9300/tcp clearml-elastic
4e267b110dfc redis:5.0 "docker-entrypoint.s…" 18 months ago Up 16 months 6379/tcp clearml-redis
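A minimal sketch (not from the original exchange) of how a listing like the one above could be checked against the Elasticsearch version mentioned in this thread (7.17.18). The function names and the parsing approach are hypothetical; the code only reads the image column of `docker ps` rows and compares numeric tags, and it assumes image names without a registry port.

```python
import re

# Minimum Elasticsearch version mentioned in this thread for the latest apiserver.
REQUIRED_ES = (7, 17, 18)


def image_tags(docker_ps_lines):
    """Map image name -> set of tags, reading the second column of each row."""
    tags = {}
    for line in docker_ps_lines:
        parts = line.split()
        # Skip header rows and images listed without an explicit tag.
        if len(parts) < 2 or ":" not in parts[1]:
            continue
        name, tag = parts[1].rsplit(":", 1)
        tags.setdefault(name, set()).add(tag)
    return tags


def parse_version(tag):
    """Turn '7.6.2' into (7, 6, 2); return None for tags such as 'latest'."""
    if not re.fullmatch(r"\d+(\.\d+)*", tag):
        return None
    return tuple(int(p) for p in tag.split("."))


def elastic_is_too_old(docker_ps_lines, required=REQUIRED_ES):
    """True if any elasticsearch image has a numeric tag below `required`."""
    for tag in image_tags(docker_ps_lines).get("elasticsearch", ()):
        version = parse_version(tag)
        if version is not None and version < required:
            return True
    return False
```

Note that floating tags such as `1` or `latest` say nothing about the version actually running; a container created long ago keeps the image it was started with, which is why the uptime column matters here.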
Can you please describe what working deployments you currently have and what your final goal is?
Do you have an old deployment working or was it corrupted?
Do you want to upgrade that old deployment to a new one? Or do you want to have a new deployment in some other place based on the data from the old deployment?
Can you please run 'sudo docker ps' and share the results? I want to see which version of the apiserver image is actually in use
@<1526734383564722176:profile|BoredBat47> So you upgraded your source deployment to the latest clearml server, then backed up the data and tried to restore it on the target deployment that runs the same (latest) clearml server, correct? What is the error that you see in the elastic logs on the target deployment?
The path would then be as follows:
- Upgrade the old deployment to the latest clearml server according to the clearml server upgrade procedure. This will automatically upgrade the data
- Backup your data folders (mongo and elastic)
- Deploy the latest clearml server on another machine and restore the data from the backup
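The backup and restore steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the official ClearML procedure: the helper names are hypothetical, and the subfolder names `mongo` and `elastic` are placeholders that should be replaced with the actual folder names under your data directory. The containers should be stopped on both machines while the archive is created or unpacked.

```python
import tarfile
from pathlib import Path


def backup_data_dirs(data_root, archive_path, subdirs=("mongo", "elastic")):
    """Pack the given subfolders of data_root into a gzip tarball.

    Assumption: the ClearML containers are stopped, so the database
    files do not change while the archive is being written.
    """
    data_root = Path(data_root)
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in subdirs:
            src = data_root / name
            if not src.is_dir():
                raise FileNotFoundError(f"missing data folder: {src}")
            # Store paths relative to data_root so the archive is relocatable.
            tar.add(src, arcname=name)
    return Path(archive_path)


def restore_data_dirs(archive_path, target_root):
    """Unpack a tarball produced by backup_data_dirs into target_root."""
    target_root = Path(target_root)
    target_root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_root)
    return target_root
```

After restoring, file ownership on the target machine may need to match what the containers expect; check the server documentation for the exact permissions before starting the services.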