I followed the installation steps described on this page:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac/
Then I replaced ServerB's /opt/clearml/data with ServerA's /opt/clearml/data.
VictoriousPenguin97 basically, spin down serverA (this should flush all DBs), then copy /opt/clearml to the new server and spin it up with docker-compose. As long as the new server is on the same address as the previous one, everything should work out of the box.
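A minimal sketch of that copy step, in case it helps. The helper names pack_clearml and unpack_clearml and the archive path /tmp/clearml.tgz are hypothetical, not part of ClearML, and the sketch assumes GNU tar and the default /opt/clearml layout.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for moving a ClearML data tree between hosts.

# pack_clearml SRC_DIR ARCHIVE: archive SRC_DIR with paths relative to its parent.
pack_clearml() {
    local src_dir="$1" archive="$2"
    # -C switches to the parent directory so the archive holds "clearml/..."
    tar -C "$(dirname "$src_dir")" -czf "$archive" "$(basename "$src_dir")"
}

# unpack_clearml ARCHIVE DEST_PARENT: restore the tree under DEST_PARENT.
unpack_clearml() {
    local archive="$1" dest_parent="$2"
    mkdir -p "$dest_parent"
    # -p restores the stored permission bits; when run as root, GNU tar
    # also restores the original owner by default.
    tar -C "$dest_parent" -xzpf "$archive"
}

# Intended order (run manually, as root; paths are examples):
#   on serverA:  docker-compose -f docker-compose.yml down
#                pack_clearml /opt/clearml /tmp/clearml.tgz
#   copy /tmp/clearml.tgz to the new server, then on serverB:
#                unpack_clearml /tmp/clearml.tgz /opt
#                docker-compose -f docker-compose.yml up -d
```

Archiving and extracting as root matters here because a plain copy as a regular user would change file ownership, which the containers may then be unable to write to.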
I'm not entirely sure which steps you took or whether you missed something. Elastic is complaining about permissions. Maybe you missed one of the steps?
Oh, I just realized that the mongo versions on ServerA and ServerB were mismatched.
The problem was resolved by updating the mongo image on ServerB to 4.0.23, the same as ServerA.
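One way to catch this kind of image mismatch before starting the containers is to list the image tags from each server's compose file and compare them. This is a rough sketch: list_compose_images is a hypothetical helper, and it assumes each image is declared on its own unquoted "image:" line.

```shell
#!/usr/bin/env bash
# list_compose_images FILE: print the value of every "image:" key in FILE.
list_compose_images() {
    local compose_file="$1"
    # Strip leading whitespace and the "image:" key, print what remains.
    sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$compose_file"
}

# Example comparison (file names are placeholders):
#   diff <(list_compose_images serverA-compose.yml) \
#        <(list_compose_images serverB-compose.yml)
```

An empty diff would mean both servers reference the same image tags.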
After I ran
docker-compose -f docker-compose.yml down
docker-compose -f docker-compose.yml up -d
the elasticsearch container reported this error:
ElasticsearchException[failed to bind service]; nested: IOException[failed to test writes in data directory [/usr/share/elasticsearch/data/nodes/0/indices/mQ-x_DoZQ-iZ7OfIWGZ72g/_state] write permission is required]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes/0/indices/mQ-x_DoZQ-iZ7OfIWGZ72g/_state/.es_temp_file];
clearml-elastic | Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes/0/indices/mQ-x_DoZQ-iZ7OfIWGZ72g/_state/.es_temp_file
clearml-elastic |     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
clearml-elastic |     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
clearml-elastic |     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
clearml-elastic |     at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
clearml-elastic |     at java.base/java.nio.file.Files.newByteChannel(Files.java:380)
clearml-elastic |     at java.base/java.nio.file.Files.createFile(Files.java:658)
clearml-elastic |     at org.elasticsearch.env.NodeEnvironment.tryWriteTempFile(NodeEnvironment.java:1313)
clearml-elastic |     at org.elasticsearch.env.NodeEnvironment.assertCanWrite(NodeEnvironment.java:1284)
clearml-elastic |     at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:342)
clearml-elastic |     at org.elasticsearch.node.Node.<init>(Node.java:427)
clearml-elastic |     at org.elasticsearch.node.Node.<init>(Node.java:309)
clearml-elastic |     at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234)
clearml-elastic |     at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234)
clearml-elastic |     at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434)
clearml-elastic |     at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166)
clearml-elastic |     at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157)
clearml-elastic |     at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
clearml-elastic |     at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
clearml-elastic |     at org.elasticsearch.cli.Command.main(Command.java:77)
clearml-elastic |     at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122)
clearml-elastic |     at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80)
clearml-elastic | For complete error details, refer to the log at /usr/share/elasticsearch/logs/clearml.log
Hello, after following the steps you mentioned in https://clearml.slack.com/archives/CTK20V944/p1659702067809619?thread_ts=1659694970.919069&cid=CTK20V944
the server can now start properly, but the ClearML UI doesn’t show any of the experiments that I cloned from serverA. Any suggestions? Thank you!
And you have the exact same folder structure and content, yet servers A and B show different sets of experiments?
(is serverB empty, meaning no experiments at all?)
For example, ServerA stores files at /opt/clearml but ServerB stores them at /some_path/clearml
As long as you adjust your docker-compose YAML file, it should be just fine.
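A rough sketch of that adjustment, assuming the compose file mounts host paths under /opt/clearml/ in its volumes entries. rewrite_compose_root is a hypothetical helper, and it only handles roots without "|" or regex metacharacters.

```shell
#!/usr/bin/env bash
# rewrite_compose_root FILE OLD_ROOT NEW_ROOT: print FILE to stdout with
# every "OLD_ROOT/" prefix replaced by "NEW_ROOT/". FILE itself is not modified.
rewrite_compose_root() {
    local compose_file="$1" old_root="$2" new_root="$3"
    # "|" is used as the sed delimiter so slashes in paths need no escaping.
    sed "s|${old_root}/|${new_root}/|g" "$compose_file"
}

# Example (review the output before using it):
#   rewrite_compose_root docker-compose.yml /opt/clearml /some_path/clearml \
#       > docker-compose.serverB.yml
```

Writing to a new file rather than editing in place keeps the original compose file available for comparison.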
I mean migrating the data from serverA to serverB.
I just replaced ServerB's /opt/clearml/data with ServerA’s.
VictoriousPenguin97 I'm assuming both run the exact same server version?
I already ran
chmod 777 on /opt/clearml/data
Or are there other folders I need to grant permissions on?
Did I migrate the data correctly using the steps I took?
Is it OK if the paths on ServerA and ServerB are different?
For example, ServerA stores files at /opt/clearml but ServerB stores them at /some_path/clearml
Looks like a permissions issue: nested: IOException[failed to test writes in data directory [/usr/share/elasticsearch/data/nodes/0/indices/mQ-x_DoZQ-iZ7OfIWGZ72g/_state] write permission is required]; nested
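To narrow down a permissions problem like this, it can help to list which paths under the copied data directory are not owned by the user the container runs as. This is only a sketch: list_not_owned_by is a hypothetical helper, uid 1000 is an assumption (check it against the chown step on the installation page), and /opt/clearml/data/elastic_7 is assumed to be the host folder mounted at /usr/share/elasticsearch/data. Note that chmod without -R changes only the named directory, not nested ones such as the indices folder in the error.

```shell
#!/usr/bin/env bash
# list_not_owned_by DIR OWNER_UID: print every path under DIR (including DIR
# itself) whose owner is not OWNER_UID. Prints nothing if all paths match.
list_not_owned_by() {
    local dir="$1" owner_uid="$2"
    # find's -user test accepts a numeric user ID as well as a name.
    find "$dir" ! -user "$owner_uid"
}

# Example (path and uid are assumptions, see above):
#   list_not_owned_by /opt/clearml/data/elastic_7 1000
# If that prints paths, fixing ownership recursively is one option:
#   sudo chown -R 1000:1000 /opt/clearml
```

Empty output would suggest ownership is not the cause, in which case the directory mode bits are the next thing to check.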