So the issue seems to be:{"type": "server", "timestamp": "2022-04-06T02:39:16,999Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "clearml", "node.name": "clearml", "message": "uncaught exception in thread [main]", "stacktrace": ["org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];", "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.16.2.jar:7.16.2]", "at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.16.2.jar:7.16.2]", "Caused by: org.elasticsearch.ElasticsearchException: failed to bind service", "at org.elasticsearch.node.Node.<init>(Node.java:1090) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.16.2.jar:7.16.2]", "... 
6 more", "Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes", "at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]", "at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]", "at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]", "at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:398) ~[?:?]", "at java.nio.file.Files.createDirectory(Files.java:700) ~[?:?]", "at java.nio.file.Files.createAndCheckIsDirectory(Files.java:807) ~[?:?]", "at java.nio.file.Files.createDirectories(Files.java:793) ~[?:?]", "at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:300) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:224) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:298) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.node.Node.<init>(Node.java:427) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.16.2.jar:7.16.2]", "at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-7.16.2.jar:7.16.2]", "... 6 more"] }
This might indicate that the ClearML data directory does not have the right permissions or owner. Did you run sudo chown -R 1000:1000 /opt/clearml ?
Also, can you run ls -la /opt/clearml/data and share the results?
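For anyone who prefers a scripted check over reading the ls -la output, here is a minimal sketch that lists every path under a directory that is not owned by the expected user and group. The helper name find_wrong_owner is hypothetical and not part of ClearML; the 1000:1000 default is taken from the chown command above.

```python
import os


def find_wrong_owner(root, uid=1000, gid=1000):
    """Return all paths under root (root included) not owned by uid:gid."""
    # Collect the root itself plus every directory and file below it.
    candidates = [str(root)]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    mismatched = []
    for path in candidates:
        # lstat inspects the entry itself rather than a symlink target.
        st = os.lstat(path)
        if (st.st_uid, st.st_gid) != (uid, gid):
            mismatched.append(path)
    return mismatched
```

Calling find_wrong_owner("/opt/clearml") on the server host should return an empty list once the chown has been applied; any path it returns is a candidate for the AccessDeniedException in the log. Directories that the current user cannot read are skipped by os.walk, so running it with sufficient privileges gives the most complete picture.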
Hi AbruptDeer98, can you please attach the logs you get from sudo docker logs clearml-elastic ? I assume there's an ES disk issue.
Hi SuccessfulKoala55 ,
Sorry for the late reply. When I came back from my vacation, I found I could not reach the webserver on port 8080. So I ran docker-compose -f docker-compose.yml down to stop the containers and then docker-compose -f docker-compose.yml up to restart ClearML. However, when I visit the webserver as usual, it pops up a window, as shown in the attached figure. I also found that the clearml-elastic container keeps restarting, and the terminal prints the errors in the attached .txt file.
It seems that there are some permission problems with clearml_elastic.
When I type <server_host_ip>:8008 in a new browser tab, the connection is refused.
How can I solve this problem? Thank you.
Thanks for your reply, SuccessfulKoala55! Here are the logs.
Hi AbruptDeer98,
This looks like the WebApp in your browser fails to access the server. What happens when you type http://<server-address>:8008 in a new browser tab?
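The same reachability check can also be done from a script. This is a minimal sketch that only tests whether a TCP connection to the port can be opened; it says nothing about whether the service behind it is healthy. The helper name is_port_open is hypothetical, and the host passed to it has to be replaced with your own server address.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, is_port_open("<server-address>", 8008) returning False would match the "refused to connect" symptom in the browser.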
Thanks for your help, SuccessfulKoala55! I re-ran sudo chown -R 1000:1000 /opt/clearml and restarted the Docker containers, and everything works now!
Now I have a new issue. When I create new credentials in my profile, the automatically generated files_server setting still points to port 8081, even though I changed the port mapping in docker-compose.yml from 8081 to 8084, as I mentioned before. Is there any way to make it use the mapped port?
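Until the server itself advertises the remapped port, one stopgap is to correct the generated files_server value by hand before pasting it into the client configuration. The sketch below only rewrites the port in a URL string; it does not change what the server generates, it drops any user:password part of the URL, and both the helper name with_port and the example host are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def with_port(url, new_port):
    """Return url with its port replaced by new_port (userinfo is dropped)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals must be bracketed inside a network location.
        host = f"[{host}]"
    return urlunsplit(parts._replace(netloc=f"{host}:{new_port}"))


# Example with a placeholder host: swap the generated 8081 for the mapped 8084.
print(with_port("http://files.example.local:8081", 8084))
```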