Answered

Hello everyone! I launched an EC2 instance that's free tier (

Hello everyone! I launched a free-tier EC2 instance (Ubuntu Server 20.04 LTS (HVM), SSD Volume Type). I installed docker and docker-compose, and then I wanted to set up my own self-hosted ClearML server, so I followed the instructions at https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac

However, when I go to my IP address:8080, I get this error:

` Server Unavailable

The ClearML server is currently unavailable.
Please try to reload this page in a little while.
If the problem persists, verify your network connection is working and check the ClearML server logs for possible errors `
How should I proceed? Thank you!

edit:

I looked into this a bit more and ... it looks like I need to set up some credentials according to https://clear.ml/docs/latest/docs/clearml_agent/#setting-server-credentials . However, the https://clear.ml/docs/latest/docs/clearml_agent/deploying_clearml/clearml_server_security#user-access-security page seems to be down. Where is the updated page?

And when I check the logs of allegroai/clearml-agent-services:latest, it says

clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server ?

and

ERROR: Invalid requirement: 'clearml-agent">=0.17.0"'

  				
Posted 4 years ago by CluelessElephant89


Answers 20

I believe I ran that vm command already

  				
Posted 4 years ago by CluelessElephant89

In any case I don't think that would be a reasonable server setup

  				
Posted 4 years ago by SuccessfulKoala55

I would first try `curl http://localhost:8008` from the server console (i.e. over SSH)
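If curl isn't installed on the box, roughly the same check can be scripted. This is only a minimal sketch: it tests whether anything accepts TCP connections on the port, not whether the API server answers correctly. The helper name `port_is_open` is made up for illustration; 8008 is the port from the curl command above and 8080 is the web UI port from the question.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    for name, port in (("API server", 8008), ("web UI", 8080)):
        state = "reachable" if port_is_open("localhost", port) else "NOT reachable"
        print(f"{name} on port {port}: {state}")
```

If the web UI port is reachable but the API server port is not, that would be consistent with the "Server Unavailable" page in the question.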

  				
Posted 4 years ago by SuccessfulKoala55

CluelessElephant89, I think the RAM requirement for Elasticsearch might be 2 GB. You can try the following hack, which might make it work.

On the machine it's running on there should be a docker-compose.yml file (I'm guessing in the home directory).

For the following line, https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml#L41 , you can try changing it to ES_JAVA_OPTS: -Xms1g -Xmx1g , which might limit the Elasticsearch memory to 1 GB. Please note, however, that this might not work.

After the change, please bring the containers down and back up again with the docker-compose command 🙂
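Before restarting, it can help to sanity-check the heap value against the RAM the machine actually has. The sketch below is illustrative only: `parse_heap_opts` and `heap_fits` are hypothetical helpers, not part of ClearML or Elasticsearch, and the check is necessary rather than sufficient, because the JVM needs memory beyond the heap and the other containers need RAM too.

```python
import re

# JVM size suffixes: k, m, g (case-insensitive); no suffix means bytes.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_opts(opts: str) -> dict:
    """Extract -Xms / -Xmx values from a JVM options string, in bytes."""
    sizes = {}
    # The lookahead rejects malformed suffixes such as "500mb".
    pattern = r"-(Xms|Xmx)(\d+)([kKmMgG]?)(?=\s|$)"
    for flag, number, unit in re.findall(pattern, opts):
        sizes[flag] = int(number) * _UNITS[unit.lower()]
    return sizes


def heap_fits(opts: str, total_ram_bytes: int) -> bool:
    """True if the maximum heap (-Xmx) is strictly smaller than total RAM."""
    sizes = parse_heap_opts(opts)
    return "Xmx" in sizes and sizes["Xmx"] < total_ram_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3  # example value: a machine with 1 GiB of RAM
    print(heap_fits("-Xms1g -Xmx1g", one_gib))      # heap equals all RAM
    print(heap_fits("-Xms1g -Xmx1g", 2 * one_gib))  # leaves 1 GiB for the rest
```

On a machine with exactly 1 GiB of RAM, a 1g heap leaves nothing for the operating system or the other services, which is why the hack above may still fail.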

  				
Posted 4 years ago by CostlyOstrich36

oh wait, I don't see the 99-clearml.conf yet... let me try that before I kill this instance

  				
Posted 4 years ago by CluelessElephant89

CluelessElephant89, try checking the Elasticsearch logs (the clearml-elastic container)

  				
Posted 4 years ago by CostlyOstrich36

Hi CostlyOstrich36, thanks for responding. Running that command, I get:

[2021-10-19 20:53:52,726] [9] [INFO] [clearml.app_sequence] ################ API Server initializing #####################
[2021-10-19 20:53:52,727] [9] [INFO] [clearml.database] Initializing database connections
[2021-10-19 20:53:52,727] [9] [INFO] [clearml.database] Using override mongodb host mongo
[2021-10-19 20:53:52,728] [9] [INFO] [clearml.database] Using override mongodb port 27017
[2021-10-19 20:53:52,729] [9] [INFO] [clearml.database] Registering connection to auth-db ( )
[2021-10-19 20:53:52,731] [9] [INFO] [clearml.database] Registering connection to backend-db ( )
[2021-10-19 20:53:52,736] [9] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 1 of 4. Waiting for 30sec
[2021-10-19 20:54:22,762] [9] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 2 of 4. Waiting for 30sec
[2021-10-19 20:54:52,771] [9] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 3 of 4. Waiting for 30sec
[2021-10-19 20:55:22,782] [9] [ERROR] [clearml.app_sequence] Error connecting to Elasticsearch: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fb66477f978>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fb66477f978>: Failed to establish a new connection: [Errno -2] Name or service not known)
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/clearml/apiserver/server.py", line 10, in <module>
    AppSequence(app).start(request_handlers=RequestHandlers())
  File "/opt/clearml/apiserver/server_init/app_sequence.py", line 40, in start
    self._init_dbs()
  File "/opt/clearml/apiserver/server_init/app_sequence.py", line 97, in _init_dbs
    "Error starting server: failed connecting to ElasticSearch service"
Exception: Error starting server: failed connecting to ElasticSearch service
Loading config from /opt/clearml/apiserver/config/default
Loading config from file /opt/clearml/apiserver/config/default/apiserver.conf
Loading config from file /opt/clearml/apiserver/config/default/hosts.conf
Loading config from file /opt/clearml/apiserver/config/default/logging.conf
Loading config from file /opt/clearml/apiserver/config/default/secure.conf
Loading config from file /opt/clearml/apiserver/config/default/services/projects.conf
Loading config from file /opt/clearml/apiserver/config/default/services/organization.conf
Loading config from file /opt/clearml/apiserver/config/default/services/tasks.conf
Loading config from file /opt/clearml/apiserver/config/default/services/events.conf
Loading config from file /opt/clearml/apiserver/config/default/services/auth.conf
Loading config from /opt/clearml/config
[2021-10-19 20:55:25,367] [9] [INFO] [clearml.es_factory] Using override elastic host elasticsearch
[2021-10-19 20:55:25,368] [9] [INFO] [clearml.es_factory] Using override elastic port 9200
[2021-10-19 20:55:25,636] [9] [INFO] [clearml.redis_manager] Using override redis host redis
[2021-10-19 20:55:25,637] [9] [INFO] [clearml.redis_manager] Using override redis port 6379
[2021-10-19 20:55:25,740] [9] [INFO] [clearml.schema_reader] loading schema from cache
[2021-10-19 20:55:25,832] [9] [INFO] [clearml.app_sequence] ################ API Server initializing #####################
[2021-10-19 20:55:25,833] [9] [INFO] [clearml.database] Initializing database connections
[2021-10-19 20:55:25,833] [9] [INFO] [clearml.database] Using override mongodb host mongo
[2021-10-19 20:55:25,834] [9] [INFO] [clearml.database] Using override mongodb port 27017
[2021-10-19 20:55:25,835] [9] [INFO] [clearml.database] Registering connection to auth-db ( )
[2021-10-19 20:55:25,837] [9] [INFO] [clearml.database] Registering connection to backend-db ( )
[2021-10-19 20:55:25,845] [9] [WARNING] [clearml.initialize] Could not connect to ElasticSearch Service. Retry 1 of 4. Waiting for 30sec
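A log like the one above is easier to read if you pull out only the WARNING and ERROR entries. The following is a rough sketch that assumes the entry layout seen in this output (`[timestamp] [pid] [LEVEL] [logger] message`); the `problems` function is a hypothetical helper, not a ClearML tool, and it also works when the entries have been pasted as one long line.

```python
import re

# Matches entries such as:
# [2021-10-19 20:53:52,736] [9] [WARNING] [clearml.initialize] message...
# The message runs until the next timestamped entry or the end of the text.
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"\[\d+\] \[(?P<level>[A-Z]+)\] \[(?P<logger>[^\]]+)\] "
    r"(?P<msg>.*?)(?=\s*\[\d{4}-\d{2}-\d{2} |\Z)",
    re.DOTALL,
)


def problems(log_text: str, levels=("WARNING", "ERROR")):
    """Return (timestamp, level, message) for entries at the given levels."""
    return [
        (m["ts"], m["level"], " ".join(m["msg"].split()))
        for m in ENTRY.finditer(log_text)
        if m["level"] in levels
    ]


if __name__ == "__main__":
    import sys

    # Usage (assumed): sudo docker logs clearml-apiserver 2>&1 | python3 this_script.py
    for ts, level, msg in problems(sys.stdin.read()):
        print(ts, level, msg[:120])
```

In this thread, the filtered output would show the repeated "Could not connect to ElasticSearch Service" warnings followed by the connection error, which points at Elasticsearch rather than at the API server itself.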

  				
Posted 4 years ago by CluelessElephant89

But you can try reducing the ES heap with ES_JAVA_OPTS: -Xms500m -Xmx500m ? (The JVM size suffix is m, not mb.)

  				
Posted 4 years ago by SuccessfulKoala55

Looks like the ElasticSearch service is down?

  				
Posted 4 years ago by CluelessElephant89

CostlyOstrich36 uh oh... I think I need more memory...

There is insufficient memory for the Java Runtime Environment to continue.

Native memory allocation (mmap) failed to map 2060255232 bytes for committing reserved memory.

An error report file with more information is saved as:

logs/hs_err_pid59.log

error:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000085330000, 2060255232, 0) failed; error='Not enough space' (errno=12)
at org.elasticsearch.tools.launchers.JvmErgonomics.flagsFinal(JvmErgonomics.java:123)
at org.elasticsearch.tools.launchers.JvmErgonomics.finalJvmOptions(JvmErgonomics.java:88)
at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:59)
at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:95)
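To put the number in that error in perspective, here is the arithmetic as a small sketch. The byte count is taken from the message above; the 1 GiB of total RAM is an assumption used for illustration (it matches the smallest free-tier instances), so substitute your own figure.

```python
GIB = 1024 ** 3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


requested = 2_060_255_232  # bytes the JVM tried to commit, from the error above
instance_ram = 1 * GIB     # assumption: a machine with 1 GiB of RAM in total

print(f"requested by the JVM: {to_gib(requested):.2f} GiB")   # about 1.92 GiB
print(f"total RAM (assumed):  {to_gib(instance_ram):.2f} GiB")
print(f"shortfall:            {to_gib(requested - instance_ram):.2f} GiB")
```

So the single allocation the JVM attempted is already close to twice the assumed RAM, before counting the operating system and the other containers.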

  				
Posted 4 years ago by CluelessElephant89

O geez, you're going to laugh, but I'm using an EC2 free tier instance and it only gives me 1 GiB of memory

Well, CostlyOstrich36 is also right 🙂 - I'm not sure the server will be able to handle running with only 1 GB

  				
Posted 4 years ago by SuccessfulKoala55

CostlyOstrich36 SuccessfulKoala55 super late update, but it turns out I needed to beef up the machine. Thanks for all the help!

  				
Posted 4 years ago by CluelessElephant89

👍

  				
Posted 4 years ago by SuccessfulKoala55

CostlyOstrich36 O geez, you're going to laugh, but I'm using an EC2 free tier instance and it only gives me 1 GiB of memory

  				
Posted 4 years ago by CluelessElephant89

CluelessElephant89, did you run the vm.max_map_count command for Elasticsearch? Also, how much RAM do you have on the machine you're running on?
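Both questions can be answered from /proc on a Linux host. This is a minimal sketch, not an official check: the function names are made up for illustration, and 262144 is the minimum vm.max_map_count that the Elasticsearch documentation asks for.

```python
from pathlib import Path

# Minimum that the Elasticsearch documentation asks for on Linux hosts.
REQUIRED_MAX_MAP_COUNT = 262144


def max_map_count_ok(proc_file: str = "/proc/sys/vm/max_map_count") -> bool:
    """Return True if the kernel setting meets the Elasticsearch minimum."""
    current = int(Path(proc_file).read_text().strip())
    return current >= REQUIRED_MAX_MAP_COUNT


def total_ram_gib(meminfo: str = "/proc/meminfo") -> float:
    """Read MemTotal (reported in kB) from /proc/meminfo and convert to GiB."""
    for line in Path(meminfo).read_text().splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)
    raise ValueError("MemTotal not found")


if __name__ == "__main__":
    print("vm.max_map_count OK:", max_map_count_ok())
    print(f"total RAM: {total_ram_gib():.2f} GiB")
```

The file paths are parameters only so the functions can be tried against sample files; on the server itself the defaults should be what you want.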

  				
Posted 4 years ago by CostlyOstrich36

CluelessElephant89 actually having some errors on startup with ES is perfectly normal - it takes some time for ES to boot

  				
Posted 4 years ago by SuccessfulKoala55

Okay, thanks CostlyOstrich36 and SuccessfulKoala55. I'll beef up my server first and then run this again.

  				
Posted 4 years ago by CluelessElephant89

CluelessElephant89, I'd wager you might have missed one of the steps in the installation, probably a permissions issue, I hope 🙂

  				
Posted 4 years ago by CostlyOstrich36

CluelessElephant89, hi!

It looks like there is a problem with the API server. Can you please check the docker logs, see what errors it prints, and paste them here 🙂

  				
Posted 4 years ago by CostlyOstrich36

CluelessElephant89, the relevant command should be something like `sudo docker logs clearml-apiserver`

  				
Posted 4 years ago by CostlyOstrich36
