Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Answered

Hi, I just updated clearml-server to 1.1.0 and got the following error when starting it with docker-compose:

clearml-apiserver | [2021-08-02 13:37:09,852] [8] [INFO] [clearml.es_factory] Using override elastic host ip-10-101-1-1.eu-central-1.compute.internal
clearml-apiserver | [2021-08-02 13:37:09,852] [8] [INFO] [clearml.es_factory] Using override elastic port 9200
clearml-apiserver | [2021-08-02 13:37:10,076] [8] [INFO] [clearml.redis_manager] Using override redis host redis
clearml-apiserver | [2021-08-02 13:37:10,076] [8] [INFO] [clearml.redis_manager] Using override redis port 6379
clearml-apiserver | [2021-08-02 13:37:10,139] [8] [INFO] [clearml.schema_reader] loading schema from cache
clearml-apiserver | [2021-08-02 13:37:10,218] [8] [INFO] [clearml.app_sequence] ################ API Server initializing #####################
clearml-apiserver | [2021-08-02 13:37:10,218] [8] [INFO] [clearml.database] Initializing database connections
clearml-apiserver | [2021-08-02 13:37:10,218] [8] [INFO] [clearml.database] Using override mongodb host mongo
clearml-apiserver | [2021-08-02 13:37:10,218] [8] [INFO] [clearml.database] Using override mongodb port 27017
clearml-apiserver | [2021-08-02 13:37:10,220] [8] [INFO] [clearml.database] Registering connection to auth-db ( )
clearml-apiserver | [2021-08-02 13:37:10,221] [8] [INFO] [clearml.database] Registering connection to backend-db ( )
clearml-apiserver | [2021-08-02 13:37:10,229] [8] [WARNING] [elasticsearch] GET [status:404 request:0.007s]
clearml-apiserver | [2021-08-02 13:37:10,229] [8] [ERROR] [clearml.initialize] NotFoundError(404, '{}')
clearml-apiserver | Traceback (most recent call last):
clearml-apiserver |   File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
clearml-apiserver |     "__main__", mod_spec)
clearml-apiserver |   File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
clearml-apiserver |     exec(code, run_globals)
clearml-apiserver |   File "/opt/clearml/apiserver/server.py", line 10, in <module>
clearml-apiserver |     AppSequence(app).start(request_handlers=RequestHandlers())
clearml-apiserver |   File "/opt/clearml/apiserver/server_init/app_sequence.py", line 40, in start
clearml-apiserver |     self._init_dbs()
clearml-apiserver |   File "/opt/clearml/apiserver/server_init/app_sequence.py", line 90, in _init_dbs
clearml-apiserver |     and get_last_server_version() < Version("0.16.0")
clearml-apiserver | TypeError: '<' not supported between instances of 'Version' and 'Version'
clearml-apiserver | Loading config from /opt/clearml/apiserver/config/default
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/apiserver.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/secure.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/hosts.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/logging.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/services/auth.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/services/tasks.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/services/projects.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/services/events.conf
clearml-apiserver | Loading config from file /opt/clearml/apiserver/config/default/services/organization.conf
clearml-apiserver | Loading config from /opt/clearml/config
clearml-apiserver | Loading config from file /opt/clearml/config/apiserver.conf

  				
Posted 3 years ago by JitteryCoyote63


Answers 30

Should I try to disable dynamic mapping before doing the reindex operation?

  				
Posted 3 years ago by JitteryCoyote63

Not of the ES cluster, I only created a backup of the clearml-server instance disk, I didn’t think there could be a problem with ES…

  				
Posted 3 years ago by JitteryCoyote63

My docker-compose for the master node of the ES cluster is the following:
```
version: "3.6"
services:

  elasticsearch:
    container_name: clearml-elastic
    environment:
      ES_JAVA_OPTS: -Xms2g -Xmx2g
      bootstrap.memory_lock: "true"
      cluster.name: clearml-es
      cluster.initial_master_nodes: clearml-es-n1, clearml-es-n2, clearml-es-n3
      cluster.routing.allocation.node_initial_primaries_recoveries: "500"
      cluster.routing.allocation.disk.watermark.low: 500mb
      cluster.routing.allocation.disk.watermark.high: 500mb
      cluster.routing.allocation.disk.watermark.flood_stage: 500mb
      node.name: clearml-es-n1
      network.host: 10.105.1.2
      discovery.seed_hosts: 10.105.1.2, 10.105.1.3, 10.105.1.4
      http.compression_level: "7"
      reindex.remote.whitelist: '.'
      xpack.monitoring.enabled: "false"
      xpack.security.enabled: "false"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    image: docker.elastic.co/elasticsearch/elasticsearch:7.6.2
    restart: unless-stopped
    volumes:
      - /usr/share/elasticsearch
    network_mode: host
```

  				
Posted 3 years ago by JitteryCoyote63

I'm not sure... what OS are you using? We're usually using volumes specifically mounted to EBS volumes

  				
Posted 3 years ago by SuccessfulKoala55

I’ve reindexed the data for the logs. The mappings are now correct, but I am missing one month of data, and I have literally no idea where this data is or how it disappeared.

  				
Posted 3 years ago by JitteryCoyote63

Is it safe to turn off replication while a reindex operation is happening? The reindexing is rather slow and I am wondering if turning off replication will speed up the process.

  				
Posted 3 years ago by JitteryCoyote63

I’ve set dynamic: “strict” in the template of the logs index and I was able to keep the same mapping after doing the reindex

  				
Posted 3 years ago by JitteryCoyote63

Ok, I have a very different problem now: I did the following to restart the ES cluster:
docker-compose down
docker-compose up -d
And now the cluster is empty. I think docker simply created a new volume instead of reusing the previous one, which was always the case so far.

  				
Posted 3 years ago by JitteryCoyote63

Did you back up the data before your changes?

  				
Posted 3 years ago by SuccessfulKoala55

Thanks!

  				
Posted 3 years ago by JitteryCoyote63

I see that I have several volumes:
$ docker volume ls
DRIVER    VOLUME NAME
local     5b0bfe5ab1a3d645bd635b2fb6f2aefd2b657d566019343c8305959903996c67
local     43b60287d60db798dc9d1defe1d7d861334c9c8299aefad6da2f20db278cfc5b
local     1406d50aa65ab55d323500d1fb23f19adfc6e721261ab6103a59d20e82146099
local     7367a215bd42a4e888e5d88ce708bf74aedc48a6e9417c72a19739cb80f25e6d
local     7413c39f5e4b6568304832d9d2e925ebdbf47ad31ad22d77830d3618af79237b
local     a55cb71edff48c2138a5da9d8d1e26df3b8454f9c3b1ac3b64692a4db5102600
local     b6cd9aaeddb2113dda61585318dd031998cc5b8c4a4609470df62faf269713a3
local     fd4bc62663c8560a7e59d6acb5bdda9e88e4659ac504af7477c144f105c284d4

  				
Posted 3 years ago by JitteryCoyote63

Yes, it is safe to set number_of_replicas to 0 and refresh_interval to -1 on the target index before the reindex, and then restore them after the reindex is finished.

  				
Posted 3 years ago by AppetizingMouse58
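For readers following along, the settings change described in this answer could be sketched roughly as below. This is only an illustration, not the ClearML team's procedure: the `ES_URL` value, the helper names, and the restore values (1 replica, `1s` refresh) are assumptions, so use whatever your index had before. The only Elasticsearch call involved is the index settings endpoint, `PUT /<index>/_settings`.

```python
import json
import urllib.request

# Assumption: Elasticsearch is reachable here; adjust to your cluster.
ES_URL = "http://localhost:9200"


def bulk_load_settings():
    """Settings to apply to the target index before the reindex."""
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def restored_settings(replicas=1, refresh_interval="1s"):
    """Settings to put back afterwards (values here are assumed defaults)."""
    return {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }


def put_settings(index, body, es_url=ES_URL):
    """Send PUT /<index>/_settings and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{es_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would be `put_settings(target_index, bulk_load_settings())` before starting the reindex and `put_settings(target_index, restored_settings())` once it has finished, where `target_index` is the name of your new index.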

Should I try to roll back to clearml-server 1.0.2? I am very anxious now…

  				
Posted 3 years ago by JitteryCoyote63

This one is indeed dynamic but can be set as follows: "plot_len":{"type":"long"}

  				
Posted 3 years ago by AppetizingMouse58
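As a rough illustration of how the `plot_len` definition fits into a strict mapping, the sketch below builds a mapping body that could be sent when creating the target index. The helper name and the `task` field used in the usage note are placeholders, not the actual contents of events_plot.json; only `"plot_len": {"type": "long"}` and `dynamic: strict` come from this thread.

```python
import json


def strict_mapping_with_plot_len(existing_properties):
    """Return a mapping body with dynamic=strict and plot_len declared.

    existing_properties: the field definitions already present in the
    template (a dict of field name -> definition). It is not modified.
    """
    properties = dict(existing_properties)
    # Field definition quoted in the answer above.
    properties["plot_len"] = {"type": "long"}
    return {"dynamic": "strict", "properties": properties}


if __name__ == "__main__":
    # "task" is a hypothetical field, used only to show the shape.
    body = strict_mapping_with_plot_len({"task": {"type": "keyword"}})
    print(json.dumps(body, indent=2))
```

With `plot_len` declared up front, a strict mapping no longer has a reason to reject documents that contain that field.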

the reindexing operation showed no error and copied everything

  				
Posted 3 years ago by JitteryCoyote63

What ES version are you using? Maybe that's related?

  				
Posted 3 years ago by SuccessfulKoala55

amazon linux

  				
Posted 3 years ago by JitteryCoyote63

AppetizingMouse58 the events_plot.json template is missing the plot_len declaration, could you please give me the definition of this field? (Reindexing with dynamic: strict fails with: "mapping set to strict, dynamic introduction of [plot_len] within [_doc] is not allowed")

  				
Posted 3 years ago by JitteryCoyote63

Sorry, I have no real experience with docker-managed volumes 😞

  				
Posted 3 years ago by SuccessfulKoala55

Hi JitteryCoyote63, are you still missing a month of data in the event logs? If you run _cat/indices, do you see the same number of docs in the original and the new indices?

  				
Posted 3 years ago by AppetizingMouse58
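The document-count comparison suggested here can be sketched as follows, assuming the output of `GET /_cat/indices?format=json` has already been fetched and decoded into a list of rows. The index names in the usage note are made up for illustration; the `index` and `docs.count` keys are the column names Elasticsearch uses in that output.

```python
def doc_counts(cat_indices_rows):
    """Map index name -> document count from _cat/indices JSON rows.

    docs.count is returned as a string and may be null for closed indices,
    in which case it is treated as 0 here.
    """
    return {
        row["index"]: int(row.get("docs.count") or 0)
        for row in cat_indices_rows
    }


def same_doc_count(cat_indices_rows, old_index, new_index):
    """True if both indices exist and hold the same number of documents."""
    counts = doc_counts(cat_indices_rows)
    if old_index not in counts or new_index not in counts:
        return False
    return counts[old_index] == counts[new_index]
```

For example, with rows for two hypothetical indices `events-log-old` and `events-log-new`, `same_doc_count(rows, "events-log-old", "events-log-new")` tells you whether the reindex copied every document. A matching count does not prove the source index was complete to begin with.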

I made sure before deleting the old index that the number of docs matched

  				
Posted 3 years ago by JitteryCoyote63

AppetizingMouse58 Yes and yes

  				
Posted 3 years ago by JitteryCoyote63

I did change the replica setting on the same index yes, I reverted it back from 1 to 0 afterwards

  				
Posted 3 years ago by JitteryCoyote63

We can compare with the table that you sent yesterday. Unless a lot of new events were written since then

  				
Posted 3 years ago by AppetizingMouse58

AppetizingMouse58 btw I had to delete the old logs index before creating the alias, otherwise ES won’t let me create an alias with the same name as an existing index

  				
Posted 3 years ago by JitteryCoyote63
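On the alias point: Elasticsearch's `_aliases` endpoint accepts a `remove_index` action, so the old index can be dropped and the alias created in a single atomic request instead of two separate steps. A minimal sketch of the request body is below; the function name and the index names in the usage note are hypothetical. As with a manual delete, the old index is gone afterwards, so verify the new index first.

```python
import json


def swap_index_for_alias(old_index, new_index):
    """Body for POST /_aliases: alias old_index's name to new_index and
    delete the old index in the same atomic operation."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": old_index}},
            {"remove_index": {"index": old_index}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(json.dumps(swap_index_for_alias("events-log", "events-log-v2"), indent=2))
```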

But you did change the replica setting - did you reindex?

  				
Posted 3 years ago by SuccessfulKoala55

SuccessfulKoala55 I am using ES 7.6.2

  				
Posted 3 years ago by JitteryCoyote63

I reindexed only the logs to a new index afterwards, I am now doing the same with the metrics since they cannot be displayed in the UI because of their wrong dynamic mappings

  				
Posted 3 years ago by JitteryCoyote63

So it can be that when restarting the docker-compose, it used another volume, hence the loss of data

  				
Posted 3 years ago by JitteryCoyote63

Now I am trying to restart the cluster with docker-compose and specifying the last volume, how can I do that?

  				
Posted 3 years ago by JitteryCoyote63
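The thread does not record an answer to this question. One way to narrow down which of the listed volumes to look at is to sort the output of `docker volume inspect <name> [<name>...]` by creation time, as sketched below. The function names and the sample volume names in the usage note are assumptions. Creation time alone does not tell you which volume holds the data, so check the contents of each `Mountpoint` before reattaching anything.

```python
import json
from datetime import datetime


def parse_created_at(value):
    """Parse Docker's CreatedAt timestamp (ISO 8601, possibly ending in Z)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def volumes_by_age(inspect_output):
    """Return (name, created_at, mountpoint) tuples, oldest first.

    inspect_output: the JSON text printed by `docker volume inspect`.
    """
    volumes = json.loads(inspect_output)
    rows = [
        (v["Name"], parse_created_at(v["CreatedAt"]), v["Mountpoint"])
        for v in volumes
    ]
    return sorted(rows, key=lambda row: row[1])
```

For example, feeding it the inspect output for two volumes named `vol-a` and `vol-b` (made-up names) returns them ordered by creation time, with the host directory of each in the third position.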

1K Views
