Did We Hit The ClearML Self-Hosted Capacity? We Have 150GB Of Scalar Index In Elasticsearch And Everything Is Failing Hard. Console Logs, Scalars, Plots Are Not Loading, Everything In The ClearML Web App Lags, Freezes Or Fails To Load. The Server Has 30GB

Answered

Did we hit the ClearML self-hosted capacity? We have 150GB of scalar index in Elasticsearch and everything is failing hard. Console logs, scalars, and plots are not loading; everything in the ClearML web app lags, freezes, or fails to load.
The server has 30GB of total RAM, 12GB of which is dedicated to ClearML, 8 cores, and 250GB of free disk space.
We often get Elasticsearch failures with "all shards failed", where indexes fail.
MongoDB has loads of slow query logs (but I think this is a separate problem).

Do you think this could be some sort of upper limit? We will try our best to fix this issue with something called rollover (another dev knows how to do this).

  				
Posted one year ago by AmiableSeaturtle81


Answers 2

Hi @AmiableSeaturtle81, you'll need to share some ES and Mongo logs for that, but my guess is that the ES instance is struggling to handle the large indices you most likely have there. Mongo should not be an issue unless you have a huge number of tasks stored there (and even then it's always an issue of RAM).
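Before sharing logs, it may help to see which indices are actually large. The snippet below is only a rough sketch: it calls Elasticsearch's `_cat/indices` API and lists the biggest indices with their primary shard count and health. The URL `http://localhost:9200` is an assumption (use whatever address your ES container exposes), and `fetch_indices` / `largest_indices` are helper names made up for this example.

```python
import json
import urllib.request

# Assumption: Elasticsearch is reachable here; adjust for your deployment.
ES_URL = "http://localhost:9200"


def fetch_indices(es_url=ES_URL):
    """Fetch per-index stats from the _cat/indices API, with sizes in bytes."""
    url = f"{es_url}/_cat/indices?format=json&bytes=b"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def largest_indices(rows, top=5):
    """Return (index, size_gb, primary_shards, health) for the biggest indices."""
    sized = []
    for row in rows:
        size = row.get("store.size")
        if size is None:  # closed indices report no store size
            continue
        size_gb = round(int(size) / 1024**3, 2)
        sized.append((row["index"], size_gb, int(row["pri"]), row["health"]))
    # Largest first, so oversized indices show up at the top.
    sized.sort(key=lambda item: item[1], reverse=True)
    return sized[:top]


if __name__ == "__main__":
    for name, size_gb, shards, health in largest_indices(fetch_indices()):
        print(f"{name}: {size_gb} GB in {shards} primary shard(s), health={health}")
```

If one index holds most of the 150GB in a single primary shard, that would be consistent with the guess above, though the ES logs are still needed to confirm it.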

  				
Posted one year ago by SuccessfulKoala55

Is it possible to split the large Elasticsearch indexes? I know Elasticsearch has something called rollover, but I'm not sure that ClearML supports this.
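For reference, this is roughly what a rollover request looks like on the Elasticsearch side. It is a sketch only: the Rollover API (`POST /<alias>/_rollover` with a `conditions` body) is standard Elasticsearch, but the alias name `scalars-write` and the thresholds are hypothetical, and nothing here shows that ClearML writes through an alias or would pick up the new index.

```python
import json


def build_rollover_request(alias, max_size="30gb", max_docs=None):
    """Build (method, path, body) for an Elasticsearch rollover call.

    `alias` must be a write alias pointing at the index to roll over.
    The thresholds are illustrative, not ClearML recommendations.
    """
    conditions = {"max_size": max_size}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    body = json.dumps({"conditions": conditions})
    return "POST", f"/{alias}/_rollover", body


if __name__ == "__main__":
    # Hypothetical alias name, used only for illustration.
    method, path, body = build_rollover_request("scalars-write", max_docs=100_000_000)
    print(method, path)
    print(body)
```

The request only creates a new index when at least one condition is met, so it would not by itself break up an index that already holds 150GB.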

  				
Posted one year ago by AmiableSeaturtle81
