Hi @<1590514584836378624:profile|AmiableSeaturtle81> , you'll need to share some ES and Mongo logs for that, but my guess is that the ES instance is struggling to handle the large indices you most likely have there. Mongo should not be an issue, unless you have a huge number of tasks stored there (and even then it's always an issue of RAM).
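Alongside the logs, it usually helps to see which indices are actually the large ones. Here is a rough sketch of how that could be pulled from Elasticsearch's `_cat/indices` endpoint using only the Python standard library. The URL `http://localhost:9200` and the helper names are assumptions for illustration, not anything ClearML ships; adjust them to your deployment.

```python
import json
from urllib.request import urlopen

# Assumption: Elasticsearch is reachable here; change to match your setup.
ES_URL = "http://localhost:9200"


def fetch_cat_indices(es_url=ES_URL):
    """Fetch one JSON row per index, with sizes reported in plain bytes."""
    with urlopen(f"{es_url}/_cat/indices?format=json&bytes=b") as resp:
        return json.load(resp)


def largest_indices(cat_rows, top=5):
    """Return (index name, size in GB) pairs for the biggest indices.

    `store.size` can be null (for example on closed indices), so it is
    treated as 0 in that case.
    """
    sized = [
        (row["index"], int(row["store.size"] or 0) / 1024**3)
        for row in cat_rows
    ]
    return sorted(sized, key=lambda pair: pair[1], reverse=True)[:top]
```

Calling `largest_indices(fetch_cat_indices())` on the server and posting the result together with the output of `GET _cluster/health` would make it much easier to tell whether a few oversized indices are the problem.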
Did we hit the ClearML self-hosted capacity? We have 150GB of scalar index in Elasticsearch and everything is failing hard. Console logs, scalars, and plots are not loading; everything in the ClearML web app lags, freezes, or fails to load.
The server has 30GB of total RAM, 12GB of which is dedicated to ClearML, 8 cores, and 250GB of free disk space.
We often get Elasticsearch failures with "all shards failed", where indices fail.
MongoDB has loads of slow query logs (but I think this is a totally separate problem).
Do you think this could be some sort of upper limit? We will try our best to fix this issue with something called rollover (another dev knows how to do this).
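For reference, the rollover mentioned above maps to Elasticsearch's `POST /<alias>/_rollover` API, which only works against a write alias or data stream. Below is a minimal sketch of building such a request with the Python standard library. The alias name `scalars-write` in the usage note is hypothetical, and whether the ClearML server will read and write through such an alias is an assumption that needs checking before trying this on production data.

```python
import json
from urllib.request import Request


def build_rollover_request(es_url, write_alias, max_size="50gb", max_age=None):
    """Build (but do not send) a rollover request for a write alias.

    Elasticsearch creates a new backing index only when at least one of
    the given conditions is met at the time of the call.
    """
    conditions = {"max_size": max_size}
    if max_age is not None:
        conditions["max_age"] = max_age
    body = json.dumps({"conditions": conditions}).encode("utf-8")
    return Request(
        f"{es_url}/{write_alias}/_rollover",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, `build_rollover_request("http://localhost:9200", "scalars-write", max_size="30gb")` produces a request that could be sent with `urllib.request.urlopen`. Keeping the build step separate from sending makes it easy to inspect the URL and body first.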