Hmm, in the initial problem you mentioned that /var/lib/docker/overlay2 was growing large, but 4GB seems "fine" for docker images. Does your nvme0n1p1 ever report something like 85% or 90% used, or do you think that 4GB is a lot? When you restart the server, does the % used drop noticeably? That would suggest tmp files inside the docker image itself, which is possible with docker (weird, but possible).
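For anyone following along, here is a rough sketch of how those numbers could be compared before and after a restart. The `used_pct` helper is a hypothetical name introduced only for this example; the overlay2 path is the one mentioned in this thread, and the commands assume a Linux host with Docker installed.

```shell
# Percent used on the filesystem that holds a given path
# (POSIX-format df output, 5th column of the data row).
used_pct() { df -P "$1" | awk 'NR==2 { print $5 }'; }

# Root filesystem (the nvme0n1p1 partition in this thread)
used_pct /

# Size of the overlay2 directory itself (needs root to read it)
sudo du -sh /var/lib/docker/overlay2

# Docker's own breakdown: images, containers, local volumes, build cache
sudo docker system df
```

If the percentage drops after a restart while the image sizes stay the same, that would point at data written inside the containers rather than at the images.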
Morning, we got to 100% used, which is what triggered this investigation. When we initially looked at overlay2 it was using 8GB, so the current 4GB seems acceptable.
After making the change yesterday to the docker-compose file, the server is completely unusable - this is all I see for the /dashboard screen
I added this to each of the containers
logging:
  options:
    max-file: 5
    max-size: 10m
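A possible way to sanity-check an edit like this before relying on it, sketched under some assumptions: the compose file location below is a guess and should be adjusted to the actual install, and newer Docker installs may use `docker compose` instead of the `docker-compose` binary.

```shell
compose_file=/opt/clearml/docker-compose.yml   # assumed location, adjust as needed

# Parse and validate the file without printing it; YAML or schema errors show up here
sudo docker-compose -f "$compose_file" config -q && echo "compose file OK"

# Logging options only apply to newly created containers, so recreate them
sudo docker-compose -f "$compose_file" up -d

# Check which log driver and options a container actually ended up with
sudo docker inspect --format '{{.HostConfig.LogConfig.Type}} {{.HostConfig.LogConfig.Config}}' clearml-apiserver
```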
It looks like not all the containers are up. Try sudo docker ps and see if the apiserver container is restarting.
It looks like clearml-apiserver and clearml-fileserver are continually restarting.
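A restart loop like this can also be confirmed directly. This is a minimal sketch, using the two container names from this thread:

```shell
containers="clearml-apiserver clearml-fileserver"   # names from this thread

# Show only the containers that are currently in a restart loop
sudo docker ps --filter "status=restarting" --format '{{.Names}}\t{{.Status}}'

# Current state and number of restarts so far for each container
for name in $containers; do
  sudo docker inspect --format '{{.Name}}: {{.State.Status}}, restarts={{.RestartCount}}' "$name"
done
```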
yeah, that's usually the case when you get an empty dashboard
btw - if you remove the docker-compose changes, do the containers start normally?
No, they are still restarting. I've looked in /opt/clearml/logs/apiserver.log and there are no errors.
Check sudo docker logs <container-name>
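For example, something along these lines, with the container name taken from this thread. `docker logs` shows what the container's process wrote to stdout and stderr, which may include startup failures that never reach the log file under /opt/clearml/logs. The search pattern is only a guess at what an error might look like.

```shell
container=clearml-apiserver   # name from this thread

# Last 100 lines of the container's output, with timestamps
sudo docker logs --tail 100 --timestamps "$container"

# Fold stderr into stdout so the output can be searched for likely error lines
sudo docker logs --tail 500 "$container" 2>&1 | grep -iE 'error|exception|traceback'
```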
I think I found the issue: a typo in apiserver.conf
back up and running again, thanks for your help