not entirely sure on this as we used the custom AMI solution, is there any documentation on it?
I think that if these directories are not mounted, you should first of all take care not to shut down the server. You'll probably want to exec /bin/bash into the mongo and elastic containers, and copy their data outside to the host storage.
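A rough sketch of the copy step, in case it helps. It uses docker cp from the host rather than copying from a shell inside the container. The helper name and the DRY_RUN switch are made up for illustration, and the container names and in-container paths in the usage note are assumptions, so check them against your own docker-compose.yml first.

```shell
# Hypothetical helper: copy a directory out of a container onto the host.
# With DRY_RUN set, it only prints the docker command it would run.
backup_container_dir() {
  container="$1"   # container name, e.g. as shown by `docker ps`
  src="$2"         # path inside the container
  dest="$3"        # destination directory on the host
  mkdir -p "$dest" || return 1
  if [ -n "$DRY_RUN" ]; then
    echo "docker cp $container:$src $dest"
  else
    docker cp "$container:$src" "$dest"
  fi
}
```

For example, assuming the containers are called clearml-mongo and clearml-elastic and keep their data in /data/db and /usr/share/elasticsearch/data: `DRY_RUN=1 backup_container_dir clearml-mongo /data/db /opt/backup/mongo` prints the command without running it. A copy taken while the database is still writing may not be fully consistent, so treat it as a safety net rather than a proper backup.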
After making the change yesterday to the docker-compose file, the server is completely unusable - this is all I see for the /dashboard screen
🤔 i'll add the logging max-size now and monitor over the next week
Basically whatever was under the old /opt/trains/ folder is required, you can see the list here: None
you will probably want to find the culprit, so a find should work wonders. I'd suspect elasticsearch first. It tends to go nuts 😕
we turn off the server every evening...
In that case the issue is definitely not related to the mount points
so am I right in thinking it's just the mount points that are missing? based on the output of df above
Oh, that's strange. I'll run one of those soon to see if there's anything wrong with them
no, they are still rebooting. i've looked in /opt/clearml/logs/apiserver.log, no errors
it looks like clearml-apiserver and clearml-fileserver are continually restarting
also, is there a list anywhere with the mount points that are needed?
think I found the issue, a typo in apiserver.conf
hey @<1687643893996195840:profile|RoundCat60> .. did you ever get the problem sorted ?
incidentally we turn off the server every evening as it's not used overnight, we've not faced issues with it starting up in the morning or noticed any data loss
Howdy and Morning @<1687643893996195840:profile|RoundCat60> .. docker when using overlay2 doesn't have its mount points show up in a 'df' btw, they will only appear in a 'df -a', mostly because they are simply 'overlays', so they don't (technically) consume any space (I mean, the files are still in /var/lib/docker, but they aren't counted the way df counts space)
this is why I was suggesting a find, maybe with a 'du' .. actually.. let me try that here.. 2s
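Something along these lines is what I have in mind, as a sketch only. The helper name and its defaults are invented for illustration; it just wraps du, sort and head.

```shell
# Hypothetical helper: list the largest directories under a docker data root.
biggest_dirs() {
  root="${1:-/var/lib/docker}"   # directory tree to inspect
  count="${2:-10}"               # how many entries to show
  # -x keeps du on one filesystem, -k prints sizes in KB so sort -rn can order them
  du -xk "$root" 2>/dev/null | sort -rn | head -n "$count"
}
```

Run it from a root shell (for example after `sudo -i`), since /var/lib/docker is normally not readable by regular users: `biggest_dirs /var/lib/docker 15`. The -x flag should also keep du from descending into overlay mounts of running containers, which would otherwise count the same files more than once.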
thanks @<1523715084633772032:profile|AlertBlackbird30> this is really informative. Nothing seems to be particularly out of the ordinary though
3.7G /var/lib/
3.7G /var/lib/docker
3.0G /var/lib/docker/overlay2
followed by a whole load of files that are a few hundred KBs in size, nothing huge though
@<1687643893996195840:profile|RoundCat60> you set it once, inside the docker-compose itself.. it will affect all docker containers but, to be honest, docker tends to log everything
strange, I used one of the publicly available AMIs for ClearML (we did not upgrade from the Trains AMI as we started fresh)
btw - if you remove the docker-compose changes, do the containers start normally?
Not necessarily, is there any data in those directories?
Hi @<1687643893996195840:profile|RoundCat60> ,
We've actually never had to address this issue. Can you find out what exactly is growing in size? I'd like to make sure this is not due to the containers storing data internally (causing docker to store more and more snapshots) - this is an unhealthy situation that might also indicate that volumes are not mounted correctly (i.e. data that should be stored externally is actually stored internally)
Hey there waves
Not sure about plans to automate this in the future, as this is more how docker behaves and not really clearml, especially with the overlay2 filesystem. The biggest offender usually is your json logfiles. have a look in /var/lib/docker/containers/ for *.log
assuming this IS the case, you can tell docker to only log up to a max-size .. I have mine set to 100m or some such
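To check whether the json logs really are the offender, a small sketch like this lists the biggest *.log files under the containers directory. The function name and defaults are hypothetical; only find, du, sort and head are used.

```shell
# Hypothetical helper: show the largest container log files, sizes in KB.
largest_container_logs() {
  root="${1:-/var/lib/docker/containers}"   # where docker keeps per-container files
  count="${2:-5}"                           # how many files to list
  find "$root" -type f -name '*.log' -exec du -k {} + 2>/dev/null \
    | sort -rn | head -n "$count"
}
```

As root, `largest_container_logs` with no arguments looks in /var/lib/docker/containers. The directory name in each result is the container ID, which you can match against the output of `docker ps --no-trunc`.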
In the publicly available AMI these are created. However, if you used a previously released Trains AMI and upgraded to ClearML, part of the upgrade process was to create those directories (required by the new docker-compose.yml), as explained here: None
Can you perhaps attach your docker-compose.yml file's contents?
I believe you can set it on a 'per container' way as well.
container_name:
  logging:
    options:
      max-size: 10m
Check sudo docker logs <container-name>