Answered
Hey, So We Noticed The

Hey, so we noticed the /var/lib/docker/overlay2 directory on the clearml-server is growing a lot in size, we added more disk space but we want to put something in place to stop this growing too much.
These are the options I’ve looked into:

  1. docker system prune - removes all stopped containers, all networks not used by at least one container, all dangling images, and all dangling build cache. Problem: we don’t really know what this is pruning
  2. docker image prune --all - removes all images without at least one container associated to them
  3. Set the max-size in docker-compose.yaml for logging

Are the first 2 options safe to run without killing the server? I’m not comfortable removing files without knowing what they are.
Are there any plans to automate this in the future?
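
One way to see what the prune commands would touch is to list the candidates first. This is a minimal sketch, assuming the Docker CLI is on the PATH and the current user can talk to the daemon; the function name `docker_prune_preview` is made up for illustration and nothing in it deletes anything.

```shell
# Read-only preview of what the prune commands would consider removable.
# docker_prune_preview is a hypothetical helper name, not a Docker command.
docker_prune_preview() {
    # Space used by images, containers, volumes and build cache,
    # including a RECLAIMABLE column.
    docker system df
    # Exited containers are among those `docker system prune` removes.
    docker ps --all --filter status=exited
    # Dangling (untagged) images are removed by both prune commands.
    docker image ls --filter dangling=true
}
```

Calling `docker_prune_preview` before and after a prune gives a rough picture of what was actually reclaimed.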

  
  
Posted 4 years ago

Answers 43


it looks like clearml-apiserver and clearml-fileserver are continually restarting

  
  
Posted 4 years ago

back up and running again, thanks for your help

  
  
Posted 4 years ago

also, is there a list anywhere with the mount points that are needed?

  
  
Posted 4 years ago

thanks @AlertBlackbird30, this is really informative. Nothing seems to be particularly out of the ordinary though

3.7G	/var/lib/
3.7G	/var/lib/docker
3.0G	/var/lib/docker/overlay2

followed by a whole load of files that are a few hundred KBs in size, nothing huge though
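
A size listing like the one above can be produced with `du` sorted by size. This is a rough sketch; `largest_dirs` is a hypothetical helper name and the defaults (`/var/lib/docker`, 20 entries) are assumptions, not values from the thread.

```shell
# largest_dirs [DIR] [COUNT]: print the COUNT biggest directories under DIR,
# largest first, with sizes in kilobytes.
largest_dirs() {
    dir=${1:-/var/lib/docker}
    count=${2:-20}
    # -x stays on one filesystem; -k reports kilobytes so a plain
    # numeric sort is enough.
    du -xk "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

Reading `/var/lib/docker` normally needs root, so this would have to run from a root shell, for example `largest_dirs /var/lib/docker 10`.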

  
  
Posted 4 years ago

incidentally we turn off the server every evening as it's not used overnight, we've not faced issues with it starting up in the morning or noticed any data loss

  
  
Posted 4 years ago

no, they are still rebooting. I've looked in /opt/clearml/logs/apiserver.log and there are no errors

  
  
Posted 4 years ago

not entirely sure on this as we used the custom AMI solution, is there any documentation on it?

  
  
Posted 4 years ago

Morning, we got to 100% used, which is what triggered this investigation. When we initially looked at overlay2 it was using 8GB, so the current size seems acceptable.

  
  
Posted 4 years ago

🤔 i'll add the logging max-size now and monitor over the next week
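
For the monitoring part, a quick check of container log sizes might look like the sketch below. It assumes the default json-file logging driver, which writes logs under /var/lib/docker/containers (not under overlay2); `big_container_logs` and its 100 MB default threshold are made up for illustration.

```shell
# big_container_logs [DIR] [MIN_KB]: list json-file container logs larger
# than MIN_KB kilobytes, biggest first (sizes in kilobytes).
big_container_logs() {
    dir=${1:-/var/lib/docker/containers}
    min_kb=${2:-102400}
    # The trailing * in the pattern also catches rotated files such as *-json.log.1
    find "$dir" -type f -name '*-json.log*' -size +"${min_kb}"k \
        -exec du -k {} + 2>/dev/null | sort -rn
}
```

Run from a root shell, an empty result would suggest the logs are not what is filling the disk.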

  
  
Posted 4 years ago

Hmm. In the initial problem you mentioned that /var/lib/docker/overlay2 was growing large, but 4GB seems "fine" for docker images. Does your nvme0n1p1 ever report something like 85% or 90% used, or do you think that 4GB is a lot? When you restart the server, does the % used noticeably drop? That would suggest tmp files inside the docker image itself, which is possible with docker (weird, but possible).
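
To compare the % used before and after a restart, something like this sketch could be run at both points. `usage_pct` and `check_usage` are hypothetical helper names, and any limit passed in (such as 85) is only an example.

```shell
# usage_pct [MOUNT]: print the used percentage of the filesystem holding
# MOUNT as a bare number.
usage_pct() {
    # -P forces the POSIX one-line-per-filesystem format; column 5 is the
    # capacity, e.g. "40%".
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_usage MOUNT LIMIT: warn and return 1 when usage is at or above
# LIMIT percent.
check_usage() {
    pct=$(usage_pct "$1")
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full (limit $2%)"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}
```

For example, `check_usage / 85` could be called from a cron job so that the disk is flagged before it reaches 100%.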

  
  
Posted 4 years ago

not yet, going to try and fix it today.

if I do a df I see this, which is concerning:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  928K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/nvme0n1p1   20G  7.9G   13G  40% /
tmpfs           790M     0  790M   0% /run/user/1000

so it looks like the mount points are not created. When do these get created? I thought that with an AMI these would have already been set up?

  
  
Posted 4 years ago

so am I right in thinking it's just the mount points that are missing, based on the output of df above?

  
  
Posted 4 years ago

... from the AMI creation script:

# prepare directories to store data
sudo mkdir -p /opt/clearml/data/elastic_7
sudo mkdir -p /opt/clearml/data/redis
sudo mkdir -p /opt/clearml/data/mongo/db
sudo mkdir -p /opt/clearml/data/mongo/configdb
sudo mkdir -p /opt/clearml/logs
sudo mkdir -p /opt/clearml/config
sudo mkdir -p /opt/clearml/data/fileserver
sudo chown -R 1000:1000 /opt/clearml/data/elastic_7

So it seems the AMI is using the correct directories... Do you have these?
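
A quick way to answer that is to loop over the same paths and report any that are absent. This is a sketch only: `check_clearml_dirs` is a made-up name, and the directory list is copied from the script excerpt above. As far as that excerpt shows, these are ordinary directories rather than separate mounts, so they would not be expected to appear in `df` output.

```shell
# check_clearml_dirs [ROOT]: report which of the data directories from the
# AMI script exist under ROOT (default /opt/clearml).
# Returns 1 if any directory is missing.
check_clearml_dirs() {
    root=${1:-/opt/clearml}
    missing=0
    for sub in data/elastic_7 data/redis data/mongo/db data/mongo/configdb \
               logs config data/fileserver; do
        if [ -d "$root/$sub" ]; then
            echo "ok       $root/$sub"
        else
            echo "MISSING  $root/$sub"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_clearml_dirs` with no arguments checks /opt/clearml; a different root can be passed for testing.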

  
  
Posted 4 years ago