DeliciousStarfish67, are you running your ClearML server on the AWS instance?
Can you connect directly to the instance? If so, please check how large /opt/clearml is on the machine and then see the folder distribution
Especially /opt/clearml/data/fileserver, which is taking 102GB
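For anyone wanting to see the folder distribution mentioned above, here is a minimal Python sketch that sums file sizes per subfolder and lists the largest first. The helper names `dir_size_bytes` and `subfolder_sizes` are made up for illustration, and the numbers are apparent file sizes, so they may differ slightly from what a tool like `du` reports as disk usage.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; ignore it in the total.
                pass
    return total


def subfolder_sizes(parent: str) -> list[tuple[str, int]]:
    """Return (subfolder name, size in bytes) pairs, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Path taken from the thread; adjust if your server stores data elsewhere.
    root = "/opt/clearml/data/fileserver"
    if os.path.isdir(root):
        for name, size in subfolder_sizes(root):
            print(f"{size / 1024**3:8.2f} GiB  {name}")
```

Running it on the server (it may need sudo to read every file) prints one line per top-level folder, which makes it easy to spot which ones hold most of the data.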
I see. I'm guessing you have pretty extensive use in the form of artifacts/debug samples. You can lower the storage usage by deleting some experiments/models through the UI. That should free up some space 🙂
So you're saying it's expected, and if I can't delete this data the only option is to keep increasing the volume size?
You can always delete the data. Each folder in /opt/clearml/data/fileserver/ represents the stored outputs of an experiment. If you no longer need the files, you can delete them.
DeliciousStarfish67 the math is simple: if you want the experiments' outputs (in this case specifically the debug images, uploaded artifacts and models), they take up storage space (as png/jpg images and whatever files you uploaded as artifacts or models). If you only want the metrics for each experiment, they are stored in a different location and so will not be affected if you delete fileserver data.
Thank you guys.
SuccessfulKoala55 is there any way to configure the ClearML server to save debug images and artifacts to S3?
Certainly - you do that directly on the clients (SDK, agents)
Here:
https://clear.ml/docs/latest/docs/configs/clearml_conf#agent-section
What you're looking for is this: sdk.development.default_output_uri
Also configure your api.files_server in ~/clearml.conf to point to your S3 bucket as well 🙂
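Besides the clearml.conf settings mentioned above, the destination can also be set per task from code. The following is a rough sketch, assuming the `clearml` package is installed and S3 credentials are already configured on the client; the helper names, the bucket, the prefix, and the project/task names are all hypothetical. It relies on `Task.init(output_uri=...)` for artifacts and models and on `Logger.set_default_upload_destination` for debug samples.

```python
def s3_uri(bucket: str, prefix: str = "") -> str:
    """Build an s3:// URI from a bucket name and an optional key prefix."""
    bucket = bucket.strip("/")
    prefix = prefix.strip("/")
    return f"s3://{bucket}/{prefix}" if prefix else f"s3://{bucket}"


def start_task_with_s3_outputs(bucket: str, prefix: str = ""):
    """Start a ClearML task whose artifacts, models and debug samples go to S3."""
    # Imported lazily so s3_uri() can be used without clearml installed.
    from clearml import Task

    uri = s3_uri(bucket, prefix)
    task = Task.init(
        project_name="examples",       # hypothetical project name
        task_name="s3 output sketch",  # hypothetical task name
        output_uri=uri,                # per-task counterpart of sdk.development.default_output_uri
    )
    # Debug samples are uploaded through the logger, so point it at S3 as well.
    task.get_logger().set_default_upload_destination(uri)
    return task
```

For example, `start_task_with_s3_outputs("my-bucket", "clearml")` would target `s3://my-bucket/clearml`. Setting the values once in ~/clearml.conf, as suggested above, avoids repeating this in every script.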