Hi everyone!
I’m building a notification/cleanup service on top of the data saved by ClearML, using https://github.com/allegroai/clearml/blob/master/examples/services/cleanup/cleanup_service.py as inspiration. Is there any way to programmatically get the size (storage consumption) of a Task's metrics, artifacts, and models, either together or separately? I want to notify about (and then remove) only the really large Tasks, and leave alone those that don’t take up much space.
Posted 9 months ago
Thank you for the fast reply. Could you suggest any open-source examples, even for a somewhat complicated pipeline like this?

Posted 9 months ago

Well, you'll need to be familiar with the structure of the ES documents and use the _size field provided by the mapper-size plugin (https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size-usage.html), which would probably require you to reindex the data. That's for ES.
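As a rough sketch of the ES side, a query body like the one below could sum _size per task once the mapper-size plugin is installed and the data has been reindexed with _size enabled. The `task` field name and the `events-*` index pattern are assumptions about how the ClearML server lays out its event documents, so check them against your own deployment first. `build_size_by_task_query` is a hypothetical helper name, not part of ClearML.

```python
import json
from typing import Any, Dict


def build_size_by_task_query(top_n: int = 20) -> Dict[str, Any]:
    """Build a search body that sums the _size field per task.

    Assumes each document carries a keyword field named "task"
    (an assumption about the ClearML event mapping) and that the
    mapper-size plugin has populated _size on the documents.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "by_task": {
                "terms": {
                    "field": "task",
                    "size": top_n,
                    # largest total first
                    "order": {"total_bytes": "desc"},
                },
                "aggs": {"total_bytes": {"sum": {"field": "_size"}}},
            }
        },
    }


def render_query(top_n: int = 20) -> str:
    """Return the body as JSON, ready to send to <index>/_search."""
    return json.dumps(build_size_by_task_query(top_n), indent=2)
```

The JSON from `render_query()` can be posted to the `_search` endpoint of the relevant indices (for example an `events-*` pattern, if that matches your setup). The totals reflect the size of the original document source in bytes, not the on-disk index size.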
For MongoDB, you'll need to use the $bsonSize operator (https://www.mongodb.com/docs/upcoming/reference/operator/aggregation/bsonSize/) on the task documents.
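For the MongoDB part, here is a minimal sketch of an aggregation that applies $bsonSize to whole task documents and keeps only the large ones. It needs MongoDB 4.4 or newer for $bsonSize. The `backend` database and `task` collection names, the connection URI, and the helper names are assumptions for illustration, so adjust them to your server.

```python
from typing import Any, Dict, List


def build_task_size_pipeline(min_bytes: int, limit: int = 20) -> List[Dict[str, Any]]:
    """Aggregation pipeline listing the largest task documents by BSON size."""
    return [
        # $$ROOT is the whole document, so this measures the full task record
        {"$project": {"name": 1, "size_bytes": {"$bsonSize": "$$ROOT"}}},
        {"$match": {"size_bytes": {"$gte": min_bytes}}},
        {"$sort": {"size_bytes": -1}},
        {"$limit": limit},
    ]


def print_large_tasks(uri: str = "mongodb://localhost:27017",
                      min_bytes: int = 1_000_000) -> None:
    """Run the pipeline; database and collection names are assumed."""
    from pymongo import MongoClient  # imported lazily, only needed here

    client = MongoClient(uri)
    tasks = client["backend"]["task"]  # assumed ClearML names
    for doc in tasks.aggregate(build_task_size_pipeline(min_bytes)):
        print(doc["_id"], doc.get("name"), doc["size_bytes"])
```

Note that this only measures the task document stored in MongoDB. Artifact and model files kept on a file server or object storage are not counted here.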

Posted 9 months ago

Hi GreasyRaven35, this actually takes some more infrastructure and can only be done using ES plugins and specific MongoDB queries.

Posted 9 months ago