Problem with ClearML Serving Instance Cleanup
Hi team, I’m running into an issue with ClearML Serving using the Helm chart to deploy our ML models via `clearml-serving model add`. The deployment itself works fine, and the models are served as expected.
However, I’ve observed the following problems when managing model removal:
- Model cleanup issue: When I remove a model using `clearml-serving model remove`, the model data is not removed from ClearML Serving. Specifically, the temporary directory (`/tmp`) where the inference container copies the model from `/root/.clearml/cache` is not cleaned up. This accumulation continues until the disk space on the node is completely exhausted.
Potential impact:
- Nodes run out of space due to uncleaned `/tmp` directories.
Steps taken:
- Models added via `clearml-serving model add` and removed with `clearml-serving model remove`.
Looking for:
- Solutions or workarounds to automatically clean up the `/tmp` folder after model removal.
Has anyone faced similar issues or found effective solutions to these challenges? Any tips or best practices would be greatly appreciated!
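In case it helps frame the discussion, one stopgap I'm considering is a scheduled cleanup job inside the inference container that prunes stale model copies by age. Below is a minimal sketch, assuming the copies live directly under `/tmp` and that anything untouched for 24 hours is safe to delete; `prune_stale_dirs` is a hypothetical helper, not part of ClearML:

```python
# Hypothetical cleanup helper -- a workaround sketch, not a ClearML API.
import shutil
import time
from pathlib import Path

def prune_stale_dirs(root: Path, max_age_hours: float = 24.0) -> list[Path]:
    """Delete subdirectories of `root` not modified within `max_age_hours`."""
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for entry in root.iterdir():
        # Only touch directories, and skip anything modified recently.
        if entry.is_dir() and entry.stat().st_mtime < cutoff:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry)
    return removed

# Assumed usage: run periodically (e.g. from a sidecar or cron job)
# against the directory where the container copies model artifacts.
# prune_stale_dirs(Path("/tmp"), max_age_hours=24)
```

The obvious risk is deleting a directory a still-loaded model depends on, which is why the age threshold is conservative; a proper fix tied to `clearml-serving model remove` would clearly be preferable.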