Disclaimer: not exactly a ClearML question. I just wasn't getting a response when I asked in the other channel, so here it goes.
This isn't tool-specific; it's more of a general MLOps question.
Say you have a classification model that you plan to do continuous training (CT) on, and the data comes in as a stream. Maybe you pull data on a time basis, or you pull data for training once you have n samples in the batch. How would you evaluate the model, e.g. on accuracy, to keep things simple? Do you split each batch into train and test? That would mean you're not using all the available data to train the model (see the rough sketch of that option below).
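To make that first option concrete, here's a minimal sketch of splitting each incoming batch, assuming a scikit-learn-style classifier and synthetic placeholder data purely for illustration:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical incoming batch of n labelled samples (placeholder data).
rng = np.random.default_rng(42)
X_batch = rng.normal(size=(500, 20))
y_batch = (X_batch[:, 0] > 0).astype(int)

# Hold out part of the batch purely for evaluation,
# so the test portion never reaches the model during training.
X_train, X_test, y_train, y_test = train_test_split(
    X_batch, y_batch, test_size=0.2, stratify=y_batch
)

model = SGDClassifier()
model.fit(X_train, y_train)

print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The obvious downside is that 20% of every batch is spent on evaluation instead of training.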
What I've just thought of as a demo is this: you train the model and then deploy it, say to staging. Then, when a new batch of data comes in, you first evaluate the models already deployed to staging and production on that batch, and only afterwards train the staging model on it.
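Here's a rough sketch of that evaluate-then-train loop for the staging model, again assuming a scikit-learn-style classifier with partial_fit and a hypothetical get_next_batch() generator standing in for the real stream:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

def get_next_batch(n_batches=5, batch_size=200):
    # Hypothetical stand-in for the real data stream: yields (X, y) per batch.
    rng = np.random.default_rng(0)
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, 20))
        y = (X[:, 0] > 0).astype(int)
        yield X, y

classes = np.array([0, 1])
staging_model = SGDClassifier()

stream = get_next_batch()

# Bootstrap: fit on the very first batch, since there is nothing to evaluate yet.
X0, y0 = next(stream)
staging_model.partial_fit(X0, y0, classes=classes)

for X, y in stream:
    # 1) Evaluate the currently deployed model on the fresh batch
    #    *before* it has seen any of it ("test-then-train").
    batch_acc = accuracy_score(y, staging_model.predict(X))
    print(f"accuracy on incoming batch: {batch_acc:.3f}")

    # 2) Only then update the model on the full batch,
    #    so no labelled data is lost to a held-out split.
    staging_model.partial_fit(X, y, classes=classes)
```

This way every batch serves as an unseen test set once, and then as training data, so nothing is wasted.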
How would you set up an evaluation mechanism in your MLOps pipeline? I'm curious.