Reputation
Badges 1
10 × Eureka!Hi @<1542316991337992192:profile|AverageMoth57> ,
were you able to resolve this issue, as I am also facing the same for all type of models/frameworks except Keras, even after saving the model to disk?
I understood this, but still I have few doubts. Like what would be the exact query given an endpoint, for requests per sec.
Also, for the example you gave, I got the query up and running for it. Let's say I want a query to get the feature value (x and y in your example) distribution over some duration of time, then what should be the query, I tried endpoint:x_bucket{"+inf"}[$duration]/endpoint:x_sum{"+inf"}[$duration]
and some other variations, but couldn't get it right. Can you help?
the one where I asked about the query for feature value distribution over time that can be executed to be shown in prometheus and grafana with the metrics that are currently getting scraped by prometheus from clearml-statistics
Well, I read this, but it is same as what I had done before.
The query here gives percentage of input data in each bucket over a period of time.
But my previous ques and other query are still not figured out.
yeah, I ran the example given in the docs as well as the one given in their asteroid blog repo.
Agreed with your answer. I mistook the given example query in the tutorial as something else rather than the feature distribution over time.
My next question is that what can be the other relevant queries that we can visualize (in grafana), which will help in monitoring the served model and the end-user. So, I wanted the queries for that, like can we have a query for K-L divergence from the available metrics (that prometheus scraped from clearml-serving-statistics), and if yes, then what is t...
so, this allows us to define buckets for the histogram distribution, as given in the example docs for monitoring, but apart from that what exactly can we add? eg. I want to view feature value distribution over an interval, and baseline distribution of training and test set, how can I do with the cli tool, or do I need to make changes in the original serving code?
Ok, I'll look into this.
Thanks.
Ya grafana or Prometheus (promql query)
But it's not clear that what all other queries and metrics can/should be considered for serving tasks