
I feel like to do this, I need to create a recording rule from the metric avg(...) at the Prometheus level and then query increase(). However, this approach requires me to interact directly with Prometheus.
Thank you very much for your help. I will test it.
I will work on it and provide you with feedback. Do you have a list of monitoring metrics provided by clearml-serving?
@<1523701205467926528:profile|AgitatedDove14> , Thank you very much, I will follow your recommendation.
This p is not in the original code.
Now, when I add delta to calculate the variation of this, I get an error: bad_data: 1:110: parse error: ranges only allowed for vector selectors
Thanks to the exception stack I examined, I understood that I had a model registry issue. I had used joblib to save the model file on my system, and I believed that the model registration in ClearML storage was automatic. So when I made the API call, the model path returned NoneType. Once I fixed that, I was able to serve my model and make API calls giving prediction results. Also, thanks to your help, I understood that I needed custom serving, and I was able to modify the preprocess.py file ...
I ran the test, but there was no result. I need to calculate the per-minute variation of this average: avg(100*increase(test12_model_custom:Glucose_bucket[1m])/increase(test12_model_custom:Glucose_sum[1m])). I tried using delta, but encountered an error.
Thank you, @<1523701070390366208:profile|CostlyOstrich36>
To check for data drift, I need to calculate the average of the last query per time bucket, and then calculate the per-minute variation of that new metric.
How the endpoint is registered: clearml-serving --id 6c9c2c38e70b41e0a63547e3c16db234 model add --engine sklearn --endpoint "best_diabetes_detection" --preprocess "/home/caleb/diabetes_clearml/preprocess.py" --model-id e7532b8017ad4a0f92d5b537401f0585
Here it is: curl -X POST " None " -H "accept: application/json" -H "Content-Type: application/json" -d '{"Pregnancies": 6, "Glucose": 148, "BloodPressure": 72, "SkinThickness": 35, "Insulin": 0, "BMI": 33.6, "DiabetesPedigreeFunction": 0.627,"Age": 50}'
{"detail":"Error processing request: node array from the pickle has an incompatible dtype:\n- expected: [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'...
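A note on the error above: a "node array from the pickle has an incompatible dtype" message is typically raised when a tree-based scikit-learn estimator is unpickled under a different scikit-learn version than the one that saved it. Whether that is the cause here is an assumption, since the thread does not confirm it. A minimal sketch of a version check, with hypothetical version strings and a hypothetical helper name:

```python
def same_major_minor(trained_with: str, serving_with: str) -> bool:
    """Return True when two version strings share the same major.minor prefix."""
    return trained_with.split(".")[:2] == serving_with.split(".")[:2]


# Hypothetical values: in practice, record sklearn.__version__ at training time
# and compare it with sklearn.__version__ inside the serving container.
trained_version = "1.2.2"
serving_version = "1.3.0"

if not same_major_minor(trained_version, serving_version):
    print(
        f"scikit-learn mismatch: trained with {trained_version}, "
        f"serving with {serving_version}"
    )
```

If the versions differ, pinning the serving environment to the training version (or re-training and re-saving the model under the serving version) is the usual way out.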
Do you have any advice for this step (monitoring)? I feel like it's not very well documented.
Alternatively, can I directly define my alert on avg(...)?
@<1523701205467926528:profile|AgitatedDove14> , thank you very much for your help, I was able to fix most of my bugs thanks to your recommendations
I had already followed this tutorial, but the configuration of alerts was not covered.
from typing import Any
import numpy as np

# Notice: the Preprocess class must be named "Preprocess"
class Preprocess(object):
    def __init__(self):
        # set internal state, this will be called only once. (i.e. not per request)
        pass

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # build a single feature row from the named fields of the request body
        return [[body.get("Pregnancies", None), body.get("Glucose", None), body.get("BloodPressure...
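Since the snippet above is cut off, here is a self-contained sketch of the same idea. The feature order is an assumption taken from the keys of the curl payload in this thread, and reporting Glucose through collect_custom_statistics_fn is likewise an assumption about how the custom metric was produced; it is not the original file.

```python
from typing import Any, Callable, Optional

# Assumed feature order, copied from the keys of the curl payload in this thread.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


class Preprocess(object):
    def __init__(self):
        # Called once when the endpoint is loaded, not once per request.
        pass

    def preprocess(
        self,
        body: dict,
        state: dict,
        collect_custom_statistics_fn: Optional[Callable[[dict], None]] = None,
    ) -> Any:
        # Build one feature row; missing keys become None.
        row = [body.get(name) for name in FEATURES]
        # Optionally report a value so it can be tracked as a custom metric.
        if collect_custom_statistics_fn is not None:
            collect_custom_statistics_fn({"Glucose": body.get("Glucose")})
        return [row]
```

Passing any callable (for example a list's append method) as collect_custom_statistics_fn makes it easy to check locally which values would be reported.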
Is it in the serve instance task console that I should check the exception stack?
I have other similar endpoints for testing, that's why; otherwise, there is no error at this level. Even with the two endpoints, I get the same error. One clarification: I built my ML model with a scikit-learn pipeline and Optuna. When I build a simpler model without Optuna and without the scikit-learn preprocessing pipeline, for example with just LogisticRegression().fit(X, y), I do not encounter any error when serving with clearml-serving; the request via its endpoint gives...
I tested this: None , and I did not encounter any errors. I will test the custom example and give you feedback. Thank you very much for your response.
I used this PromQL query: 100 * increase(test12_model_custom:Glucose_bucket[1m]) / increase(test12_model_custom:Glucose_sum[1m]) to visualize the distribution of the variable (in my case called Glucose). So according to your explanation, I should calculate a new metric: sum(abs(test12_model_custom:Glucose_bucket - histogram_avg(test12_model_custom:Glucose_bucket[1m]))). I set up the alert rule on this metric by defining a threshold to trigger the alert. Did I understand correctly?
Or the new metric should be: sum(abs((100 * increase(test12_model_custom:Glucose_bucket[1m]) / increase(test12_model_custom:Glucose_sum[1m])) - histogram_avg((100 * increase(test12_model_custom:Glucose_bucket[1m]) / increase(test12_model_custom:Glucose_sum[1m]))[1m])))?
When I calculated the average, I got this result. Now, with this new metric, I need to calculate the variation per minute. I tried increase, rate, delta, but no result, just an error: bad_data: 1:110: parse error: ranges only allowed for vector selectors: delta(avg(100*increase(test12_model_custom:Glucose_bucket[1m])/increase(test12_model_custom:Glucose_sum[1m]))[1m])
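The parse error comes from applying a range selector like [1m] to the result of avg(...), which is an instant vector and not a plain metric selector. PromQL's subquery syntax, (expr)[range:resolution], turns such a result into a range vector that delta() accepts. A minimal sketch that builds the query string; the helper name and the 5m window with 1m resolution are assumptions chosen so the window holds more than one sample, which delta() needs:

```python
def as_subquery(expr: str, window: str, resolution: str = "") -> str:
    """Wrap an instant-vector expression as a PromQL subquery: (expr)[window:resolution]."""
    return f"({expr})[{window}:{resolution}]"


AVG_EXPR = (
    "avg(100 * increase(test12_model_custom:Glucose_bucket[1m])"
    " / increase(test12_model_custom:Glucose_sum[1m]))"
)

# delta() requires a range vector; the subquery provides one from the avg(...) result.
variation_query = f"delta({as_subquery(AVG_EXPR, '5m', '1m')})"
print(variation_query)
```

The printed string can be pasted into the Prometheus or Grafana query editor. Leaving the resolution empty ([5m:]) falls back to the default evaluation interval.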