Question: optimizing API usage for production monitoring
Hi, I'm currently using ClearML to monitor a production environment, reporting metrics during each inference.
Specifically, I report accuracy for around 5 features per inference. My API usage has been very high, and I'm looking for ways to make it more efficient.
To reduce usage, I've already stopped reporting logs and machine performance metrics, but the API usage is still high. Here's a summary of my current setup:
- Metrics Reporting: I report accuracy for 5 features after each inference.
- Last Iteration Check: I make API requests to get the last iteration number.
- Persistent Connection: The connection to ClearML stays open the whole time. I'm not sure whether it's better to keep it open or to close and reopen it during the process.
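To illustrate what I mean by reducing the number of calls, here is a rough sketch of client-side batching in plain Python. `MetricBuffer` and its `flush` are made-up placeholders for whatever batched reporting the backend supports, not ClearML's actual API:

```python
class MetricBuffer:
    """Accumulate per-feature accuracy values locally and flush them
    as one batch, instead of issuing one API call per metric point."""

    def __init__(self, flush_every=50):
        self.flush_every = flush_every  # flush after this many buffered points
        self.buffer = []                # (title, series, value, iteration) tuples
        self.flushed_batches = []       # stand-in for the real "send" side

    def report_scalar(self, title, series, value, iteration):
        self.buffer.append((title, series, value, iteration))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        if self.buffer:
            # In a real setup this would be a single batched API request;
            # here we just record the batch locally for illustration.
            self.flushed_batches.append(list(self.buffer))
            self.buffer.clear()


buf = MetricBuffer(flush_every=10)
for it in range(20):                                  # 20 inferences
    for feature in ("f1", "f2", "f3", "f4", "f5"):    # 5 features each
        buf.report_scalar("accuracy", feature, 0.9, it)
buf.flush()  # send whatever is left over
# 20 iterations x 5 features = 100 points -> 10 batches of 10
```

From what I understand, the ClearML SDK already buffers reports internally and sends them periodically rather than per call, so something like the above may effectively describe what happens under the hood; I'd still like to know whether the flush interval or batch size is configurable so I can tune it down.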
My questions are:
- What are the best practices for optimizing API usage in this scenario?
- Can I report multiple scalars/metrics in a single API call?
- Is it more efficient to keep the connection open, or to close and reopen it periodically?

Any advice or suggestions on how to further optimize my setup would be greatly appreciated. Thanks!