SuccessfulKoala55
I have manually kept the number of data points under 800K, because I found the budget would be 0 if len(series_sizes) is 1: https://github.com/allegroai/trains/blob/master/trains/utilities/plotly_reporter.py#L101
SuccessfulKoala55
Even with Logger.report_scatter2d() the result is still very large, and I found where the digits change: https://github.com/allegroai/trains/blob/master/trains/utilities/plotly_reporter.py#L122
tolist will change the digits, but I haven't figured out why.
Logger.report_scatter2d()
tries to find the optimal number of digits. Note that it makes sure the number of points does not exceed 800K, but that limit is not on the actual stored data.
How many points do you have?
You can try using the latest version from GitHub, use: pip install -e git+https://github.com/allegroai/trains.git@master#egg=trains
And actually, the problem here is that rounding doesn't take effect when it is done before tolist.
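One possible explanation for rounding having no visible effect before tolist, sketched under the assumption that the arrays are float32 (the thread does not say which dtype is used): rounding keeps the float32 dtype, and tolist then widens each value to a 64-bit Python float, which exposes the float32 representation error as extra digits in the JSON. The sample values below are made up for illustration.

```python
import json

import numpy as np

# Assumption: the reported arrays are float32 (not stated in the thread).
values32 = np.array([0.123456, 0.987654], dtype=np.float32)

# Rounding keeps the float32 dtype, so each result is only the float32
# value nearest to the rounded decimal.
rounded32 = np.round(values32, 4)

# tolist() widens each element to a 64-bit Python float, which shows the
# float32 representation error as many extra digits.
as_list32 = rounded32.tolist()

# Casting to float64 before rounding keeps the serialized values short.
as_list64 = np.round(values32.astype(np.float64), 4).tolist()

print(json.dumps(as_list32))  # long digit strings
print(json.dumps(as_list64))  # four decimals per value
```

If the arrays are already float64, this is not the cause and the extra digits come from somewhere else.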
SubstantialBaldeagle49 if you want to be safe, you can use Logger.report_scatter2d(), which makes sure the plot won't exceed the allowed limit. If you use your own code to create the scatter and report it using Logger.report_plotly(), it would be better to call Logger.report_plotly() once for each scatter (as opposed to 4 scatters in the same plot, as in your code above). Since the plot size limit is calculated per plot, you'll have 4 plots of smaller size instead of one huge plot...
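A minimal sketch of the "one report per scatter" suggestion. The helper name report_each_scatter and the curves mapping are hypothetical, and the figures are plain plotly-style dictionaries so the snippet does not need plotly installed. It assumes logger exposes report_plotly(title, series, iteration, figure) as used in the thread, and that it accepts a figure dictionary as well as a plotly Figure object.

```python
def report_each_scatter(logger, title, curves, iteration=0):
    """Report every curve as its own plot instead of stacking them in one figure.

    logger: any object with report_plotly(title, series, iteration, figure),
            for example the trains Logger used in the thread (assumption).
    curves: hypothetical mapping of series name -> (x values, y values).
    """
    for name, (x, y) in curves.items():
        # One trace per figure, so each reported plot stays small.
        figure = {
            "data": [{
                "type": "scatter",
                "mode": "lines",
                "name": name,
                "x": list(x),
                "y": list(y),
            }],
            "layout": {"title": "{} {}".format(title, name)},
        }
        logger.report_plotly(
            title="{}_{}".format(title, name),
            series=name,
            iteration=iteration,
            figure=figure,
        )


# Usage, assuming a trains task has already been initialized:
# report_each_scatter(Logger.current_logger(), "val_pr",
#                     {"pre": (score, pre), "recall": (score, recall)})
```

Giving each curve its own title is a choice made here so the plots are clearly separate; the trade-off is that the curves can no longer be compared inside a single figure.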
SuccessfulKoala55
Do you mean that even if the JSON is this large, using Logger.report_scatter2d() won't cause TransportError(429, 'circuit_breaking_exception?
As I think was already answered, this can be addressed by rounding down the numbers and making sure the plot size remains reasonable
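As a rough illustration of keeping the plot size reasonable, the sketch below rounds the values and thins each series to a maximum number of points, then compares the serialized JSON size before and after. The helper names and the max_points default are hypothetical, not part of trains, and stride-based thinning is just one simple option that slightly changes the curve.

```python
import json


def shrink_series(xs, ys, digits=4, max_points=1000):
    """Round values and keep at most max_points evenly spaced points."""
    # Ceiling division gives the smallest stride that stays under max_points.
    step = max(1, -(-len(xs) // max_points))
    # float() converts NumPy scalars to Python floats, so rounding sticks.
    small_xs = [round(float(x), digits) for x in xs[::step]]
    small_ys = [round(float(y), digits) for y in ys[::step]]
    return small_xs, small_ys


def json_size(obj):
    """Number of characters in the JSON serialization of obj."""
    return len(json.dumps(obj))


# Made-up data, only to show the effect on the serialized size.
xs = [i / 3 for i in range(10000)]
ys = [i / 7 for i in range(10000)]
small_xs, small_ys = shrink_series(xs, ys)

print(json_size({"x": xs, "y": ys}))
print(json_size({"x": small_xs, "y": small_ys}))
```

Checking the size like this before reporting makes it easier to see whether rounding alone is enough or whether the number of points also has to come down.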
Hi SuccessfulKoala55:
I have made sure that all my data is rounded to 4 digits, but my plotly JSON is still very large. After checking the JSON, I found many values with many digits. Maybe those are plotly's own info?
Here is my code:
import numpy as np
from plotly.subplots import make_subplots
import plotly.graph_objects as go

def draw_pr(self, precisions, recalls, score, distance, dataset):
    # Round everything to 4 digits before plotting
    score = np.round(score, 4)
    for i in range(4):
        pre = np.around(precisions[i], 4)
        recall = np.around(recalls[i], 4)
        acc = np.around(np.multiply(pre, recall), 4)
        fig = make_subplots(
            rows=1, cols=2,
            subplot_titles=('Precision x Recall curve {}'.format(distance[i]),
                            'Precision,Recall X score curve {}'.format(distance[i])),
        )
        fig.add_trace(go.Scatter(x=recall, y=pre, name='pr', mode='lines'), row=1, col=1)
        fig.add_trace(go.Scatter(x=score, y=pre, name='pre', mode='lines'), row=1, col=2)
        fig.add_trace(go.Scatter(x=score, y=recall, name='recall', mode='lines'), row=1, col=2)
        fig.add_trace(go.Scatter(x=score, y=acc, name='Acc', mode='lines'), row=1, col=2)
        # One plot (with 4 scatters) is reported per distance
        self.logger.report_plotly(dataset + '_' + distance[i], 'pr', iteration=0, figure=fig)
Here is the result:
Hi SubstantialBaldeagle49, I think the issue here is the sheer size of the plot, not any field limit or any caching issue in Elastic.