Answered
Hi Guys, How To Solve This Problem:


Returned 500 for events.add_batch in 10ms, msg=General data error (TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [2028564646/1.8gb], which is larger than the limit of [2006987571/1.8gb], real usage: [2028536056/1.8gb], new bytes reserved: [28590/27.9kb], usages [request=0/0b, fielddata=848/848b, in_flight_requests=28590/27.9kb, accounting=567634/554.3kb]')

Should I clear the Elasticsearch cache data? If I do, will it delete my plots? Should I increase the Elasticsearch fielddata.limit? Or how else can I prevent this from happening again?

  
  
Posted 4 years ago

Answers 9


SuccessfulKoala55
I manually keep the number of data points under 800K, because I found that the budget would be 0 when len(series_sizes) is 1: https://github.com/allegroai/trains/blob/master/trains/utilities/plotly_reporter.py#L101

  
  
Posted 4 years ago

SuccessfulKoala55
Even with Logger.report_scatter2d() the result is still very large, and I found where the digits change: https://github.com/allegroai/trains/blob/master/trains/utilities/plotly_reporter.py#L122
The tolist() call changes the digits, but I haven't figured out why.

  
  
Posted 4 years ago

Logger.report_scatter2d() tries to find the optimal number of digits. Note that it makes sure the number of points does not exceed 800k, but that limit applies to the point count, not to the size of the data that is actually stored.
How many points do you have?
You can try the latest version from GitHub:
pip install -e git+https://github.com/allegroai/trains.git@master#egg=trains

  
  
Posted 4 years ago

Actually, the problem here is that rounding has no effect when it is applied before tolist().
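One possible explanation for the extra digits, offered as an assumption because the thread does not state the array dtype: if the values are stored as 32-bit floats, rounding keeps them 32-bit, and tolist() then widens each value to a 64-bit Python float, which exposes the float32 representation error as a long decimal tail. The sketch below reproduces this using the standard-library array module as a stand-in for a float32 NumPy array, and shows that rounding again after the conversion restores the short form.

```python
import json
from array import array

# Values already rounded to 4 decimals, as in the thread.
rounded = [round(v, 4) for v in (0.12345678, 0.98765432, 0.5)]

# Assumption: the data lives in 32-bit floats.
# array("f") stands in here for a float32 NumPy array.
as_float32 = array("f", rounded)

# tolist() widens each 32-bit value to a 64-bit Python float, so the
# float32 rounding error becomes visible as extra digits.
widened = as_float32.tolist()

# Rounding after the conversion gives the short decimal form again.
re_rounded = [round(v, 4) for v in widened]

print(json.dumps(widened))
print(json.dumps(re_rounded))
print(len(json.dumps(widened)), "vs", len(json.dumps(re_rounded)), "characters")
```

If this is the cause, rounding the Python floats after tolist() (rather than the array before it) should keep the serialized plot short.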

  
  
Posted 4 years ago

SubstantialBaldeagle49 if you want to be safe, you can use Logger.report_scatter2d(), which makes sure the plot won't exceed the allowed limit. If you use your own code to create the scatter and report it with Logger.report_plotly(), it would be better to call Logger.report_plotly() once per scatter (as opposed to the 4 scatters in the same plot in your code above). Since the plot size limit is calculated per plot, you'll have 4 plots of smaller size instead of one huge plot...
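To illustrate why splitting helps, here is a rough sketch that compares the JSON size of one plot holding four traces against four single-trace plots. The helpers make_trace and payload_bytes and the synthetic data are made up for illustration; the real size accounting on the server may differ, so treat the numbers only as a relative comparison.

```python
import json
import math

def make_trace(name, n_points):
    """Build a plotly-style scatter trace as a plain dict (illustrative data)."""
    xs = [round(i / n_points, 4) for i in range(n_points)]
    ys = [round(math.sin(6.28 * x), 4) for x in xs]
    return {"type": "scatter", "mode": "lines", "name": name, "x": xs, "y": ys}

def payload_bytes(traces):
    """Approximate the size of one reported plot as the JSON of its traces."""
    return len(json.dumps({"data": traces}).encode("utf-8"))

traces = [make_trace(name, 20_000) for name in ("pr", "pre", "recall", "Acc")]

combined = payload_bytes(traces)                 # one plot, four traces
separate = [payload_bytes([t]) for t in traces]  # four plots, one trace each

print("one plot with 4 traces:", combined, "bytes")
print("largest of 4 single-trace plots:", max(separate), "bytes")

# With a real logger, each single-trace figure would get its own call, e.g.
# logger.report_plotly(title, series, iteration=0, figure=fig)
```

The total amount of data stays about the same; what changes is the size of the largest single plot, which is what a per-plot limit looks at.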

  
  
Posted 4 years ago

SuccessfulKoala55
Do you mean that even if the JSON is this large, using Logger.report_scatter2d() won't cause TransportError(429, 'circuit_breaking_exception')?

  
  
Posted 4 years ago

As I think was already answered, this can be addressed by rounding down the numbers and making sure the plot size remains reasonable
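A minimal sketch of that idea, under stated assumptions: shrink_series is a hypothetical helper, and the 100,000-byte budget is an arbitrary illustrative number, not a ClearML or Elasticsearch constant. It rounds both axes and then keeps every second, fourth, eighth... point until the serialized trace fits the budget. This is not how Logger.report_scatter2d() is implemented, only one way to do it by hand.

```python
import json
import math

def shrink_series(xs, ys, digits=4, max_bytes=100_000):
    """Round both axes, then keep every step-th point until the JSON fits."""
    xs = [round(float(x), digits) for x in xs]
    ys = [round(float(y), digits) for y in ys]
    step = 1
    while True:
        trace = {"x": xs[::step], "y": ys[::step]}
        size = len(json.dumps(trace).encode("utf-8"))
        # Stop once the payload fits, or when thinning further would
        # leave too few points to draw a line.
        if size <= max_bytes or len(trace["x"]) <= 2:
            return trace, size
        step *= 2

# Synthetic data: 50,000 unrounded points.
raw_x = [i / 49_999 for i in range(50_000)]
raw_y = [math.sin(2 * math.pi * x) for x in raw_x]

trace, size = shrink_series(raw_x, raw_y)
print(len(raw_x), "points in,", len(trace["x"]), "points out,", size, "bytes")
```

Uniform thinning is the simplest choice; for curves with sharp features, a smarter downsampling scheme may preserve the shape better.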

  
  
Posted 4 years ago

Hi SuccessfulKoala55:
I have made sure that all my data is rounded to 4 decimal places, but my plotly JSON is still very large. After checking the JSON, I found many values with many digits. Maybe those are plotly's own info?
Here is my code:
    import numpy as np
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    def draw_pr(self, precisions, recalls, score, distance, dataset):
        # Round everything to 4 decimals to keep the reported JSON small
        score = np.round(score, 4)
        for i in range(4):
            pre = np.around(precisions[i], 4)
            recall = np.around(recalls[i], 4)
            acc = np.around(np.multiply(pre, recall), 4)
            # One figure per distance: PR curve on the left, score curves on the right
            fig = make_subplots(
                rows=1, cols=2,
                subplot_titles=(
                    'Precision x Recall curve {}'.format(distance[i]),
                    'Precision,Recall X score curve {}'.format(distance[i]),
                ),
            )
            fig.add_trace(go.Scatter(x=recall, y=pre, name='pr', mode='lines'), row=1, col=1)
            fig.add_trace(go.Scatter(x=score, y=pre, name='pre', mode='lines'), row=1, col=2)
            fig.add_trace(go.Scatter(x=score, y=recall, name='recall', mode='lines'), row=1, col=2)
            fig.add_trace(go.Scatter(x=score, y=acc, name='Acc', mode='lines'), row=1, col=2)
            self.logger.report_plotly(dataset + '_' + distance[i], 'pr', iteration=0, figure=fig)

Here is the result:

  
  
Posted 4 years ago

Hi SubstantialBaldeagle49 , I think the issue here is the sheer size of the plot, not any field limit or any caching issue in Elastic
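The numbers in the error message itself are consistent with this. Below is a small sketch that pulls the byte counts out of the message quoted in the question (parse_breaker_message is a made-up helper, and the regular expressions assume exactly this message layout). It shows that fielddata accounts for only 848 bytes, while the overall real usage reported by the [parent] breaker is what crosses the limit, so raising a fielddata limit would likely not help.

```python
import re

# Error text copied from the question above.
MESSAGE = (
    "[parent] Data too large, data for [<http_request>] would be "
    "[2028564646/1.8gb], which is larger than the limit of "
    "[2006987571/1.8gb], real usage: [2028536056/1.8gb], "
    "new bytes reserved: [28590/27.9kb], usages [request=0/0b, "
    "fielddata=848/848b, in_flight_requests=28590/27.9kb, "
    "accounting=567634/554.3kb]"
)

def parse_breaker_message(msg):
    """Extract the raw byte counts from a circuit_breaking_exception message."""
    def grab(label):
        # Each figure appears as "<label> [<bytes>/<human readable>]".
        return int(re.search(re.escape(label) + r" \[(\d+)/", msg).group(1))

    return {
        "would_be": grab("would be"),
        "limit": grab("limit of"),
        "real_usage": grab("real usage:"),
        "reserved": grab("new bytes reserved:"),
        # Per-breaker usages appear as "name=<bytes>/<human readable>".
        "usages": {name: int(value)
                   for name, value in re.findall(r"(\w+)=(\d+)/", msg)},
    }

info = parse_breaker_message(MESSAGE)
print("over the limit by:", info["would_be"] - info["limit"], "bytes")
print("fielddata usage:", info["usages"]["fielddata"], "bytes")
print("all listed usages together:", sum(info["usages"].values()), "bytes")
```

The listed usages add up to well under a megabyte, while the real usage is about 1.8gb, which points at overall memory pressure from large requests rather than at any single cache.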

  
  
Posted 4 years ago
983 Views