Answered
Hi, We Are Currently Seeing The Following Error In Our Logs Of The ClearML Apiserver Pod:

Hi, we are currently seeing the following error in our logs of the ClearML apiserver pod:

[2024-03-20 15:33:32,089] [8] [WARNING] [elasticsearch] POST None [status:429 request:0.001s]
[2024-03-20 15:33:32,089] [8] [ERROR] [clearml.__init__] Failed processing worker status report
Traceback (most recent call last):
File "/opt/clearml/apiserver/bll/workers/__init__.py", line 153, in status_report
self.log_stats_to_es(
File "/opt/clearml/apiserver/bll/workers/__init__.py", line 557, in log_stats_to_es
es_res = elasticsearch.helpers.bulk(self.es_client, actions)
File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 410, in bulk
for ok, item in streaming_bulk(
File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 329, in streaming_bulk
for data, (ok, info) in zip(
File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 256, in _process_bulk_chunk
for item in gen:
File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 195, in _process_bulk_chunk_error
raise error
File "/usr/local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 240, in _process_bulk_chunk
resp = client.bulk(*args, body="\n".join(bulk_actions) + "\n", **kwargs)
File "/usr/local/lib/python3.9/site-packages/elasticsearch/client/utils.py", line 347, in _wrapped
return func(*args, params=params, headers=headers, **kwargs)
File "/usr/local/lib/python3.9/site-packages/elasticsearch/client/__init__.py", line 472, in bulk
return self.transport.perform_request(
File "/usr/local/lib/python3.9/site-packages/elasticsearch/transport.py", line 466, in perform_request
raise e
File "/usr/local/lib/python3.9/site-packages/elasticsearch/transport.py", line 427, in perform_request
status, headers_response, data = connection.perform_request(
File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 291, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/base.py", line 328, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.TransportError: TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [1057463944/1008.4mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1057460904/1008.4mb], new bytes reserved: [3040/2.9kb], usages [inflight_requests=3040/2.9kb, request=0/0b, fielddata=9261/9kb, eql_sequence=0/0b, model_inference=0/0b]')
[2024-03-20 15:33:32,090] [8] [ERROR] [clearml.service_repo] Returned 500 for workers.status_report in 5ms, msg=General data error (Failed processing worker status report): err=429

I am not sure what to make of this message: is ClearML attempting to make an HTTP request with nearly a GB of data?
I suspect that it has something to do with an agent machine we recently added as a worker to the ClearML server, but I do not understand where such a large amount of data would come from: we have no tasks in the queue and have only ever had one task in the queue (which was processed successfully), with around 1 MB of data.
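
For anyone reading the exception text, the byte counts in it can be pulled apart to see which number is the request and which is existing memory usage. The following is only a rough sketch: the helper name `parse_breaker_message` is made up for illustration, and the 95% figure is an assumption based on Elasticsearch's documented default for the parent breaker limit when real memory tracking is enabled, not something confirmed for this deployment.

```python
import re

# Numeric part of the exception text, copied from the traceback above.
MESSAGE = (
    "[parent] Data too large, data for [<http_request>] would be "
    "[1057463944/1008.4mb], which is larger than the limit of "
    "[1020054732/972.7mb], real usage: [1057460904/1008.4mb], "
    "new bytes reserved: [3040/2.9kb]"
)


def parse_breaker_message(message: str) -> dict:
    """Extract the raw byte counts from a circuit_breaking_exception message.

    Hypothetical helper for illustration; not part of ClearML or the
    elasticsearch client.
    """
    patterns = {
        "would_be": r"would be \[(\d+)/",
        "limit": r"limit of \[(\d+)/",
        "real_usage": r"real usage: \[(\d+)/",
        "new_bytes": r"new bytes reserved: \[(\d+)/",
    }
    return {
        name: int(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


stats = parse_breaker_message(MESSAGE)

# Bytes the incoming bulk request itself wanted to reserve.
print("request size (bytes):", stats["new_bytes"])
# Memory already in use before the request was counted.
print("real usage (bytes):  ", stats["real_usage"])
# How far the total would exceed the breaker limit.
print("over limit by (bytes):", stats["would_be"] - stats["limit"])

# ASSUMPTION: the parent breaker limit is 95% of the JVM heap
# (the documented default when real memory tracking is enabled).
implied_heap_mib = stats["limit"] / 0.95 / 2**20
print("implied heap (MiB):  ", round(implied_heap_mib, 1))
```

Read this way, the "would be" figure is the real usage plus the new bytes reserved, and the request itself accounts for only 3040 bytes of it. That suggests the message is about memory already in use on the Elasticsearch side rather than about a near-GB HTTP request, though the actual heap configuration would need to be checked on the server.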

  
  
Posted 8 months ago

Answers 4


By the way, is it possible to decrease the log level of the API server and the file server? In the ClearML Serving deployment, a uvicorn log level environment variable can be set. Is there something similar for the ClearML API server and file server? I searched the code a bit and did not find a place where the log level is defined.

  
  
Posted 8 months ago

@<1649221394904387584:profile|RattySparrow90> how many workers do you have reporting in your system?

  
  
Posted 8 months ago

@<1523701087100473344:profile|SuccessfulKoala55> Only one

  
  
Posted 8 months ago

@<1673863775326834688:profile|SucculentMole19> FYI

  
  
Posted 8 months ago