hi, thanks for reaching out. Getting desperate here.
Yes, its self hosted
No, only currently running experiments are deleted (task itself is gone, but debug images and models are present in fileserver folder)
What I do see is some random elastisearch errors popping up from time to time
[2024-01-05 09:16:47,707] [9] [WARNING] [elasticsearch] POST
None [status:N/A request:60.064s]
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.9/http/client.py", line 1377, in getresponse
response.begin()
File "/usr/local/lib/python3.9/http/client.py", line 320, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.9/http/client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/local/lib/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/elasticsearch/connection/http_urllib3.py", line 251, in perform_request
response = self.pool.urlopen(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 726, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 386, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.9/site-packages/urllib3/packages/six.py", line 735, in reraise
raise value
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 670, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 428, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='elasticsearch', port='9200'): Read timed out. (read timeout=60)
Another thing I noticed that elastisearch folder has grown to gigantic size, is that normal? Can I clear it up somehow without problems?
Its 50GB currently