Hey! I stumbled upon some errors with my workers monitoring.
I checked the logs in my k8s pods for apiserver and elasticsearch, and it seems the problem is there. These are the logs:
Apiserver logs
[2021-04-23 06:19:50,209] [9] [Error] [Trains.Service_Repo] Re
Unfortunately, the problem was resolved neither by changing the VM memory settings back to 2 GB nor by reverting from azurefiles persistent volumes to hostPath. This seems odd, as I did not have any of these issues before. I thought it might come from the changes in the PV and elasticsearch settings, but going back to the original settings did not resolve the issue. Shouldn't I be using the latest tag for clearml?