SweetBadger76
The error appears when running a training script and logging results into a task in ClearML. I am using the command: Task.init(project_name=project_name, task_name=task_name)
The error is not immediate. It occurs from time to time during training, and the graphs in ClearML end up incomplete (to say the least).
ClearML version: 1.6.2
Server version: 1.2.0-153
LazyFish41
Thanks for the help. Your message made me realize that the folder in which I save the records might already be too large, and that was in fact the problem.
Once I emptied the folder, the error did not appear again.
Thanks for all the help 🙂
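For anyone hitting the same thing, here is a minimal sketch for checking how large a local records folder has grown before a run. It uses only the Python standard library and is not part of ClearML; the function name folder_size_bytes and the path ./records are hypothetical placeholders, so substitute the folder your own script writes to.

```python
import os


def folder_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted here
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file may have been removed while walking the tree
                continue
    return total


if __name__ == "__main__":
    records_dir = "./records"  # hypothetical path, replace with your own folder
    size_mib = folder_size_bytes(records_dir) / (1024 * 1024)
    print(f"{records_dir}: {size_mib:.1f} MiB")
```

A path that does not exist simply reports 0 bytes, since os.walk yields nothing for it.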
Hi VexedPeacock35, I suspect that Elasticsearch is overloaded and periodically times out when recording events. How much memory and CPU is it using? Can you increase the memory allocated to it and see whether this helps?
Hi VexedPeacock35
Can you share some more details about what is happening?
What are you trying to do (or, more precisely, when does this error appear)?
What are your package versions (clearml, and the server if you are self-hosted)?