@<1526371986278715392:profile|VivaciousReindeer64> - please check the following:
- What do you get if you go to http://192.168.1.145:8080/configuration.json ?
- Can you check the logs of the webserver docker container (using
sudo docker logs clearml-webserver), especially the beginning? Does it say anything about the fileBaseUrl?
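A minimal sketch of both checks from the command line, assuming the server IP/port mentioned above and the default clearml-webserver container name (adjust both for your deployment):

```shell
# Fetch the web UI configuration endpoint (note the colon after "http"):
curl -s http://192.168.1.145:8080/configuration.json

# Show the beginning of the webserver container log and look for fileBaseUrl
# (case-insensitive match, since the exact casing in the log may vary):
sudo docker logs clearml-webserver 2>&1 | head -n 50 | grep -i filebaseurl
```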
@<1529271098653282304:profile|WorriedRabbit94> - I will have someone from the team check which experiments are consuming the storage
@<1529271098653282304:profile|WorriedRabbit94> - I assume the issue is related to the autoscaler instances that were running for a long time and produced a lot of logs. Please try the following:
Hi EnviousStarfish54 .
I'm trying to make sure I understand the scenario. What I understood is that you add a custom column (metric) to the experiments table, sort by it, and then refresh with F5. I wasn't able to reproduce this on the Demo site ( https://demoapp.trains.allegro.ai/projects/*/experiments?columns=selected&columns=type&columns=name&columns=tags&columns=status&columns=project.name&columns=users&columns=started&columns=last_update&columns=last_iteration&columns=m.5451af93e0bf68a4ab...
AbruptWorm50 - just to make sure there is no misunderstanding - the last image you sent is on the "training" queue and not on the "services" queue. Are there free agents running on that queue?
SuperiorPanda77 - thanks for updating. So indeed these may be similar issues. I will re-check this and update