weird thing is it didn't realize it was dead until a task tried to run, it looked like it was still up even after the reboot
so the machine reboots? obviously if the machine reboots, the agent will die 🙂
Well, how are you running it? it should stay up and monitor the queue... Can you share the logs?
I'll have to wait until it drops again and see
clearml-agent daemon --queue default --detached
ok so what seems to happen is that when the machine reboots, the agent still appears to be working after it comes back up: clearml-agent list shows the agent and I see it in the web UI. since the /tmp folder is deleted on reboot, the log file is gone. only when I try to run something on the agent does the list come up empty
Well, the agent stores its logs in your temp folder (when you start it, it prints where they are stored) - I suggest getting the logs, as they might provide a clue to what's going on
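Since /tmp is wiped on reboot, one option is to copy the agent log somewhere persistent while the machine is still up. Here is a rough sketch of that, not an official clearml-agent feature. The function name `backup_agent_logs`, the destination folder, and the default file name pattern are all assumptions for illustration; use the actual log path the agent prints at startup.

```shell
#!/usr/bin/env bash
# Sketch: copy agent log files out of the temp folder so they survive a reboot.
# ASSUMPTION: the default name pattern below is a guess at the log file name;
# replace it with the path the agent prints when it starts.
backup_agent_logs() {
    local src_dir="${1:-/tmp}"                          # where the agent wrote its log
    local dest_dir="${2:-$HOME/clearml-agent-logs}"     # hypothetical persistent folder
    local pattern="${3:-.clearml_agent_daemon_out*}"    # assumed log file name pattern

    mkdir -p "$dest_dir" || return 1
    # cp -p keeps timestamps; find simply copies nothing if no file matches
    find "$src_dir" -maxdepth 1 -type f -name "$pattern" \
        -exec cp -p {} "$dest_dir"/ \;
}
```

This only helps if it runs before the reboot, so it would need to be called periodically (for example from a cron job) as `backup_agent_logs /tmp ~/clearml-agent-logs`.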