Was "task.close()" called for the early-stopped task?
What is the experiment status in the Web UI?
Hi @<1523701070390366208:profile|CostlyOstrich36> ! Thanks for responding!
In the apiserver.log, a message like this appears periodically (about once a minute): [9] [WARNING] [clearml.service_repo] Returned 401 for auth.login in 2ms, msg=Unauthorized (invalid credentials) (failed to locate provided credentials)
In the fileserver.log there are no fresh messages - probably because the container could not start normally.
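To check how often these 401 responses show up, something like the following could be used. It is only a rough sketch: the regular expression is modeled on the single log line quoted above, so other apiserver.log lines may not match it, and the function name count_failures is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one quoted log line; real logs may vary (assumption).
LOG_RE = re.compile(
    r"\[(?P<level>[A-Z]+)\]\s+\[(?P<logger>[\w.]+)\]\s+"
    r"Returned (?P<status>\d{3}) for (?P<endpoint>[\w.]+)"
)


def count_failures(lines, status="401"):
    """Count lines with the given HTTP status, grouped by endpoint name."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and match.group("status") == status:
            counts[match.group("endpoint")] += 1
    return counts


# The message quoted above, used as sample input.
sample = [
    "[9] [WARNING] [clearml.service_repo] Returned 401 for auth.login in 2ms, "
    "msg=Unauthorized (invalid credentials) (failed to locate provided credentials)",
]
print(count_failures(sample))
```

For a real log, an open file handle can be passed instead of the sample list; where apiserver.log lives depends on the deployment.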
@<1664079296102141952:profile|DangerousStarfish38> Yep, you are right: according to the docs, optimizer.stop() should be used, not task.close(). Sorry for the confusion.
I guess the issue is a connectivity/auth problem between ClearML components, since there are many timeout messages in the log. I see similar messages for the fileserver container, and that is not yet resolved.