Hi, will proceed to close this thread. We found some issue with the underlying docker in our machines. We've have not shifted to another k8 of ec2 instances in AWS.
When I push a job to an agent node, i got this error.
"Error response from daemon: network None not found"
Hi, sorry for the delayed response. Btw, all the pods are running all good.
This is where I downloaed the log. Seems like some docker issue, though i cant seem to figure it out. As an alternative, I spawned a clearml-agent outside the k8 environment and it was able to execute well.
However, I am able to get it to work, if I launch a clearml-agent outside the kubernetes ecosystem.
It'll be good if there was yaml file to deploy clearml-agents into the k8 system.
Hi DeliciousBluewhale87 ,
You're correct - the k8s templates do not contain a ClearML Agent template (only a template for the Agent Services), and we're just in the process of completely moving to more configurable Helm charts.
The current Helm chart does contain a template for ClearML Agent deployment - can you explain what's missing there in terms of configuration?
Hey DeliciousBluewhale87 ,
It seems like this log is the log of a task that was pulled by the agent running on the clearml-services pod, is this the case? Where did you find the above log?
Also - can you please send us the list of all the running pods in the namespace? I want to make sure the other agents are up.