Hi WobblyFrog79, do you mean when running the agent on K8s?
Hey CostlyOstrich36, could you provide any suggestions here, please?
Hi WobblyFrog79 - Please try setting the environment variable CLEARML_K8S_GLUE_DEBUG=1 on the Agent:
agentk8sglue:
  extraEnvs:
    - name: CLEARML_K8S_GLUE_DEBUG
      value: "1"
This will make the Agent Pod print the rendered Task Pod template in the logs, so you can see it 🙂
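A minimal sketch of applying that override and reading the result, assuming the agent was installed with Helm. The namespace `clearml`, the release name `clearml-agent`, the chart reference `clearml/clearml-agent`, and the file name `values.yaml` are all assumptions for illustration, not values taken from this thread; substitute your own.

```shell
#!/usr/bin/env bash
# Hypothetical names: adjust to match your own installation.
NAMESPACE="clearml"            # assumed namespace of the agent
RELEASE="clearml-agent"        # assumed Helm release name
CHART="clearml/clearml-agent"  # assumed chart reference
VALUES_FILE="values.yaml"      # file containing the extraEnvs override

# Re-deploy the agent with the values file that sets CLEARML_K8S_GLUE_DEBUG.
apply_debug_values() {
  helm upgrade "$RELEASE" "$CHART" -n "$NAMESPACE" -f "$VALUES_FILE"
}

# List pods so you can find the Agent Pod's name.
list_pods() {
  kubectl get pods -n "$NAMESPACE"
}

# Follow the logs of the given Agent Pod; the rendered Task Pod
# template should appear there once the debug variable is set.
follow_agent_logs() {
  kubectl logs -n "$NAMESPACE" "$1" --tail=200 -f
}
```

Typical use would be `apply_debug_values`, then `list_pods`, then `follow_agent_logs <agent-pod-name>` with the pod name copied from the listing.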
Awesome CooperativeKitten94, will definitely add that. It would also be very helpful if there were a way to delay the deletion of "completed/failed" pods. This is useful when something fails unexpectedly and the ClearML logs are not enough to debug the issue. Does that make sense to you? I could contribute to your codebase if you're interested.
Wonderful - We do not have such a feature planned for now, feel free to contribute 🙂