Answered
Hi folks, occasionally when I clone a job and enqueue it, it is not picked up by the expected queue. Instead, a new queue is created (with an ID that looks like a hash), and the experiment hangs in a "Pending" state.
When this happens, aborting the task, resetting it, and re-enqueuing it often works. I haven't been able to work out what triggers it, so I was wondering whether any of you have had the same experience.
I am using a self-hosted ClearML server, and the agents are spawned with the K8s Agent Glue Helm chart.
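For reference, the manual workaround (abort, reset, re-enqueue) could be scripted roughly as below. This is only a sketch: the helper names `requeue_task` and `requeue_by_id` are made up for illustration, the queue name is a placeholder, and `mark_stopped` is assumed to be the SDK counterpart of the UI "Abort" action. It does not address why the extra queue gets created in the first place.

```python
def requeue_task(task, queue_name, enqueue):
    """Abort, reset, and re-enqueue a task, mirroring the manual workaround.

    `task` is expected to behave like a clearml.Task and `enqueue` like
    clearml.Task.enqueue; both are passed in so the helper stays easy to test.
    """
    task.mark_stopped(force=True)  # assumed equivalent of "Abort" in the UI
    task.reset(force=True)         # clear the previous state so the task can run again
    # Name the target queue explicitly instead of relying on any default.
    return enqueue(task, queue_name=queue_name)


def requeue_by_id(task_id, queue_name):
    """Hypothetical convenience wrapper; needs a configured ClearML SDK."""
    from clearml import Task  # imported lazily so the helper above has no hard dependency

    task = Task.get_task(task_id=task_id)
    return requeue_task(task, queue_name, Task.enqueue)
```

With the SDK configured, this would be called as something like `requeue_by_id("<task-id>", "<queue-name>")`, with both values filled in from your own setup.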
2K Views
31
Answers
one year ago
26 days ago