Could the task have been stopped or reset from the UI?
Is it possible the machines are running out of memory? Do you get this error on the pipeline controller itself? Does this constantly reproduce?
If I use start_locally instead of start, this issue doesn't occur.
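For reference, a minimal sketch of the two launch modes being compared. This is not the actual pipeline from this thread. The project name, pipeline name, queue names and the `step_one` function are all placeholders, and it assumes a configured ClearML server plus an agent listening on the queues.

```python
def step_one(x: int) -> int:
    # Placeholder step (hypothetical): returns its input plus one.
    return x + 1


def run_pipeline(local: bool = False) -> None:
    # Imported lazily so this file can be loaded without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(
        name="debug-pipeline",  # hypothetical pipeline name
        project="examples",     # hypothetical project name
        version="0.0.1",
    )
    # Assumed queue name for the pipeline steps.
    pipe.set_default_execution_queue("default")
    pipe.add_function_step(
        name="step_one",
        function=step_one,
        function_kwargs={"x": 1},
        function_return=["y"],
    )
    if local:
        # The controller logic runs in this process; the steps are still
        # sent to the execution queue and picked up by an agent.
        pipe.start_locally(run_pipeline_steps_locally=False)
    else:
        # The controller task itself is enqueued and run by an agent
        # ("services" is an assumed queue name).
        pipe.start(queue="services")
```

If the error shows up only with `local=False`, that would suggest the problem is in how the controller task runs on the agent rather than in the steps themselves.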
@<1559711623147425792:profile|PlainPelican41> status reason 3 means the task status was changed mid-run
Not a memory issue. I also tried switching the queue to another new machine with a fresh clearml-agent installation.
But I get the same result.
Can you paste here the code of the pipeline that you're trying to run?
On another machine with the relevant queue.
Hi @<1559711623147425792:profile|PlainPelican41>, how are you running the pipeline? Where is the agent running?