No, back then it was fixed by restarting ClearML and some services. But for now we've given up and use debug=True, so we don't use the services queue.
Reviving this: do you recall what fixed this, or has anyone else run into this issue? I'm constantly getting this in my pipelines. If I run the exact same pipeline code / configuration multiple times, it will eventually run without a "User aborted: stopping task (3)" error, but it's unclear what is happening on the runs where it fails.
Is the agent execution dependent on some CMD in my Dockerfile?