Yup, there was an agent listening to the services queue; it picked up the pipeline job and started to execute it. It just seems frozen at the place where it should be spinning up the tasks within the pipeline
Hi SteadySeagull18
However, it seems to be entirely hanging here in the "Running" state.
Did you set up an agent to listen to the "services" queue?
Something needs to run the pipeline logic itself. It is sometimes part of the clearml-server deployment, but that is not a must
It just seems frozen at the place where it should be spinning up the tasks within the pipeline
And is there an agent for those? Usually there is one agent for running logic tasks (like pipelines), running with --services-mode, which means multiple Tasks can be executed by the same agent, and other agents for compute Tasks that run a single Task per agent (but you can run multiple agents on the same machine)
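One way to double-check which agents are attached to which queue is sketched below. It is only an illustration: it assumes the clearml package is installed and configured, and that the worker entries returned by APIClient's workers.get_all() expose an id and a queues list whose items have a name. The helper names workers_by_queue and print_live_assignment are made up for this example.

```python
from collections import defaultdict


def workers_by_queue(workers):
    """Group worker ids by the name of each queue they listen to."""
    grouped = defaultdict(list)
    for worker in workers:
        # A worker with no queues attached contributes nothing.
        for queue in (getattr(worker, "queues", None) or []):
            grouped[queue.name].append(worker.id)
    return dict(grouped)


def print_live_assignment():
    """Print the queue -> workers mapping reported by the server (assumed field names)."""
    # Imported here so the grouping helper stays usable without clearml installed.
    from clearml.backend_api.session.client import APIClient

    client = APIClient()
    for queue_name, worker_ids in workers_by_queue(client.workers.get_all()).items():
        print(queue_name, worker_ids)
```

Calling print_live_assignment() should list one worker under services (the --services-mode agent) and the compute agents under their own queue; if services is missing, nothing is available to run the pipeline logic.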
Yup! Have two queues: services, with one worker spun up in --services-mode, and another queue (say foo) that has a bunch of GPU workers on it. When I start the pipeline locally, jobs get sent off to foo and executed exactly how I'd expect. If I keep everything exactly the same, and just change pipeline.start_locally() -> pipeline.start(), the pipeline task itself is picked up by the worker in the services queue, sets up the venv correctly, prints Starting Task Execution: then does nothing 😕
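For readers following along, the two launch modes being compared can be sketched roughly as follows. This is an assumption-laden illustration, not the poster's actual code: the project, pipeline, and step names are placeholders, the queue names services and foo are taken from the message above, and the helper functions launch and build_example_pipeline are hypothetical.

```python
def launch(pipe, remote, services_queue="services"):
    """Start a pipeline controller either remotely or in the local process."""
    if remote:
        # Enqueue the controller task itself; an agent listening on the
        # services queue has to pick it up and run the pipeline logic.
        pipe.start(queue=services_queue)
        return "remote"
    # Run the controller logic in this process, while the steps are still
    # enqueued for agents on the default execution queue.
    pipe.start_locally(run_pipeline_steps_locally=False)
    return "local"


def build_example_pipeline():
    """Build a one-step pipeline (all names below are placeholders)."""
    from clearml import PipelineController

    pipe = PipelineController(name="example pipeline", project="examples", version="1.0.0")
    pipe.set_default_execution_queue("foo")  # compute queue with the GPU workers
    pipe.add_step(
        name="stage_one",
        base_task_project="examples",
        base_task_name="stage one task",
    )
    return pipe
```

With launch(build_example_pipeline(), remote=False) the steps go straight to foo from the local process; with remote=True the controller first has to start successfully on the services agent before any step is enqueued, which is the point where the hang described here occurs.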
sets up the venv correctly, prints Starting Task Execution: then does nothing
Can you provide a log?
Do you see the code/git reference in the Pipeline Task details, under the Execution tab?
Yup, code/git reference is there. Will private message you the log