Yup! Have two queues: `services` with one worker spun up in `--services-mode`, and another queue (say `foo`) that has a bunch of GPU workers on it. When I start the pipeline locally, jobs get sent off to `foo` and executed exactly how I'd expect. If I keep everything exactly the same and just change to `pipeline.start()`, the pipeline task itself is picked up by the worker in the `services` queue, sets up the venv correctly, prints `Starting Task Execution:`, then does nothing 😕
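The local-vs-remote switch described above looks roughly like this (a minimal sketch, not the actual pipeline code; the project/task names are made up, and the queue names just mirror the setup described here — running it requires a configured ClearML server):

```python
from clearml import PipelineController

# Hypothetical pipeline; names are placeholders, not from the real code.
pipe = PipelineController(
    name="example-pipeline",
    project="examples",
    version="1.0",
)

# Pipeline steps are enqueued on the GPU queue (here called "foo").
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train-task",
    execution_queue="foo",
)

# Local mode: pipeline logic runs in this process, steps still go to "foo".
# pipe.start_locally()

# Remote mode: the pipeline-logic Task itself is enqueued on "services",
# where an agent running with --services-mode must pick it up.
pipe.start(queue="services")
```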
Yup, code/git reference is there. Will private message you the log
It just seems frozen at the place where it should be spinning up the tasks within the pipeline
And is there an agent for those? Usually there is one agent for running logic Tasks (like pipelines) with `--services-mode`, which means multiple Tasks can be executed by the same agent, and other agents for compute Tasks that run a single Task per agent (but you can run multiple agents on the same machine).
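That split can be set up with two `clearml-agent` daemons, something like this (a sketch; the queue names are taken from this conversation, and the GPU index is an assumption):

```shell
# Logic/pipeline agent: one agent, multiple concurrent Tasks
clearml-agent daemon --queue services --services-mode --detached

# Compute agent: one Task at a time, pinned to GPU 0 (assumed index)
clearml-agent daemon --queue foo --gpus 0 --detached
```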
However, it seems to be entirely hanging here in the "Running" state.
Did you set an agent to listen to the `services` queue?
Someone needs to run the pipeline logic itself; it is sometimes part of the clearml-server deployment, but that is not a must.
Yup, there was an agent listening to the `services` queue; it picked up the pipeline job and started to execute it. It just seems frozen at the place where it should be spinning up the tasks within the pipeline: it sets up the venv correctly, prints `Starting Task Execution:`, then does nothing.
Can you provide a log?
Do you see the code/git reference in the Pipeline Task details, Execution tab?