@<1724960468822396928:profile|CumbersomeSealion22> notice that when you launch a pipeline you are actually running two tasks: one is the "pipeline" itself (i.e. the logic) and the other is a component in the pipeline (i.e. the step).
If you have one agent, I'm assuming the pipeline itself (the one you launch on your machine) stops and is relaunched on the agent. It then launches the step, which waits in the same queue to be executed, but there is no free agent to pull and execute it.
If you want to test this theory, run the pipeline logic "locally" (i.e. no agent) by doing:
pipe.start_locally(run_pipeline_steps_locally=False)
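To make the suspected deadlock concrete, here is a minimal toy model in plain Python. It does not use ClearML at all: the function name `pipeline_completes`, the tick loop, and the queue names are all hypothetical, and the only assumption taken from the thread is that an agent runs one task at a time and that the controller stays running while its step waits.

```python
from collections import deque


def pipeline_completes(agent_queues, controller_queue, step_queue, max_ticks=20):
    """Toy model: each agent serves one queue and runs one task at a time.

    agent_queues: one queue name per agent, e.g. ["default"] for a single agent.
    Returns True if the step finishes within max_ticks, False otherwise.
    """
    names = set(agent_queues) | {controller_queue, step_queue}
    queues = {name: deque() for name in names}
    queues[controller_queue].append("controller")
    running = [None] * len(agent_queues)  # task currently held by each agent
    step_enqueued = False

    for _ in range(max_ticks):
        # A step that was picked up in an earlier tick finishes now.
        if "step" in running:
            return True
        # Once running, the controller enqueues its step and keeps waiting for it.
        if "controller" in running and not step_enqueued:
            queues[step_queue].append("step")
            step_enqueued = True
        # Idle agents pull the next task from the queue they listen to.
        for i, queue_name in enumerate(agent_queues):
            if running[i] is None and queues[queue_name]:
                running[i] = queues[queue_name].popleft()
    return False


# One agent, controller and step share the queue: the step is never picked up.
print(pipeline_completes(["default"], "default", "default"))  # False
# Two agents, each with its own queue: the step runs.
print(pipeline_completes(["pipelines", "default"], "pipelines", "default"))  # True
```

In this model the single agent is permanently occupied by the controller, which is the situation described above. Running the controller locally removes it from the agent, which is why the test suggested above should tell the two cases apart.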
I have one agent running on the machine. I also have only one task running. This only happens to us when we use pipelines, not single tasks. It does not depend on parameters like cache. There are no other tasks running in the meantime. I can boil it down even to "Hello World" tasks.
Notably, the example given here also causes the observed behavior.
Hi @<1724960468822396928:profile|CumbersomeSealion22>
It starts the pipeline, logs that the first step is started, and then...does nothing anymore.
How many agents do you have running? By default, an agent runs one Task at a time (unless it is executed with --services-mode, which allows it to run an unlimited number of tasks in parallel).
This is true, yes. I do pipe.set_default_execution_queue("default") and also pipe.start(queue="default"), where the single steps do not specify queues. Also, my GUI tells me that this is so.
Well, rather, it takes a minute to complete.
@<1724960468822396928:profile|CumbersomeSealion22> in the pipeline definition, I assume you use the same queue to enqueue the controller and the steps?
Yes, you are right, thanks. Now, I am using two agents with one using a queue dedicated only to the pipeline, and one dedicated to the single tasks. It works. However, still, it sometimes takes a strangely long time for the agent to pick up the next task (or process it), even if it is only "Hello World".
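A minimal sketch of the two-queue setup described here, assuming one agent listens to a queue for the controller and a second agent listens to the queue for the steps. The queue names "pipelines" and "default", the project and pipeline names, and the helper functions `hello_step` and `launch_pipeline` are hypothetical; only `set_default_execution_queue`, `add_function_step`, and `start` are the PipelineController calls referred to in this thread.

```python
def hello_step():
    # Stand-in for a real pipeline step.
    print("Hello World")


def launch_pipeline(controller_queue="pipelines", step_queue="default"):
    # Imported here so this file can be loaded without clearml installed.
    from clearml import PipelineController

    # Hypothetical pipeline and project names.
    pipe = PipelineController(name="hello-pipeline", project="examples", version="1.0.0")

    # Steps that do not name a queue go to the step queue.
    pipe.set_default_execution_queue(step_queue)
    pipe.add_function_step(name="hello", function=hello_step)

    # The controller task is enqueued on a different queue, so it does not
    # occupy the agent that is supposed to execute the steps.
    pipe.start(queue=controller_queue)
```

Calling `launch_pipeline()` requires a configured ClearML server and one agent listening on each of the two queues; with a single agent serving both roles, the controller would again hold the only execution slot.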
The agent checks every 2/5 seconds whether there is a new Task to launch; could that be it?
Update:
- It does seem to work somehow sometimes, but it takes an unreasonably long time. Even just printing print("Hello World") takes like a minute or so (after the environment has fully been set up).
- I needed to trigger the pipeline 2 times; the first time, the pipeline did not even start.
Just noting that it also does not work with two agents listening to the same queue. I tried that because I thought maybe the controller task of the pipeline blocks the execution of the actual tasks.