clearml==1.9.1
clearml-agent==1.5.2
I am not self-hosting the server; I am using the hosted one provided by ClearML.
It ran, thanks, but the original problem persists. The pipeline stays in the running state even after all the tasks have completed.
Can you please attach the code for the pipeline?
Ignore default. I am trying to run with another docker image, but it gets stuck in the same way.
Hi @<1585078763312386048:profile|ArrogantButterfly10> , does the controller stay indefinitely in the running state?
@<1537605940121964544:profile|EnthusiasticShrimp49> is this a code issue or some bug?
What version of clearml, clearml-agent & server are you using?
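In case it helps with answering this, here is a minimal sketch for reading the installed package versions from Python. It only uses the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_versions` is made up for illustration, and it does not report the server version, which has to be checked separately.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("clearml", "clearml-agent")):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


# Prints a dict keyed by package name.
print(installed_versions())
```

Run it in the same environment the agent uses, since the agent and your local machine may have different versions installed.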
from clearml import Task
from clearml.automation import PipelineController
pipe = PipelineController(name='PIPE_TEST_3',project='PIPE_TEST_3',version="0.0.1",add_pipeline_tags=False)
pipe.add_parameter("url", "None", "dataset_url")
pipe.set_default_execution_queue('services')
pipe.add_step(
    name="stage_data",
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 1 dataset artifact",
    parameter_override={"General/dataset_url": "${pipeline.url}"},
)
pipe.add_step(
    name="stage_process",
    parents=["stage_data"],
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 2 process dataset",
    parameter_override={
        "General/dataset_url": "${stage_data.artifacts.dataset.url}",
        "General/test_size": 0.25,
    },
)
pipe.add_step(
    name="stage_train",
    parents=["stage_process"],
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 3 process dataset",
    parameter_override={"General/dataset_task_id": "${stage_process.id}"},
)
# pipe.start_locally()
pipe.start(queue='services')
Hey @<1537605940121964544:profile|EnthusiasticShrimp49> I updated clearml, but now the issue is that my pipeline is stuck here.
Previously it was working fine apart from the above-mentioned issue, and I made no changes other than the update.
And how many agents do you have listening on the "services" queue?
Can you update clearml to the latest version (1.11.1) and see whether the issue is fixed?
I see you want to use the services queue for both the pipeline controller and the pipeline steps, but you have only one worker/agent listening to this queue. In this case you need at least 2 agents listening to the services queue. Try spawning an additional agent that listens to this queue and let me know how it goes.
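For anyone following along, here is a rough toy model of why a single agent on a shared queue can leave the pipeline stuck. It is plain Python and does not use the ClearML API. It assumes that each agent runs one job at a time, that the controller holds its agent until every step has finished, and that each step takes one tick. The function `run_pipeline` and its arguments are hypothetical names for illustration only.

```python
from collections import deque


def run_pipeline(num_agents, steps, max_ticks=50):
    """Simulate a controller and its sequential steps sharing one queue.

    Returns the list of finished jobs; the last entry is "controller"
    only if the whole pipeline completed within max_ticks.
    """
    queue = deque(["controller"])  # the controller is enqueued first
    busy = {}                      # agent index -> job it is running
    remaining = deque(steps)       # steps not yet enqueued by the controller
    step_in_flight = None          # step the controller is waiting on
    finished = []

    for _ in range(max_ticks):
        # Idle agents pull the next job from the shared queue.
        for agent in range(num_agents):
            if agent not in busy and queue:
                busy[agent] = queue.popleft()

        # Advance whatever each busy agent is running.
        for agent, job in list(busy.items()):
            if job == "controller":
                if step_in_flight is not None:
                    continue  # still waiting for the current step
                if remaining:
                    step_in_flight = remaining.popleft()
                    queue.append(step_in_flight)
                else:
                    finished.append("controller")
                    return finished
            else:
                # A step finishes in one tick and frees its agent.
                finished.append(job)
                del busy[agent]
                step_in_flight = None

    return finished  # ran out of ticks: the pipeline stalled


steps = ["stage_data", "stage_process", "stage_train"]
print(run_pipeline(1, steps))  # one agent: nothing ever finishes
print(run_pipeline(2, steps))  # two agents: all steps, then the controller
```

With one agent, the controller takes the only worker and the first step waits in the queue indefinitely. With two agents, the second worker picks up each step in turn. Under the same assumptions, sending the steps to a different queue that has its own agent would avoid the stall as well.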