Hi @<1585078763312386048:profile|ArrogantButterfly10> , does the controller stay indefinitely in the running state?
What version of clearml, clearml-agent & server are you using?
clearml==1.9.1
clearml-agent==1.5.2
I am not self-hosting the server; I am using the one hosted by ClearML.
Can you update the clearml version to latest (1.11.1) and see whether the issue is fixed?
Hey @<1537605940121964544:profile|EnthusiasticShrimp49> I updated clearml, but now the issue is that my pipeline is stuck here.
Previously it was working fine apart from the above-mentioned issue, and I made no changes other than the update.
Can you please attach the code for the pipeline?
from clearml import Task
from clearml.automation import PipelineController

pipe = PipelineController(
    name='PIPE_TEST_3',
    project='PIPE_TEST_3',
    version="0.0.1",
    add_pipeline_tags=False,
)

# Pipeline-level parameter: name, default value, description
pipe.add_parameter("url", " None ", "dataset_url")

# Queue used for the pipeline steps
pipe.set_default_execution_queue('services')

pipe.add_step(
    name="stage_data",
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 1 dataset artifact",
    parameter_override={"General/dataset_url": "${pipeline.url}"},
)
pipe.add_step(
    name="stage_process",
    parents=["stage_data"],
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 2 process dataset",
    parameter_override={
        "General/dataset_url": "${stage_data.artifacts.dataset.url}",
        "General/test_size": 0.25,
    },
)
pipe.add_step(
    name="stage_train",
    parents=["stage_process"],
    base_task_project="PIPE_TEST_3",
    base_task_name="Pipeline step 3 process dataset",
    parameter_override={"General/dataset_task_id": "${stage_process.id}"},
)

# pipe.start_locally()
# Queue used for the pipeline controller itself
pipe.start(queue='services')
And how many agents do you have listening on the "services" queue?
Ignore default. I am trying to run with another docker, but it gets stuck in the same way.
@<1537605940121964544:profile|EnthusiasticShrimp49> is this a code issue or some bug?
I see you want to use the services queue for both the pipeline controller and the pipeline steps, but you have only one worker/agent listening to this queue. The controller occupies that agent while it waits for its steps, so the steps are never picked up. In this case you need at least 2 agents listening to the services queue. Try spawning an additional agent that listens to this queue and let me know how it goes.
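To illustrate why a single agent is not enough here, below is a rough toy simulation, not ClearML code. It assumes each agent runs one task at a time and that the controller holds its agent until every step has finished; the function name `pipeline_completes` and the tick-based model are made up for this sketch.

```python
from collections import deque


def pipeline_completes(num_agents: int, steps: list[str], max_ticks: int = 100) -> bool:
    """Toy model: the controller and its steps share one queue.

    Assumptions (hypothetical, for illustration only): each agent runs one
    task at a time, steps run sequentially, and each step takes one tick.
    """
    queue = deque(["controller"])  # the shared queue, controller enqueued first
    pending = deque(steps)         # steps the controller has not enqueued yet
    free_agents = num_agents
    controller_running = False
    running_step = None
    done = []

    for _ in range(max_ticks):
        # A running step finishes and releases its agent.
        if running_step is not None:
            done.append(running_step)
            running_step = None
            free_agents += 1

        # The controller finishes once every step is done.
        if controller_running and len(done) == len(steps):
            return True

        # The controller enqueues the next step when nothing is in flight.
        if controller_running and not queue and running_step is None and pending:
            queue.append(pending.popleft())

        # Free agents pull tasks from the shared queue.
        while free_agents > 0 and queue:
            task = queue.popleft()
            free_agents -= 1
            if task == "controller":
                controller_running = True
            else:
                running_step = task

    return False  # still stuck after max_ticks


steps = ["stage_data", "stage_process", "stage_train"]
print(pipeline_completes(1, steps))  # one agent: the controller blocks its own steps
print(pipeline_completes(2, steps))  # two agents: the second one runs the steps
```

With one agent the first step sits in the queue forever because the only agent is busy running the controller; with two agents the second one picks the steps up one by one.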
It ran, thanks, but the original problem persists: the pipeline still shows as running even after all the tasks have completed.