Hey @<1639799308809146368:profile|TritePigeon86> , given that you want to retry on connection error, wouldn't it be easier to use retry_on_failure from PipelineController / PipelineDecorator.pipeline?
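For reference, a minimal sketch of what such a retry callback could look like, assuming `retry_on_failure` accepts a callable that receives the pipeline, the failed node and the number of retries so far, and returns True to allow another attempt. `MAX_RETRIES` and `retry_callback` are hypothetical names, not from the thread.

```python
from typing import Any

MAX_RETRIES = 3  # hypothetical limit, chosen only for illustration


def retry_callback(pipeline: Any, node: Any, retries: int) -> bool:
    """Return True to allow another attempt at the failed step."""
    # `retries` is assumed to be the number of attempts already made.
    return retries < MAX_RETRIES


# Assumed usage (needs clearml installed and a configured server):
# from clearml import PipelineController
# pipe = PipelineController(
#     name="my-pipeline",        # hypothetical name
#     project="my-project",      # hypothetical project
#     retry_on_failure=retry_callback,
# )
```

Passing a plain integer instead of a callable should also work if a fixed number of retries is enough.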
@<1523701070390366208:profile|CostlyOstrich36> I want the task to be queued and the pipeline to act like it's just a queued task and not fail
Hi @<1639799308809146368:profile|TritePigeon86> , can you please elaborate? What do you mean by external way?
maybe relaunch is not the proper solution, but I'm not sure what is, so I'm open to suggestions
this is a protected method and therefore should not be called from outside (meaning it's not good practice to do my_pipeline_controller._relaunch_node(failed_node))
I want to create a status_change_callback that checks if node failed due to connection loss, and if so re-adds the task to the queue
my current code looks like this:
import logging

from clearml import PipelineController

LOGGER = logging.getLogger(__name__)

def retry_on_connection_error(pipeline: PipelineController, node: PipelineController.Node, *_, **__) -> None:
    if node.job is not None:
        # True for any stop, including a forced stop of a non-responsive task
        is_stopped = node.job.is_stopped(aborted_nonresponsive_as_running=False)
        if is_stopped:
            # With the flag set, a non-responsive forced stop is reported as still running
            is_connection_lost = not node.job.is_stopped(aborted_nonresponsive_as_running=True)
            if is_connection_lost:
                LOGGER.warning(f"Node {node.name} lost connection with worker {node.job.worker()}")
                pipeline._relaunch_node(node)
            else:
                LOGGER.info(f"Node {node.name} is stopped")
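The connection-loss check in the callback can be pulled out and tested without a server. This is only a sketch, assuming `is_stopped(aborted_nonresponsive_as_running=True)` returns False for a task that was force-stopped as non-responsive, which is what the parameter name suggests. `lost_connection` and `FakeJob` are hypothetical helpers, not part of ClearML.

```python
from typing import Any


def lost_connection(job: Any) -> bool:
    """True if the job looks stopped only because it became non-responsive."""
    # Stopped for any reason, non-responsive forced stops included.
    stopped_any = job.is_stopped(aborted_nonresponsive_as_running=False)
    # Stopped for a reason other than a non-responsive forced stop.
    stopped_for_real = job.is_stopped(aborted_nonresponsive_as_running=True)
    return stopped_any and not stopped_for_real


class FakeJob:
    """Hypothetical stand-in for node.job, for local testing only."""

    def __init__(self, stopped: bool, nonresponsive: bool = False) -> None:
        self.stopped = stopped
        self.nonresponsive = nonresponsive

    def is_stopped(self, aborted_nonresponsive_as_running: bool = False) -> bool:
        # Mirrors the assumed semantics described above.
        if not self.stopped:
            return False
        if aborted_nonresponsive_as_running and self.nonresponsive:
            return False
        return True
```

With this split, the callback itself only decides what to do with the result (log, requeue, or ignore), and the detection logic can be checked against fake jobs like the one above.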