Hi @<1639799308809146368:profile|TritePigeon86> , can you please elaborate? What do you mean by external way?
This is a protected method, so it shouldn't be called from outside the class (meaning it's not good practice to call my_pipeline_controller._relaunch_node(failed_node)).
I want to create a status_change_callback that checks whether a node failed due to connection loss and, if so, re-adds the task to the queue.
my current code looks like this:
```python
import logging

from clearml import PipelineController

LOGGER = logging.getLogger(__name__)


def retry_on_connection_error(pipeline: PipelineController, node: PipelineController.Node, *_, **__) -> None:
    if node.job is None:
        return
    # True for any terminal state, including a "non-responsive" forced stop
    if not node.job.is_stopped(aborted_nonresponsive_as_running=False):
        return
    # A non-responsive abort is reported as "not stopped" when treated as running
    is_connection_lost = not node.job.is_stopped(aborted_nonresponsive_as_running=True)
    if is_connection_lost:
        LOGGER.warning(f"Node {node.name} lost connection with worker {node.job.worker()}")
        pipeline._relaunch_node(node)
    else:
        LOGGER.info(f"Node {node.name} is stopped")
```
@<1523701070390366208:profile|CostlyOstrich36> I want the task to be queued and the pipeline to act like it's just a queued task and not fail
maybe relaunch is not the proper solution, but I'm not sure what is, so I'm open to suggestions
Hey @<1639799308809146368:profile|TritePigeon86> , given that you want to retry on connection error, wouldn't it be easier to use retry_on_failure from PipelineController / PipelineDecorator.pipeline?
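For reference, a minimal sketch of what the retry_on_failure suggestion could look like. It assumes the callback form of the argument, where the controller calls a function with (pipeline, node, retries) and relaunches the node when it returns True. The names retry_failed_node, MAX_RETRIES, "retry-demo" and "examples" are made up for illustration. Whether a lost worker connection actually shows up as a failure that triggers this hook is not confirmed in this thread, so treat it as a starting point only.

```python
import logging
from types import SimpleNamespace

LOGGER = logging.getLogger(__name__)
MAX_RETRIES = 3  # hypothetical cap, pick whatever suits the pipeline


def retry_failed_node(pipeline, node, retries):
    """Return True to ask the controller to relaunch the node, False to let it fail."""
    should_retry = retries < MAX_RETRIES
    LOGGER.warning(
        "Node %s failed (retries so far: %d), retrying: %s",
        node.name, retries, should_retry,
    )
    return should_retry


def build_pipeline():
    # Imported here so the callback above can be tried out without a ClearML setup.
    from clearml import PipelineController

    return PipelineController(
        name="retry-demo",      # hypothetical pipeline name
        project="examples",     # hypothetical project name
        version="1.0.0",
        retry_on_failure=retry_failed_node,  # an int (number of retries) is also accepted
    )


if __name__ == "__main__":
    # Exercise the callback with a stand-in node object instead of a real pipeline.
    fake_node = SimpleNamespace(name="step_one")
    print(retry_failed_node(None, fake_node, 0))
    print(retry_failed_node(None, fake_node, MAX_RETRIES))
```

This stays on the public API, so there is no need to call the protected _relaunch_node from user code. The controller decides when to relaunch based on the callback's return value.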