We just call task.close() and then start a new Task.init() manually, so our "pipelines" are self-controlled.
That's why I want to keep it as separate tasks under a single pipeline.
Hmm, yes, if this is the case then you definitely have to have two Tasks (with execution info on each one).
So you could just create a "draft" pipeline Task and report everything to it? Does that make sense?
(By design, a pipeline is in charge of spinning up the Tasks and pulling the data/metrics from them if needed. In your case it sounds like you need the Tasks to push the data/metrics onto the pipeline Task, which is actually doable.)
So the flow can be:
1. Create the pipeline Task (draft)
2. Pass the pipeline Task ID to the "steps"
3. Have the steps report to the "pipeline" Task

Does that make sense?
Pseudo-ish code:
1. create pipeline
`pipeline = Task.create(..., task_type="controller")
pipeline.mark_started()
print(pipeline.id)`
2. launch step A (pass the pipeline Task ID via a command-line argument / OS environment variable)
`task = Task.init(...)
pipeline_id = os.environ['MY_MAIN_PIPELINE']
pipeline_task = Task.get_task(task_id=pipeline_id)
# send some metrics / reports etc.
pipeline_task.get_logger().report_scalar(...)
pipeline_task.get_logger().report_text(...)`
wdyt? (Obviously you need to somehow pass the pipeline Task ID to the steps. I'm not sure I understand how you actually launch these steps, but I'm assuming this is doable.)
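One way to hand the pipeline Task ID to the steps, as a minimal sketch: the launcher below copies the current environment, adds the ID under `MY_MAIN_PIPELINE` (the variable name the step reads in the pseudo-code), and starts a step script as a child process. The helper names `build_step_env` and `launch_step` are hypothetical, and starting the steps with `subprocess` is an assumption about how they get launched.

```python
import os
import subprocess
import sys

# Same variable name the step reads in the pseudo-code.
PIPELINE_ENV_VAR = "MY_MAIN_PIPELINE"


def build_step_env(pipeline_id, base_env=None):
    """Return a copy of the environment with the pipeline Task ID added."""
    env = dict(os.environ if base_env is None else base_env)
    env[PIPELINE_ENV_VAR] = pipeline_id
    return env


def launch_step(script_path, pipeline_id):
    """Run one step script as a child process and wait for it to finish."""
    return subprocess.run(
        [sys.executable, script_path],
        env=build_step_env(pipeline_id),
        check=True,  # raise if the step exits with a non-zero status
    )
```

With the pipeline Task from step 1, a call such as `launch_step("train_pipeline.py", pipeline.id)` would make the ID visible to the step through `os.environ['MY_MAIN_PIPELINE']`.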
BTW: why not just use clearml-agent for launching the steps ?
Hi AgitatedDove14 .
That way I lose some execution information: only the execution information from the last Task stays logged. That's why I want to keep it as separate tasks under a single pipeline.
Hi ScaryLeopard77
You can probably do: `Task.init(..., continue_last_task='task_id_here')`
This will continue a previously executed Task and log both steps in the same place.
Does that help?
BTW: you can also, of course, manually report to any Task while it is still running with:
`aux_task = Task.get_task(task_id_here)
aux_task.get_logger().report_scalar(...)`
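A hedged sketch of that manual reporting, wrapped in a small helper. The name `report_scalar_to_task` and the `get_task` parameter are hypothetical; the parameter exists only so the lookup can be swapped out when trying the helper without a ClearML server. By default it uses `Task.get_task`, and the scalar is sent with the logger's `report_scalar(title, series, value, iteration)`.

```python
def report_scalar_to_task(task_id, title, series, value, iteration, get_task=None):
    """Report one scalar to an existing Task, looked up by its ID."""
    if get_task is None:
        # Imported lazily so the helper can be exercised without clearml installed.
        from clearml import Task
        get_task = Task.get_task
    aux_task = get_task(task_id=task_id)
    aux_task.get_logger().report_scalar(
        title=title, series=series, value=value, iteration=iteration
    )
    return aux_task
```

A step could then call, for example, `report_scalar_to_task(pipeline_id, "loss", "train", 0.42, iteration=10)`; the title, series, and values here are placeholders.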
ScaryLeopard77, hi! Is there a specific reason for the aversion to pipelines? What is the use case?
"continue with this already created pipeline and add the currently run task to it"
I'm not sure I understand, can you please elaborate? (I'm pretty sure it's a pipelines feature)
The idea is that I first call the script start_new_pipeline.py, which should just create the pipeline, and then I call the scripts train_pipeline.py and evaluate_pipeline.py, which contain the tasks that should belong to the pipeline. However, I don't know what start_new_pipeline.py should look like so that the following tasks belong to the created pipeline.
I kind of understand the first step: create the pipeline task, keep it in draft state and save its ID. But how do you pass the ID to the following tasks and have them report to the pipeline (parent) task?
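On the step side, one possible sketch of picking up the ID: each step script resolves the pipeline Task ID from a command-line argument first and falls back to the `MY_MAIN_PIPELINE` environment variable. The function name `resolve_pipeline_id` and the `--pipeline-id` flag are hypothetical choices for illustration, not part of ClearML.

```python
import argparse
import os


def resolve_pipeline_id(argv=None, environ=None):
    """Return the pipeline Task ID from --pipeline-id, else from MY_MAIN_PIPELINE."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument("--pipeline-id", default=environ.get("MY_MAIN_PIPELINE"))
    # Ignore any other arguments the step script may define for itself.
    args, _unknown = parser.parse_known_args(argv)
    if not args.pipeline_id:
        raise SystemExit("pipeline Task ID missing: pass --pipeline-id or set MY_MAIN_PIPELINE")
    return args.pipeline_id
```

Inside train_pipeline.py or evaluate_pipeline.py this could be used as `pipeline_task = Task.get_task(task_id=resolve_pipeline_id())`, after which the step reports through `pipeline_task.get_logger()`.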