
I also created an issue in the repo directly. Thx for your help.
python3 -m clearml_agent --config-file clearml.conf daemon --foreground --queue services --service --docker --cpu-only
root@clement-controller-1:~# head clearml.conf
agent {
    default_docker {
        arguments: ["-v", "/var/run/docker.sock:/var/run/docker.sock"]
    }
}
Neat - looks like exactly what I was looking for, thx!
I am running clearml-agent 1.6.1
So I can confirm I have the same behavior with this minimal example:
#!/usr/bin/env python3
import fire
from typing import Optional
import time

from clearml import PipelineController


def step_one(a=1):
    print("Step 1")
    time.sleep(120)
    return True


def step_two(a=1):
    print("Step 2")
    time.sleep(120)
    return True


def launch():
    pipe = PipelineController(
        project="TEST",
        name="Pipeline demo",
        version="1.1",
        add_pipeline_tags=False,
        ...
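For completeness, the rest of that launch() presumably closes the constructor and registers the two functions as steps; the sketch below is an assumption for illustration (the step names, queue name and kwargs are mine), not the original code.
        # (remaining constructor arguments were elided in the original message)
    )
    # Sketch only: register both functions as pipeline steps and start the run.
    pipe.add_function_step(
        name="step_one",
        function=step_one,
        function_kwargs=dict(a=1),
    )
    pipe.add_function_step(
        name="step_two",
        function=step_two,
        function_kwargs=dict(a=1),
        parents=["step_one"],
    )
    pipe.start(queue="services")  # or pipe.start_locally() while debugging


if __name__ == "__main__":
    fire.Fire(launch)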
Hi, any chance you've had some time to check whether you can replicate this on your side?
And if I create myself a Pro account - can I somehow piggyback on the existing UIs to display the state of the Autoscaler Task?
Ok - I customized it a bit to our workflow, so I wanted to keep our "fork" of the autoscaler, but I guess this is not supported.
Neat - it works! Thanks for the quick response!
Yep - sounds perfect!
And same behavior if I make the dependency explicit via the return of the first one:
#!/usr/bin/env python3
import fire
from typing import Optional
import time

from clearml import PipelineController


def step_one(a=1):
    import time
    print("Step 1")
    time.sleep(120)
    return True


def step_two(a=1):
    import time
    print("Step 2")
    time.sleep(120)
    return True


def launch(
    tenant: str = "demo",
    loc_id: str = "common",
    tag: str = "test",
    pipeline_id: Optio...
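For reference, the explicit return-value dependency presumably looks roughly like the sketch below (my assumption, not the original code): step_one declares a named return value and step_two consumes it, so the controller only schedules step_two after step_one.
    # Sketch only (assumed wiring): step_two takes step_one's returned value,
    # which makes the dependency explicit to the PipelineController.
    pipe.add_function_step(
        name="step_one",
        function=step_one,
        function_kwargs=dict(a=1),
        function_return=["step_one_done"],
    )
    pipe.add_function_step(
        name="step_two",
        function=step_two,
        function_kwargs=dict(a="${step_one.step_one_done}"),
    )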
Step 1 was aborted, but the second one was still scheduled.
Yes, I agree, it should be considered as failed and the PipelineController should not trigger the following task which depends on the first one. My problem is that it's not the behavior I observe: the second task still gets scheduled for execution. Is there a way to specify that in the PipelineController logic?
- It's a pipeline from Tasks.
- clearml==1.13.2
- For instance, in this pipeline, if the first task fails, then the remaining tasks are not scheduled for execution, which is what I expect. I am just surprised that if the first task is instead aborted by the user, the following task is still scheduled for execution (and will fail because it's dependent on the first one completing).
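For reference, the closest knob I'm aware of is the abort_on_failure argument of PipelineController; a minimal sketch is below, with placeholder values. Whether it also treats a user-aborted step as failed is exactly the open question in this thread.
# Sketch only: ask the controller to stop the whole pipeline when a step
# fails, so that dependent steps are not scheduled. Values are placeholders.
pipe = PipelineController(
    project="TEST",
    name="Pipeline demo",
    version="1.1",
    abort_on_failure=True,
)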
Ok - good to know this is odd!
It's created like this (I removed some bits for readability):
def _run(pipeline_id, step):
    from pipeline_broker import pipeline
    pipeline.run_step(pipeline_id=pipeline_id, step=step)


def launch(
    cfg,
    queue: str = "default",
    abort_on_failure: bool = False,
    project: str = "TrainingPipeline",
    start_locally: bool = False,
    task_regex: str = ".*",
):
    ...
    pipe = PipelineController(
        project=project,
        name...
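A rough sketch of how the rest of that launch() plausibly continues, wiring the flags above through to the controller; the name, version, step names and kwargs below are my assumptions, not the original code.
        # (remaining constructor arguments were elided for readability)
        abort_on_failure=abort_on_failure,
    )
    # Sketch only: each step runs _run() on an agent pulled from `queue`;
    # step names and the pipeline_id value are placeholders.
    for step in ("prepare", "train", "evaluate"):
        pipe.add_function_step(
            name=step,
            function=_run,
            function_kwargs=dict(pipeline_id="<pipeline-id>", step=step),
            execution_queue=queue,
        )
    if start_locally:
        pipe.start_locally()
    else:
        pipe.start()  # the controller itself goes to the services queue by default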
Thx, working now on 1.14.2!
Hey, finally got to try it, sorry about the delay.
However, I tried on 1.14.1 but I still get the same behavior.