I also created an issue in the repo directly. Thx for your help.
python3 -m clearml_agent --config-file clearml.conf daemon --foreground --queue services --service --docker --cpu-only
Neat - looks like exactly what I was looking for, thx
And if I create myself a Pro account, can I somehow piggyback on the existing UI to display the state of the Autoscaler Task?
No, just the clearml-agent
Ok - I customized it a bit to our workflow, so I wanted to keep our "fork" of the autoscaler, but I guess this is not supported.
I am running clearml-agent 1.6.1
Neat - it works! Thanks for the quick response
root@clement-controller-1:~# head clearml.conf
agent {
    default_docker {
        arguments: ["-v", "/var/run/docker.sock:/var/run/docker.sock"]
    }
}
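For context, a minimal sketch of what a fuller agent.default_docker section usually looks like; the image value below is illustrative and not taken from this setup:

agent {
    default_docker {
        # default image used when a task does not specify one (illustrative value)
        image: "python:3.9"
        # extra arguments for `docker run`; mounting the host Docker socket lets
        # containers started by the agent launch containers themselves
        arguments: ["-v", "/var/run/docker.sock:/var/run/docker.sock"]
    }
}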
So I can confirm I have the same behavior with this minimal example:
#!/usr/bin/env python3
import fire
from typing import Optional
import time
from clearml import PipelineController

def step_one(a=1):
    print("Step 1")
    time.sleep(120)
    return True

def step_two(a=1):
    print("Step 2")
    time.sleep(120)
    return True

def launch():
    pipe = PipelineController(
        project="TEST",
        name="Pipeline demo",
        version="1.1",
        add_pipeline_tags=False,
...
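For reference, a self-contained sketch of how the truncated part of such a script is typically wired up with function steps; the step registration, queue name, and start call below are illustrative assumptions, not the original code:

#!/usr/bin/env python3
# Hypothetical completion sketch, not the original truncated script.
import time

from clearml import PipelineController

def step_one(a=1):
    print("Step 1")
    time.sleep(120)
    return True

def step_two(a=1):
    print("Step 2")
    time.sleep(120)
    return True

def launch():
    pipe = PipelineController(
        project="TEST",
        name="Pipeline demo",
        version="1.1",
        add_pipeline_tags=False,
    )
    # register the two functions as pipeline steps; step_two lists step_one as its parent
    pipe.add_function_step(name="step_one", function=step_one)
    pipe.add_function_step(name="step_two", function=step_two, parents=["step_one"])
    pipe.set_default_execution_queue("default")  # illustrative queue name
    pipe.start_locally(run_pipeline_steps_locally=False)

if __name__ == "__main__":
    launch()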
Hey, finally got to try it, sorry about the delay.
However, I tried on 1.14.1 but I still get the same behavior
Thx, working now on 1.14.2
Yep - sounds perfect
- It's a pipeline from Tasks.
- clearml==1.13.2
- For instance, in this pipeline, if the first task fails, the remaining tasks are not scheduled for execution, which is what I expect. I am just surprised that if the first task is instead aborted by the user, the following task is still scheduled for execution (and will fail because it's dependent on the first one completing).
Yes, I agree, it should be considered failed and the PipelineController should not trigger the following task, which depends on the first one. My problem is that it's not the behavior I observe: the second task still gets scheduled for execution. Is there a way to specify that in the PipelineController logic?
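A minimal sketch of the only related knob that seems relevant here, assuming abort_on_failure is the right flag; whether it also covers steps aborted by the user (rather than failed) is exactly the open question in this thread:

from clearml import PipelineController

pipe = PipelineController(
    project="TEST",
    name="Pipeline demo",
    version="1.1",
    # stop the whole pipeline (and skip pending steps) as soon as any step fails
    abort_on_failure=True,
)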
Ok - good to know, this is odd
It's created like this (I removed some bits for readability):
def _run(pipeline_id, step):
    from pipeline_broker import pipeline
    pipeline.run_step(pipeline_id=pipeline_id, step=step)

def launch(
    cfg,
    queue: str = "default",
    abort_on_failure: bool = False,
    project: str = "TrainingPipeline",
    start_locally: bool = False,
    task_regex: str = ".*",
):
    ...
    pipe = PipelineController(
        project=project,
        name...
And same behavior if I make the dependence explicit via the return value of the first one:
#!/usr/bin/env python3
import fire
from typing import Optional
import time
from clearml import PipelineController

def step_one(a=1):
    import time
    print("Step 1")
    time.sleep(120)
    return True

def step_two(a=1):
    import time
    print("Step 2")
    time.sleep(120)
    return True

def launch(
    tenant: str = "demo",
    loc_id: str = "common",
    tag: str = "test",
    pipeline_id: Optio...
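For reference, a hedged sketch of how that explicit return-value dependence is usually declared with function steps; the return name, queue, and wiring below are illustrative and not the truncated original:

#!/usr/bin/env python3
# Hypothetical wiring sketch: step_one's return value feeds step_two,
# making the dependence explicit in the pipeline DAG.
from clearml import PipelineController

def step_one(a=1):
    print("Step 1")
    return True

def step_two(a=1):
    print("Step 2")
    return True

pipe = PipelineController(project="TEST", name="Pipeline demo", version="1.1")
# step_one publishes its return value under the (illustrative) name "done"
pipe.add_function_step(name="step_one", function=step_one, function_return=["done"])
# step_two consumes "${step_one.done}", so it is only scheduled after step_one completes
pipe.add_function_step(
    name="step_two",
    function=step_two,
    function_kwargs={"a": "${step_one.done}"},
)
pipe.set_default_execution_queue("default")  # illustrative queue name
pipe.start_locally()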
Hi, any chance you've had some time to check whether you can replicate this on your side?
Step 1 was aborted, but the second one was still scheduled.