
Task completes normally. I'm using ClearML's AWS autoscaler.
The task is started the following way: an Airflow job run finds an older task, clones it, changes some params, and enqueues it.
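For reference, the clone-and-enqueue step on the Airflow side would look roughly like the sketch below; TEMPLATE_TASK_ID, the parameter name, and the queue name are made-up placeholders, not values from this thread:

from clearml import Task

template = Task.get_task(task_id=TEMPLATE_TASK_ID)            # the older task found by the Airflow job (placeholder id)
cloned = Task.clone(source_task=template, name="cloned-run")   # copy it together with its parameters
cloned.set_parameters({"General/mode": "prod"})                # change whatever params the new run needs (example name)
Task.enqueue(task=cloned, queue_name="aws-autoscaler-queue")   # queue name is a placeholder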
Is there any way to make sure that the task fails?
But what's the best way to catch the exception? All high-level ClearML function calls return normally.
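One way to surface the failure explicitly (a sketch only, not a verified answer for this setup; run_processing is a hypothetical stand-in for the dataset-building code) is to catch the exception yourself and call Task.mark_failed() before re-raising:

from clearml import Task

task = Task.current_task()   # the task the agent is running
try:
    run_processing()         # hypothetical placeholder for the dataset-building code
except Exception as exc:
    # flip the task to "failed" so the cloned run shows as failed in the UI,
    # then re-raise so the process also exits with a non-zero code
    task.mark_failed(status_reason=str(exc))
    raise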
my code:
dataset = Dataset.create(
    dataset_project=PROJECT_NAME,
    dataset_name=f"processed_{mode}",
    dataset_tags=task.get_tags(),
    parent_datasets=None,
    use_current_task=False,
    output_uri=BUCKET,
)
dataset.add_files(path, verbose=True)
dataset.upload(verbose=True)
dataset.finalize(verbose=True)
It seems that connecting it as a configuration is more convenient than uploading an artifact, because artifacts are deleted when a task is cloned. The code is very simple:
That sounds like an interesting hack 😃 I'll try it out, thanks!
common_module = task.connect_configuration("../common.py", "common.py")
if not task.running_locally():
    import shutil
    shutil.copy(common_module, "common.py")
from common import test_common
test_common()
I also cannot create a package out of the common code, because the package registry is inside the internal network as well.
Tried it on 1.13.1. Same problem. @SuccessfulKoala55, any advice?
So you'd recommend setting use_current_task=False when creating the dataset in this task, or should this be done somehow differently?
from clearml import Task, Dataset

task = Task.init(
    project_name="MyProject",
    task_name="MyTask",
    task_type=Task.TaskTypes.data_processing,
    reuse_last_task_id=False,
    output_uri="...",
)

with open("new_file.txt", "w") as file:
    file.write("Hello, world!")

dataset = Dataset.create(parent_datasets=None, use_current_task=True)
dataset.add_files(".", wildcard="new_file.txt", verbose=True)
dataset.upload(verbose=True)
dataset.finalize(verbose=True)
I did something similar at my previous job (we had open-source ClearML deployed). The problem I described here was not present there. I liked this approach; it was convenient that the dataset_id and task_id are the same.
Any updates? Should I provide any extra context?