my code:
dataset = Dataset.create(
    dataset_project=PROJECT_NAME,
    dataset_name=f"processed_{mode}",
    dataset_tags=task.get_tags(),
    parent_datasets=None,
    use_current_task=False,
    output_uri=BUCKET,
)
dataset.add_files(path, verbose=True)
dataset.upload(verbose=True)
dataset.finalize(verbose=True)
but what's the best way to catch the exception? All of the high-level ClearML calls return normally.
Is there any way to make sure the task actually fails?
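The only direction I've come up with so far is wrapping the dataset calls in a try/except and marking the task failed myself. This is just a sketch, and it assumes the dataset calls actually raise (which is exactly what I'm not sure about); it uses task.mark_failed() to force the failed state:

try:
    dataset = Dataset.create(
        dataset_project=PROJECT_NAME,
        dataset_name=f"processed_{mode}",
        use_current_task=False,
        output_uri=BUCKET,
    )
    dataset.add_files(path, verbose=True)
    dataset.upload(verbose=True)
    dataset.finalize(verbose=True)
except Exception as exc:
    # force the task into the "failed" state instead of letting it complete
    task.mark_failed(status_reason=str(exc))
    raise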
from clearml import Task, Dataset
task = Task.init(
    project_name="MyProject",
    task_name="MyTask",
    task_type=Task.TaskTypes.data_processing,
    reuse_last_task_id=False,
    output_uri="...",  # actual URI omitted
)
with open("new_file.txt", "w") as file:
    file.write("Hello, world!")
dataset = Dataset.create(parent_datasets=None, use_current_task=True)
dataset.add_files(".", wildcard="new_file.txt", verbose=True)
dataset.upload(verbose=True)
dataset.finalize(verbose=True)
So you'd recommend setting use_current_task=False when creating the dataset in this task, or should this be done differently somehow?
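To make sure I understand, the suggestion would look roughly like this in the same script (just a sketch; the dataset name is a placeholder):

task = Task.init(
    project_name="MyProject",
    task_name="MyTask",
    task_type=Task.TaskTypes.data_processing,
)

# let the dataset create its own backing task instead of reusing this one
dataset = Dataset.create(
    dataset_project="MyProject",
    dataset_name="MyDataset",  # placeholder name
    use_current_task=False,
)
dataset.add_files(".", wildcard="new_file.txt", verbose=True)
dataset.upload(verbose=True)
dataset.finalize(verbose=True)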
Any updates? Should I provide any extra context?
I did something similar at my previous job (we had open-source ClearML deployed), and the problem I described here was not present there. I liked this approach; it was convenient that dataset_id and task_id are the same.
Tried it on 1.13.1. Same problem. @SuccessfulKoala55 any advice?
# attach common.py as a configuration object; when the task runs remotely,
# connect_configuration() returns the path of the downloaded copy
common_module = task.connect_configuration("../common.py", name="common.py")
if not task.running_locally():
    import shutil
    # place the downloaded copy next to the script so the import below works
    shutil.copy(common_module, "common.py")
from common import test_common
test_common()
I also can't create a package out of the common code, because the package registry is inside the internal network as well.
That sounds like an interesting hack 😃 I'll try it out, thanks!
It seems that connecting it as a configuration is more convenient than uploading an artifact, because artifacts are deleted when a task is cloned. The code is very simple (see the snippet above).
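For comparison, the artifact route I decided against would look roughly like this (just a sketch, with a hypothetical artifact name), and the artifact is exactly what gets dropped when the task is cloned:

if task.running_locally():
    # local run: attach common.py to the task as an artifact
    task.upload_artifact(name="common", artifact_object="../common.py")
else:
    # remote run: pull a local copy of the artifact before importing it
    import shutil
    local_copy = task.artifacts["common"].get_local_copy()
    shutil.copy(local_copy, "common.py")
from common import test_common
test_common()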
The task completes normally. I'm using ClearML's AWS autoscaler.
The task is started the following way: an Airflow job finds an older task, clones it, changes some parameters, and enqueues it.
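Roughly, the Airflow side does something like this (a sketch with placeholder names and parameters, not the actual DAG code):

from clearml import Task

# find the previous task to use as a template (placeholder filters)
template = Task.get_task(project_name="MyProject", task_name="MyTask")

# clone it, tweak parameters, and push it to the autoscaler's queue
cloned = Task.clone(source_task=template, name="MyTask (airflow run)")
cloned.set_parameters({"Args/mode": "train"})      # hypothetical parameter
Task.enqueue(cloned, queue_name="aws_autoscaler")  # hypothetical queue name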