As I understand, providing this param at Task.init() inside the subtask is too late, because the step has already started.
If you are running the task on an agent (which I assume you do), then one way would be to configure "default_output_uri" in the agent's clearml.conf file.
The other option is to change it on the task at creation time: task.storage_uri = 's3://...'
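For the second option, a minimal sketch, assuming a hypothetical bucket path (the agent-side alternative is the sdk.development.default_output_uri key in clearml.conf):

from clearml import Task

# Create the step task as a draft and point its output to S3 before it ever runs.
step = Task.create(project_name='examples', task_name='step with s3 output')
step.storage_uri = 's3://my-bucket/clearml-output'  # hypothetical bucket/path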
I discovered that task.set_base_docker() is allowed only locally (there is a check for it in task.py).
I wanted to do the following:

task = Task.init(
    project_name=..., task_name=...,
    task_type=Task.TaskTypes.controller)  # base pipeline task

After that, I wanted to create the steps from scratch, because I have many steps and I hope to avoid manual editing in the GUI (commits and other things). I create these tasks:

new_task = Task.create(...)

and finally I add them to the pipe:

pipe.add_step(...)
I have a problem with some execution properties, like docker and output_uri. I've successfully passed commit, branch and other params from the base pipeline task to the step tasks, but I've done it in a not very legal way:

Task.create(...,
    commit=task._data._property_script._property_version_num,
    branch=task._data._property_script._property_branch,
    ...)
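A slightly cleaner route for the same thing, assuming the script info is populated on the base task (task.data is the public accessor for the same structure; project and task names below are placeholders):

from clearml import Task

# `task` is the base pipeline (controller) task from Task.init() above.
task = Task.init(project_name='examples', task_name='pipeline demo',
                 task_type=Task.TaskTypes.controller)
script = task.data.script  # public property instead of the private _data attributes
new_task = Task.create(
    project_name='examples', task_name='step from base task',
    repo=script.repository,
    branch=script.branch,
    commit=script.version_num,
)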
Hi ApprehensiveFox95,
Can you try

task = Task.create(...)
task.set_base_docker("docker command")

?
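That is, a minimal sketch of the suggestion (project, task and image names are placeholders):

from clearml import Task

# Create the step task as a draft, then attach the docker image it should run in.
step = Task.create(project_name='examples', task_name='pipeline step 1')
step.set_base_docker('nvcr.io/nvidia/pytorch:20.11-py3')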
Hi ApprehensiveFox95
I think this is what you are looking for:

step1 = Task.create(
    project_name='examples', task_name='pipeline step 1 dataset artifact',
    repo=' ',
    working_directory='examples/pipeline',
    script='step1_dataset_artifact.py',
    docker='nvcr.io/nvidia/pytorch:20.11-py3'
).id
step2 = Task.create(
    project_name='examples', task_name='pipeline step 2 process dataset',
    repo=' ',
    working_directory='examples/pipeline',
    script='step2_data_processing.py',
    docker='nvcr.io/nvidia/pytorch:20.11-py3'
).id
step3 = Task.create(
    project_name='examples', task_name='pipeline step 3 train model',
    repo=' ',
    working_directory='examples/pipeline',
    script='step3_train_model.py',
    docker='nvcr.io/nvidia/pytorch:20.11-py3'
).id

# Connecting ClearML with the current process,
# from here on everything is logged automatically
task = Task.init(project_name='examples', task_name='pipeline demo',
                 task_type=Task.TaskTypes.controller, reuse_last_task_id=False)
pipe = PipelineController(default_execution_queue='default', add_pipeline_tags=False)
pipe.add_step(name='stage_data', base_task_project='examples', base_task_id=step1,
              clone_base_task=False)
pipe.add_step(name='stage_process', parents=['stage_data', ],
              base_task_project='examples', base_task_id=step2,
              clone_base_task=False,
              parameter_override={'General/dataset_url': '${stage_data.artifacts.dataset.url}',
                                  'General/test_size': 0.25})
pipe.add_step(name='stage_train', parents=['stage_process', ],
              base_task_project='examples', base_task_id=step3,
              clone_base_task=False,
              parameter_override={'General/dataset_task_id': '${stage_process.id}'})

You might need the latest clearml:

pip install git+
Also I have a question about the output_uri parameter:
Can I provide this parameter to the subtask in Task.create() or after that?
As I understand, providing this param at Task.init() inside the subtask is too late, because the step has already started.
From the ClearML UI you can just change the value under the BASE DOCKER IMAGE section to your image
AgitatedDove14 I use exactly this version
This is definitely a bug; the super class should have the same condition (the issue is the check for whether you are trying to change the "main" task)
Thanks ApprehensiveFox95
I'll make sure we push a fix 🙂
Maybe I missed something, what's your flow? Do you have some kind of "template task" that you clone?
TimelyPenguin76 Thanks, it helped me locally, but it doesn't work when I start the pipeline task from the GUI
after that, I wanted to create steps from scratch, because I have many steps and I hope to avoid manual editing in the GUI (commits and other things). I create these tasks:
You can use

Task.init(project_name=<your project name>, task_name=<your task name>)

in the template task instead of the Task.create() call, and it will have all the inputs for you.
After that, add

task.set_base_docker("docker command")

and it will configure the docker for the task.
Once you finish configuring the task, add

task.execute_remotely()

and it won't actually run the task but only register it in the ClearML UI - so you have a template task ready for use (just run it once from your local machine to register the task).
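Putting the whole template flow together, a minimal sketch (project, task and image names are placeholders; per the description above, execute_remotely() without a queue only registers the task as a draft):

from clearml import Task

# Run locally just long enough to capture repo, commit, requirements
# and the docker image, then stop.
task = Task.init(project_name='examples', task_name='pipeline step template')
task.set_base_docker('nvcr.io/nvidia/pytorch:20.11-py3')
# Abort local execution; the task stays in the ClearML UI as a draft template.
task.execute_remotely()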
Draft created successfully, but it doesn't contain the property with the docker command.
Could you help me?
ApprehensiveFox95 could you test with the latest RC? I think there was a fix:

pip install clearml==0.17.5rc5
Yes, but I wanted to create the task automatically and then add it to the pipeline for running. I hoped to avoid additional edits in the GUI.
Apparently I don't understand something.
I tried using Task.init() instead of Task.create(), but I got

clearml.errors.UsageError: Current task already created and requested task name 'exp_1.0.31_Main' does not match current task name 'exp_1.0.31'. If you wish to create additional tasks use Task.create

because I wanted to initialize a not-yet-existing subtask with a new unique task_name. If I clone a subtask instead of creating a new one every time, then, as I understand, I have no way to change the commit version and other execution params. Is that right?
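For what it's worth, a cloned task can still be edited while it is a draft. A minimal sketch, assuming a recent clearml that exposes Task.set_script (treat the method and all names/values below as assumptions):

from clearml import Task

# Clone the template while it is a draft, then edit execution params on the clone.
template = Task.get_task(project_name='examples', task_name='pipeline step template')
cloned = Task.clone(source_task=template, name='exp_1.0.31_step')  # placeholder name
# Newer clearml versions expose set_script for changing repo/branch/commit
# on a draft task; verify it exists in your version.
cloned.set_script(branch='my-branch', commit='abc123')  # hypothetical values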