The easiest is to configure it as the default output_uri in the clearml.conf file of the agent, wdyt?
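For example, something along these lines in the agent's clearml.conf (a minimal sketch; the bucket path is hypothetical):

sdk {
    development {
        # every task executed by this agent uploads artifacts/models here by default
        default_output_uri: "s3://my-bucket/clearml"
    }
}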
Ahh that’s great, thank you.
And then I could use StorageManager or whatever to get the files. Perfect
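i.e. something like this (sketch; the remote URL is hypothetical):

from clearml import StorageManager

# fetch a local (cached) copy of the remote file, returns the local path
local_path = StorageManager.get_local_copy(remote_url="s3://my-bucket/clearml/artifacts/images.zip")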
It also seems that PipelineDecorator.upload_artifact is not compatible with caching, sadly,
Both use the exact same mechanism for uploading artifacts (including caching for downloaded artifacts). In terms of caching pipeline components, this works on a component level (i.e. same code/task + same arguments equals a cache hit)
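In code that means something like this (a sketch; the component body and names are hypothetical):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['processed'], cache=True)
def preprocess(source_url: str):
    # with cache=True, re-running the pipeline with the same code and the same
    # source_url skips execution and reuses the stored return value
    processed = source_url + '/processed'
    return processed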
What exactly are you getting? How is it that "PipelineDecorator.upload_artifact" uploads to a different storage? Is that reproducible?
The return objects were stored in S3, but PipelineDecorator.upload_artifact still uploaded to the file server. Not sure what was up with that, but as explained in my next comment it did work when I tried again.
It also seems that PipelineDecorator.upload_artifact is not compatible with caching, sadly, but that is another issue for another thread that I will be starting on Monday.
Have a good weekend
I have added a lot of detail to this, sorry.
The inline comments in the code talk about that specific script/implementation.
I have added a lot of context in the docstring at the top.
Hmm. Okay. Thanks
So the way it works is that when you run a component, the return value together with the entire function execution is cached, basically:
this did NOT add the artifact to the pipeline via caching on subsequent runs ❌
you just need to do:
PipelineDecorator.upload_artifact(name='images', artifact_object=img_dir, wait_on_upload=True)
return Task.current_task().artifacts['images'].url
This will return the URL of the uploaded images (i.e. the S3 bucket), which means that if this component is cached you will still get the URL on subsequent runs:
image_bucket = gen_random_images()
second_step(image_bucket)
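Putting it together, a minimal end-to-end sketch (the image-generation body, the downstream step, and the project/pipeline names are hypothetical):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['image_bucket'], cache=True)
def gen_random_images():
    import os
    from clearml import Task
    from clearml.automation.controller import PipelineDecorator
    # hypothetical stand-in for real image generation
    img_dir = '/tmp/images'
    os.makedirs(img_dir, exist_ok=True)
    open(os.path.join(img_dir, 'img_0.png'), 'wb').close()
    # upload the folder as an artifact of this component's Task and wait for it to finish
    PipelineDecorator.upload_artifact(name='images', artifact_object=img_dir, wait_on_upload=True)
    # return the storage URL (e.g. s3://...), so a cache hit still yields the location
    return Task.current_task().artifacts['images'].url

@PipelineDecorator.component(return_values=['local_images'])
def second_step(image_bucket):
    from clearml import StorageManager
    # pull a local copy of the images from wherever the first step stored them
    return StorageManager.get_local_copy(remote_url=image_bucket)

@PipelineDecorator.pipeline(name='images_pipeline', project='examples', version='0.1')
def run_pipeline():
    image_bucket = gen_random_images()
    second_step(image_bucket)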
you can always get the currently executing Task (of any part of the pipeline) with Task.current_task(), no need to call "pipe._get_pipeline_task()"
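i.e. from inside any step (sketch):

from clearml import Task

# inside any pipeline component this returns the Task that is executing it
task = Task.current_task()
print(task.id, list(task.artifacts.keys()))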