But maybe another solution would be to pass the configuration file paths as function arguments, and then read and parse them inside the pipeline
Yes, when the parameters that are connected do not have nested dictionaries, everything works fine. The problem comes when I try to do something like this:
` from clearml import Task
task = Task.init(project_name="Examples", task_name="task with connected dict")
args = {}
args["period"] = {"start": "2020-01-01 00:00", "end": "2020-12-31 23:00"}
task.connect(args) `
and the clone task is like this:
` from clearml import Task
template_task = Task.get_task(task_id="<Your template task id>"...
Thanks to you for fixing it so quickly!
That's right! run_locally() does just what I was expecting
Well, I am thinking of the case where there are several pipelines in the system, so that when filtering a task by its name and project I could get several tasks. How could I build a filter for Task.get_task(task_filter=...) that returns only the task whose parent task is the pipeline task?
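Something like this is what I have in mind, using the plural Task.get_tasks (I'm only guessing that task_filter accepts a 'parent' field, and the project/task names below are placeholders):
` from clearml import Task

# Hedged sketch: assuming the task_filter dict accepts a 'parent' field
# holding the id of the pipeline (controller) task.
pipeline_task = Task.get_task(project_name="Examples", task_name="my pipeline")
step_tasks = Task.get_tasks(
    project_name="Examples",
    task_name="my step",
    task_filter={"parent": pipeline_task.id},
) `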
Anyway, is there any way to retrieve the information stored in the RESULTS tab of ClearML Web UI?
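Something like this, maybe? (I'm just guessing these method names, I haven't checked that they all exist):
` from clearml import Task

task = Task.get_task(task_id="<task id>")
# Guessing at the API here: scalars reported to the SCALARS sub-tab
scalars = task.get_reported_scalars()
# and the last few lines shown in the CONSOLE sub-tab
console = task.get_reported_console_output(number_of_reports=5) `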
Great AgitatedDove14 , I tested it on the mock example and it worked just as I expected 🎉
Sure, here is a trivial example:
` from clearml import Dataset
dataset = Dataset.create(dataset_name="Dataset_v1.1.3", dataset_project="Mocks")
dataset.finalize()
loaded_dataset = Dataset.get(dataset_id=dataset.id) `
By the way, where can I change the default artifacts location ( output_uri ) if I have a script similar to this example (I mean, from the code, not the agent's config):
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
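In a plain (non-pipeline) script I know I can do something like this, but I'm not sure what the equivalent is for the decorator-based example (the bucket path below is just a placeholder):
` from clearml import Task

# Set the default artifacts/models destination from code at init time.
task = Task.init(
    project_name="Examples",
    task_name="artifacts location",
    output_uri="s3://my-bucket/artifacts",  # placeholder destination
) `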
Hey AgitatedDove14 ! Any news on this? 🙂
Okay! I'll keep an eye out for updates.
Mmm, but what if the dataset size is too large to be stored in the .cache path? Will it be stored there anyway?
I mean the agent that will run the function (which represents a pipeline step) should clone the repo in order to find the location of the project modules that are required for the function to be executed. Also, I have found that clearml does not automatically detect the imports specified within the function decorated with PipelineDecorator.component (even though I followed a scheme similar to the one in the example https://github.com/allegroai/clearml/blob/master/examples/pipeline/pi...
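Just to be clear about what I mean, this is the kind of component I'm writing; I would expect the import inside the body to be picked up, and I'm only guessing that the packages argument is the intended workaround (package names are placeholders):
` from clearml.automation.controller import PipelineDecorator

# Imports live inside the step body, plus an explicit packages list as my
# guess at a workaround for the missing auto-detection.
@PipelineDecorator.component(return_values=["df"], packages=["pandas"])
def load_data(path: str):
    import pandas as pd
    df = pd.read_csv(path)
    return df `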
Hi Martin,
Actually Task.add_requirements
behaves as I expect, since that part of the code is in the preprocessing script, and for that task it does install all the specified packages. So, my question could be rephrased as follows: when working with PipelineController, is there any way to avoid creating a new development environment for each step of the pipeline?
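For reference, the preprocessing script calls it before Task.init, more or less like this (the package names are just examples, not the real ones):
` from clearml import Task

# Declared before Task.init so they end up in the task's requirements.
Task.add_requirements("pandas")
Task.add_requirements("scikit-learn", "1.0.2")
task = Task.init(project_name="Examples", task_name="preprocessing") `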
According to the https://clear.ml/docs/latest/docs/clearml_agent provided in the official ClearML documentatio...
Sure! That definitely makes sense. Where can I specify callbacks in the PipelineDecorator
API?
I mean what should I write in a script to import the APIClient? (sorry if I'm not explaining myself properly 😅 )
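Is it something like this? (just guessing the module path and the query call here):
` from clearml.backend_api.session.client import APIClient

client = APIClient()
# e.g. list the tasks of a project (guessing at the exact call)
tasks = client.tasks.get_all(project=["<project id>"]) `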
To sum up, we agree that it would be nice to enable tags for nested components. I will keep playing with the capabilities of nested components and report bugs as I come across them!
The thing is, I don't know in advance how many models there will be in the inference stage. My approach is to read the configurations of the operational models from a database in a for loop, and within that loop enqueue all the inference tasks (one task for each deployed model). For this I need the system to be able to run several pipelines at the same time. Since, as you told me, this is not possible for now because pipelines are based on singletons, my alternative is to use components
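Schematically, what I'm after is something like this (the component body and the helper are made up, just to illustrate the idea):
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(execution_queue="inference_queue")
def run_inference(model_config: dict):
    # Made-up body: load the model described by model_config and run inference
    print(f"Running inference for {model_config['name']}")

def launch_all(model_configs: list):
    # One inference step per deployed model, all launched from the same loop.
    for cfg in model_configs:
        run_inference(cfg) `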
Hi AgitatedDove14 , great, glad it was fixed quickly!
By the way, before releasing version 1.1.3 you might want to take a look at this mock example. I'm trying to run the same pipeline (with different configurations) in a single for loop, as you can see below:
` from clearml import Task
from clearml.automation.controller import PipelineDecorator
@PipelineDecorator.component(return_values=["msg"], execution_queue="myqueue1")
def step_1(msg: str):
    msg += "\nI've survived step 1!"
    re...
They share the same code (i.e. the same decorated functions), but using a different configuration.
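The pattern of the loop is basically this (the names below are placeholders, not the exact code from the mock):
` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.pipeline(name="mock pipeline", project="Examples", version="0.1")
def run_pipeline(msg: str):
    return step_1(msg)

if __name__ == "__main__":
    # Same decorated pipeline, a different configuration on each iteration.
    for msg in ["first config", "second config"]:
        run_pipeline(msg) `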
Sure, converting pipelines into components also works for me (setting aside that the problem with LazyEvalWrapper return values still needs fixing). But this way some interesting features of the pipeline are missing, such as displaying the step execution DAG in the PLOTS tab.
Yes, I like it! I was already getting used to the execute_steps_as_functions argument of PipelineDecorator.debug_pipeline(), but I find your proposal more intuitive.
Great, thank you very much for the info! I just spotted the get_logger
classmethod. As for the initial question, that's just the behavior I expected!
AgitatedDove14 I ended up with two pipelines being executed until they completed the workflow but duplicating each of their steps. You can check it here:
https://clearml.slack.com/files/U02A5DGPMPU/F02SR3G9RDK/image.png
Where can I find this documentation?
Mmm what would be the implications of not being part of the DAG? I mean, how could that step be launched if it is not part of the execution graph?
Hi AgitatedDove14 ,
Any updates on the new ClearML release that fixes the bugs we mentioned in this thread? :)
AgitatedDove14 After checking, I discovered that apparently it doesn't matter whether each pipeline is executed by a different worker; the error persists. Honestly, this has me puzzled. I'm really looking forward to getting this functionality right because it's an aspect that would make ClearML shine even more.