Oddly enough I didn't run into this problem today 🤔 If it happens to me again, I'll return to this thread 🙂
They share the same code (i.e. the same decorated functions), but with a different configuration.
So, for example, if I have a machine with 64 CPU cores, I will be able to run up to 64 agents (in case my system consists of only this machine), right?
Mmm, I see. So the agent takes the parameters from the base task registered in the server. Then if I call task.get_parameters_as_dict on a task that has not been executed by an agent, should I get the original types of the values?
So ClearML will scan all the repository code searching for package dependencies? Is that right?
Okay, so I could signal to the main pipeline the exception raised in any of the pipeline components and it should halt the whole pipeline. However, are you thinking of including this callback feature in the new pipelines as well?
The thing is, I don't know in advance how many models there will be in the inference stage. My approach is to read the configurations of the operational models from a database in a for loop, and in that loop enqueue all the inference tasks (one task for each deployed model). For this I need the system to be able to run several pipelines at the same time. Since you told me this is not possible for now, as pipelines are based on singletons, my alternative is to use components.
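To make the fan-out concrete, here is the pure-Python part of what I mean (field names like model_id and checkpoint are hypothetical; each resulting dict would then go to a cloned copy of the base inference task via Task.clone + task.set_parameters + Task.enqueue, one task per model):

```python
# Sketch: one parameter-override dict per deployed model, read from the DB.
# Field names are hypothetical placeholders for my real config schema.
def build_task_overrides(model_configs):
    """Turn each model's DB row into the parameter overrides for its task."""
    overrides = []
    for cfg in model_configs:
        overrides.append({
            "task_name": f"inference_{cfg['model_id']}",
            # These keys would be written into the cloned task's parameters
            "General/model_id": cfg["model_id"],
            "General/checkpoint": cfg["checkpoint"],
        })
    return overrides

configs = [
    {"model_id": "fraud_v1", "checkpoint": "s3://models/fraud_v1.pt"},
    {"model_id": "churn_v2", "checkpoint": "s3://models/churn_v2.pt"},
]
print(build_task_overrides(configs)[0]["task_name"])  # inference_fraud_v1
```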
Hi AgitatedDove14, gotcha. So how can I temporarily fix it? I can't find something like task.set_output_uri() in the official docs. Or do you plan to solve this problem in the very short term?
Thanks, I'd appreciate it if you let me know when it's fixed :D
AgitatedDove14 In the 'status.json' file I could see the 'is_dirty' flag is set to True
Indeed it does! But what still puzzles me is why I get the path below when running dataset.get_local_copy() on one of the machines in my cluster:
/home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_61ff8d4335dd4b74bd78c3576fa44131.clearml
Why is it pointing to a .lock file?
I can't figure out what might be going on
Thanks to you for fixing it so quickly!
The scheme is similar to the following:
```
                  main_pipeline
            (PipelineDecorator.pipeline)
                       |
         |--------------------------------|
         |                                |
inference_orchestrator_1        inference_orchestrator_2
(PipelineDecorator.component,   (PipelineDecorator.component,
 acting as a pipeline)           acting as a pipeline)
         |                                |
        ...                              ...
```
Well, just as you can pass the 'task_type' argument in PipelineDecorator.component, it might be a good option to allow passing the rest of the Task.init arguments as they are passed in the original method (without using a dictionary).
Or perhaps the complementary scenario with a continue_on_failed_steps parameter, which could be a list containing only the steps whose failure can be ignored.
But I cannot go back to version v1.1.3 because there is another bug related to the Dataset tags
So great! That would be a feature that makes the work much easier, instead of having to clone the task and launch it with different parameters. It could even be considered more pythonic. Do you have an immediate solution in mind to keep moving forward before the new release is ready? :)
My guess is to manually read and parse the string that clearml-agent list returns, but I'm pretty sure there's a cleaner way to do it, isn't there?
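Just to show the kind of manual parsing I'd rather avoid (the line format in the sample is hypothetical; the real `clearml-agent list` output may well differ):

```python
# Sketch of the parsing fallback I had in mind. The listing format below
# is a hypothetical example, not the guaranteed `clearml-agent list` output.
def parse_worker_ids(listing):
    """Pull the worker id (first token) from each non-empty, non-header line."""
    workers = []
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("worker_id"):  # skip a header row
            continue
        workers.append(line.split()[0])
    return workers

sample = """worker_id        queue
machine-a:0      default
machine-b:gpu0   gpu
"""
print(parse_worker_ids(sample))  # ['machine-a:0', 'machine-b:gpu0']
```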
Nice. In the meantime, as a workaround, I will implement some temporary parsing code at the beginning of the step functions.
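Something along these lines is what I mean (just a sketch; cast_param is my own hypothetical helper for parameters that arrive as strings):

```python
# Temporary parsing at the top of each step: parameters may arrive
# stringified, so best-effort cast them back with ast.literal_eval.
import ast

def cast_param(value):
    """Best-effort cast of a stringified parameter back to its Python type."""
    if not isinstance(value, str):
        return value  # already a real type, nothing to do
    try:
        return ast.literal_eval(value)  # handles ints, floats, bools, lists...
    except (ValueError, SyntaxError):
        return value  # plain string, keep as-is

print(cast_param("3"), cast_param("[1, 2]"), cast_param("hello"))
```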
Yep, you were absolutely right. What Dask did not like was the object self.preprocesser inside read_and_process_file, not Task.init. Since the dask.distributed.Client is initialized in that same class, maybe it's something Dask doesn't allow.
Sorry for blaming ClearML without solid evidence x)
Hi! Not really. It's rather random :/
Please let me know as soon as you have something :)
But I was actually asking about accessing the Pipeline task ID, not the tasks corresponding to the components.
Oh, I see. In the meantime I will duplicate the function and rename it so I can work with a different configuration. I really appreciate your effort, as well as the continuous feedback that keeps improving this wonderful library!