Hi CostlyOstrich36 AgitatedDove14
Oh no, I am not trying to say that I am using each agent to run a single task. I have several agents listening to a number of queues so that they are busy most of the time. As we talked about, it is not possible to run multiple pipelines (PipelineDecorator.pipeline) simultaneously in a single process. That's why I have been testing launching pipelines locally in different subprocesses, and this way I have managed to run several pipelines concurrently...
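To make the subprocess approach concrete, here is a minimal sketch of what I mean, using only the standard library. The helper name `launch_concurrently` is my own and is not part of ClearML; each command is assumed to be a separate script that defines and starts exactly one pipeline.

```python
import subprocess
import sys
from typing import List, Sequence


def launch_concurrently(commands: Sequence[Sequence[str]]) -> List[int]:
    """Start every command in its own process, then wait for all of them."""
    # Popen returns immediately, so all processes run at the same time.
    procs = [subprocess.Popen(list(cmd)) for cmd in commands]
    # wait() blocks until the process exits and returns its exit code.
    return [proc.wait() for proc in procs]
```

With two hypothetical scripts, the call would look like `launch_concurrently([[sys.executable, "pipeline_a.py"], [sys.executable, "pipeline_b.py"]])`. Since every pipeline lives in its own interpreter, the one-pipeline-per-process limitation is not hit.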
If I try to connect a dictionary of type dict[str, list] with task.connect, when retrieving this dictionary with task.get_parameter I get another dictionary of type dict[str, str]. Therefore, I see the same behavior using task.connect :/
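As a possible workaround on my side, the values could be parsed back after retrieval. This is only a sketch: it assumes the strings I get back are the plain `str()` form of the original Python values, which I have not verified for every type, and `restore_values` is a hypothetical helper, not a ClearML function.

```python
import ast
from typing import Any, Dict


def restore_values(params: Dict[str, str]) -> Dict[str, Any]:
    """Parse stringified values back into Python objects where possible."""
    restored: Dict[str, Any] = {}
    for key, value in params.items():
        try:
            # literal_eval only accepts Python literals (lists, numbers, ...),
            # so it does not execute arbitrary code.
            restored[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Not a literal (for example a plain word): keep the string.
            restored[key] = value
    return restored


original = {"layers": [64, 32], "names": ["a", "b"]}
# Simulate what I get back: every value turned into its str() form.
stringified = {key: str(value) for key, value in original.items()}
assert restore_values(stringified) == original
```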
Hi AnxiousSeal95 !
Yes, the main reason is to unclutter the ClearML Web UI, but also to free up space on our server (mainly due to the large size of the datasets). Once the models are trained, I want to retrain them periodically, and to do so I would like all the data specifications and artifacts generated during training to be linked to the model found under the "Models" section.
What I propose is somewhat similar to the functionality of clearml.Dataset. These datasets are themselves a task t...
AgitatedDove14 Exactly, I've run into the same problem
Hi Martin,
Actually, Task.add_requirements behaves as I expect, since that part of the code is in the preprocessing script and for that task it does install all the specified packages. So, my question could be rephrased as follows: when working with PipelineController, is there any way to avoid creating a new development environment for each step of the pipeline?
According to the https://clear.ml/docs/latest/docs/clearml_agent provided in the official ClearML documentatio...
Yep, I've already unmarked the venv caching setting, but the agent still reinstalls all the requirements.
Maybe it has to do with the fact that I am not working in a Git repository and ClearML is not able to locate the requirements.txt file?
The scheme is similar to the following:
`                        main_pipeline
                  (PipelineDecorator.pipeline)
                               |
              |----------------------------------|
              |                                  |
  inference_orchestrator_1           inference_orchestrator_2
(PipelineDecorator.component,      (PipelineDecorator.component,
    acting as a pipeline)              acting as a pipeline)
              |                                  |
...
Well, I can see the difference here. With the new generation of pipelines, the user has the flexibility to play with the values returned by each step. We can process those values before passing them to the next step, so maybe it makes little sense to include those callbacks in this case.
Mmm well, I can think of a pipeline that saves its state just before an error occurs, so that a crontab/scheduler could later resume it from the point where it stopped if it did not complete. Is there any functionality like this? Something like PipelineDecorator/PipelineController.resume_from(state_filepath)?
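To illustrate the behavior I have in mind, here is a rough sketch in plain Python. It is not a ClearML API: `run_with_resume`, the step names and the JSON state file layout are all hypothetical, and it only tracks which steps finished, not their outputs.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple


def run_with_resume(
    steps: List[Tuple[str, Callable[[], None]]], state_filepath: str
) -> List[str]:
    """Run steps in order, skipping those already recorded as done."""
    path = Path(state_filepath)
    done: List[str] = json.loads(path.read_text()) if path.exists() else []
    executed: List[str] = []
    for name, func in steps:
        if name in done:
            continue  # finished in a previous run
        func()  # if this raises, the state file still lists the earlier steps
        done.append(name)
        path.write_text(json.dumps(done))  # persist after every completed step
        executed.append(name)
    return executed
```

A scheduler could then call the same function again with the same state file, and only the steps that had not completed would run.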
AgitatedDove14 It's in the configuration file where I specified that information. But I think this error has only appeared since I upgraded to version 1.1.4rc0
Currently I'm working with v1.0.5. Anyway, I found that it is possible to connect the new argument if I store the arguments returned by task.connect(args) in a variable. I expected that, since it is a mutable object, it would not be necessary to overwrite args, but apparently it is required in this version of ClearML.
Hi AgitatedDove14 ,
I have already developed a mock test that is somewhat similar to the pipeline we are developing, and the same problem arises. The task is created only for the first set of parameters in the for loop. Here, the configuration text file is created only for user 1. Can you reproduce it?
` from clearml import Task
from clearml.automation.controller import PipelineDecorator
@PipelineDecorator.component(
return_values=["admin_config_path"], cache=False, task_type=Task.Task...
Is there any git redundancy on your network? Maybe you could configure a fallback server?
I will ask the IT team about this.
Thanks, I'd appreciate it if you let me know when it's fixed :D
So I assume that you mean to report not only the agent's memory usage, but also that of all the subprocesses the agent spawns (?)
In fact, the datasets directory does not even exist
Please let me know as soon as you have something :)
BTW, I would like to mention another related problem I have encountered. It seems that arguments of type 'int', 'float' or 'list' (maybe this also happens with other types) are transformed to 'str' when passed to a function decorated with PipelineDecorator.component at the time of calling it in the pipeline itself. Again, is this something intentional?
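In the meantime, one thing I could do inside each component is convert the arguments back according to the function's own type hints. The sketch below is plain Python and makes assumptions: `coerce_to_annotations` and `train` are hypothetical names, only bare `int`, `float`, `list`, `dict` and `tuple` annotations are handled, and container strings are assumed to be valid Python literals.

```python
import ast
import inspect
from typing import Any, Callable, Dict


def coerce_to_annotations(
    func: Callable[..., Any], kwargs: Dict[str, Any]
) -> Dict[str, Any]:
    """Convert string arguments back to the types named in func's signature."""
    params = inspect.signature(func).parameters
    fixed: Dict[str, Any] = {}
    for name, value in kwargs.items():
        target = params[name].annotation if name in params else None
        if isinstance(value, str) and target in (int, float):
            fixed[name] = target(value)  # e.g. int("3") or float("0.1")
        elif isinstance(value, str) and target in (list, dict, tuple):
            fixed[name] = target(ast.literal_eval(value))  # e.g. "[8, 4]"
        else:
            fixed[name] = value  # already typed, or no supported annotation
    return fixed


def train(epochs: int, lr: float, layers: list, tag: str):
    return epochs, lr, layers, tag


args = coerce_to_annotations(
    train, {"epochs": "3", "lr": "0.1", "layers": "[8, 4]", "tag": "v1"}
)
assert train(**args) == (3, 0.1, [8, 4], "v1")
```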
But what is the name of that API library, so that I can access those commands from the Python SDK?
I tried specifying helper functions, but it still gives the same error. If I define a component through the following code:
` from typing import Optional
from clearml.automation.controller import PipelineDecorator
@PipelineDecorator.component(...)
def step_data_loading(path: str, target_dir: Optional[str] = None):
    pass `
Then in the automatically created script I find the following code:
` from clearml.automation.controller import PipelineDecorator
def step_data_loading(path: str, target...
Hi AgitatedDove14 , it's nice to know you've already pinpointed the problem! I think the solution you propose is a good one, but does that mean I have to unpack all the dictionary values as parameters of the pipeline function? Wouldn't that make the function too "dirty"? Or do you mean you will soon push a commit that will allow me to keep passing a dictionary, and ClearML will automatically flatten it?
Yes, I'm working with the latest commit. Anyway, I have tried to run dataset.get_local_copy() on another machine and it works. I have no idea why this happens. However, on the new machine get_local_copy() does not return the path I expect. If I have this code: dataset.upload(output_url="/home/user/server_local_storage/mock_storage") I would expect the dataset to be stored under the path specified in output_url. But what I get with get_local_copy() is the follo...
But how could I know whether an agent is up or not? Is it from the CLI or SDK?
Hi AgitatedDove14 Yes, I think so. When I have more time next week I will take a closer look at it and put together an example.
Hi ExasperatedCrab78 ,
Sure! Sorry for the delay. I'm using Chrome Version 98.0.4758.102 (Official Build) (64-bit)
Having the ability to clone and modify the same task over and over again, in principle I would no longer need the multi_instance support feature from PipelineDecorator.pipeline. Is this correct, or are they different things?
I mean what should I write in a script to import the APIClient? (sorry if I'm not explaining myself properly 😅 )
Oddly enough I didn't run into this problem today 🤔 If it happens to me again, I'll return to this thread 🙂