Thanks, I'd appreciate it if you let me know when it's fixed :D
Oh, I see. In the meantime I will duplicate the function and rename it so I can work with a different configuration. I really appreciate your effort, as well as the continuous feedback that keeps improving this wonderful library!
Well, this is just a mock example 🙂 . In the real application I'm working on there will be more than one configuration file (in principle one for the data and one for the DL model). Regarding the fix, I am not in a hurry at the moment. I'll happily wait for tomorrow (or the day after) when the commit is pushed!
I don't know if you remember the need I had some time ago to launch the same pipeline through configuration. I've been thinking about it and I think PipelineController fits my needs better than PipelineDecorator in that respect.
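To make concrete what I mean, here is a rough sketch of the direction (hedged: the project, task, and parameter names below are invented, and I'm going by the PipelineController docs rather than tested code): a single controller definition parameterized by a configuration file, so the same pipeline can be re-launched once per configuration instead of redefining decorated functions.

import sys
from clearml.automation.controller import PipelineController

# Which configuration to run this time (placeholder mechanism)
config_file = sys.argv[1] if len(sys.argv) > 1 else "data_config.yaml"

pipe = PipelineController(
    name="mock_pipeline",
    project="mock_project",
    version="1.0.0",
)
# Expose the configuration file as a pipeline-level parameter
pipe.add_parameter(name="config_file", default=config_file)
# Reuse an existing task as a template step, overriding its config parameter
pipe.add_step(
    name="process_data",
    base_task_project="mock_project",
    base_task_name="data_processing_template",
    parameter_override={"General/config_file": "${pipeline.config_file}"},
)
pipe.start(queue="services")

Running the script once per configuration file (or cloning the controller task from the UI and editing config_file) would cover the "same pipeline, different configuration" case.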
What exactly do you mean by that? From VS Code I execute the following script, and then the agents take care of executing the code remotely:
import pandas as pd
from clearml import Task, TaskTypes
from clearml.automation.controller import PipelineDecorator

CACHE = False

@PipelineDecorator.component(
    name="Wind data creator",
    return_values=["wind_series"],
    cache=CACHE,
    execution_queue="data_cpu",
    task_type=TaskTypes.data_processing,
)
def generate_wind(start_date: st...
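For completeness, the rest of the script follows the usual PipelineDecorator pattern, something like this (a simplified sketch, not the full pipeline body; the pipeline name and the start date are placeholders):

@PipelineDecorator.pipeline(
    name="Mock wind pipeline",
    project="mock_project",
    version="1.0.0",
)
def executing_pipeline(start_date: str):
    # Calling the component here creates a step task and sends it to the
    # component's execution_queue ("data_cpu"), where an agent picks it up.
    wind_series = generate_wind(start_date)
    return wind_series


if __name__ == "__main__":
    executing_pipeline(start_date="2021-01-01")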
Mmm, that's weird, because I can see the type hints in the function's arguments of the automatically generated script. So maybe I'm doing something wrong, or it's a bug, since they have been passed to the created step (I'm using clearml version 1.1.2 and clearml-agent version 1.1.0).
AgitatedDove14 BTW, I got the notification from GitHub telling me you had committed the fix and I went ahead. After testing the code again, I see the task parameter dictionary has been removed properly (now it has been broken down into flat parameters). However, I still have the same problem with duplicate tasks, as you can see in the image.
Yep, I've already disabled the venv caching setting, but the agent still reinstalls all the requirements.
Maybe it has to do with the fact that I am not working in a Git repository, so ClearML is not able to locate the requirements.txt file?
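In case it helps while this is sorted out, one workaround I'm aware of (a hedged sketch; the package names and versions are just placeholders) is to declare the requirements explicitly in the script, before the task/pipeline is created, so the agent doesn't depend on finding a requirements.txt in a repository:

from clearml import Task

# Explicitly add packages to the task's "Installed packages" section,
# since there is no Git repo for the agent to locate requirements.txt in.
Task.add_requirements("pandas", "1.3.4")
Task.add_requirements("scikit-learn")
# ... then Task.init(...) / the PipelineDecorator pipeline as usual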
AgitatedDove14 So did you get the same results without step duplication?
Mmm, well, I can think of a pipeline that saves its state the instant before an error occurs, so that, using some crontab/scheduler, the pipeline could be resumed from the point where it stopped if it had not completed. Is there any functionality like this? Something like PipelineDecorator/PipelineController.resume_from(state_filepath)?
AgitatedDove14 Exactly, I've run into the same problem
Indeed it does! But what still really puzzles me is why I get the path below when running dataset.get_local_copy() on one of the machines of my cluster: /home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_61ff8d4335dd4b74bd78c3576fa44131.clearml
Why is it pointing to a .lock file?
Mmm, but what if the dataset size is too large to be stored in the .cache path? Will it be stored there anyway?
Yes, from archived experiments
Sure, converting pipelines into components also works for me (ignoring that the problem with LazyEvalWrapper return values still has to be fixed). But this way some interesting features of the pipeline are missing, such as displaying the step execution DAG in the PLOTS tab.
Hi AgitatedDove14 , great, glad it was fixed quickly!
By the way, before releasing version 1.1.3 you might want to take a look at this mock example. I'm trying to run the same pipeline (with different configurations) in a single for loop, as you can see below:
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["msg"], execution_queue="myqueue1")
def step_1(msg: str):
    msg += "\nI've survived step 1!"
    re...
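The rest of the mock looks roughly like this (a simplified sketch of the pattern I'm describing, not the exact file): a pipeline function built from the component above, called once per configuration inside the loop.

@PipelineDecorator.pipeline(
    name="Mock pipeline", project="mock_project", version="1.0.0"
)
def executing_pipeline(config_name: str):
    # Each call to the component becomes a step enqueued on "myqueue1"
    msg = step_1(f"Running with configuration '{config_name}'")
    return msg


if __name__ == "__main__":
    # Same pipeline launched once per configuration -- the pattern that
    # triggers the behaviour described above.
    for config_name in ["config_A", "config_B"]:
        executing_pipeline(config_name=config_name)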
Makes sense, thanks!
I currently deal with that by skipping the first 5 characters of the path, i.e. the 'file:' part. But I'm sure there is a cleaner way to proceed.
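A cleaner alternative might be to parse the URI instead of slicing a fixed number of characters (plain standard-library sketch, not a ClearML API):

from urllib.parse import unquote, urlparse

def uri_to_path(uri: str) -> str:
    # Convert a 'file://...' (or 'file:...') URI into a plain filesystem path
    parsed = urlparse(uri)
    return unquote(parsed.path) if parsed.scheme == "file" else uri

print(uri_to_path("file:///home/user/artifacts/data.csv"))
# -> /home/user/artifacts/data.csv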
Okay! I'll keep an eye out for updates.
By adding the slash I have been able to see that the dataset is indeed stored in output_url. However, when calling finalize, I get the same error. And yes, I have installed the version corresponding to the latest commit :/
But what is the name of that API library, so that I can access those commands from the Python SDK?
Perfect, that's exactly what I was looking for 🙂 Thanks!
Oddly enough I didn't run into this problem today 🤔 If it happens to me again, I'll return to this thread 🙂
Well the 'state.json' file is actually removed after the exception is raised
Yes, I'm working with the latest commit. Anyway, I have tried to run dataset.get_local_copy() on another machine and it works. I have no idea why this happens. However, on the new machine get_local_copy() does not return the path I expect. If I have this code: dataset.upload(output_url="/home/user/server_local_storage/mock_storage"), I would expect the dataset to be stored under the path specified in output_url. But what I get with get_local_copy() is the follo...
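My current understanding, for what it's worth (hedged, based on my reading of the Dataset docs; the project/dataset names below are placeholders): upload(output_url=...) controls where the dataset archive is stored remotely, while get_local_copy() always extracts into the local ClearML cache, and get_mutable_local_copy() is the call that materializes the files into a folder of my choosing.

from clearml import Dataset

dataset = Dataset.get(dataset_project="mock_project", dataset_name="mock_dataset")

# Read-only copy, always placed inside ~/.clearml/cache/...
cached_path = dataset.get_local_copy()

# Writable copy extracted into an explicit target folder
local_path = dataset.get_mutable_local_copy(
    target_folder="/home/user/server_local_storage/extracted_dataset"
)
print(cached_path, local_path)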
Mmmm you are right. Even if I had 1000 components spread in different project modules, only those components that are imported in the script where the pipeline is defined would be included in the DAG plot, is that right?
I have found it is not possible to start a pipeline B after a pipeline A. Following the previous example, I have added one more pipeline to the script:
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["msg"], execution_queue="model_trainings")
def step_1(msg: str):
    msg += "\nI've survived step 1!"
    return msg

@PipelineDecorator.component(return_values=["msg"], execution_queue="model_trainings")
def st...
My guess would be to manually read and parse the string that clearml-agent list returns, but I'm pretty sure there's a cleaner way to do it, isn't there?
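One possibly cleaner route (hedged; I'm assuming the workers endpoint is exposed through the APIClient this way, and the exact field names may differ) would be to query the server directly instead of parsing the CLI output:

from clearml.backend_api.session.client import APIClient

client = APIClient()
workers = client.workers.get_all()  # same information `clearml-agent list` prints
for worker in workers:
    print(worker.id)  # attribute names taken from my reading of the API docs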
I tested cache=False and I still get the same error 😕 In the dashboard, the task corresponding to step_two does not appear duplicated, so I assume the tasks are being launched sequentially. I'm going to prepare a more elaborate example to see what happens. Currently I can't run PipelineDecorator.debug_pipeline() because I need at least two devices: one to read some data and another to process it.