For some reason I can't get the values in their original types. Only the dictionary keys are returned as the raw nested dictionary, but the values remain cast.
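As a stopgap on my side, a minimal sketch of how the values could be converted back after retrieval, assuming they come back as strings that still look like Python literals. The helper name `restore_types` and the sample dictionary are hypothetical, not part of ClearML:

```python
import ast
from typing import Any


def restore_types(value: Any) -> Any:
    """Hypothetical helper: recursively parse string leaves back into Python literals."""
    if isinstance(value, dict):
        # Keep the nested structure and convert each entry in turn.
        return {key: restore_types(item) for key, item in value.items()}
    if isinstance(value, str):
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Not a literal (e.g. a plain name), so keep the string as it is.
            return value
    return value


# Made-up parameters, shaped like a nested dictionary whose values were all cast to str.
params = {"General": {"epochs": "10", "lr": "0.001", "layers": "[64, 32]", "name": "baseline"}}
restored = restore_types(params)
```

One caveat: a value that was meant to stay a string but looks like a number (say "10") would also be converted, so this only works when that ambiguity does not matter.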
Mmm, but what if the dataset size is too large to be stored in the .cache path? Will it be stored there anyway?
AgitatedDove14 Oops, something still seems to be wrong. When trying to retrieve the dataset using get_local_copy() I get the following error:
` Traceback (most recent call last):
File "/home/user/myproject/lab.py", line 27, in <module>
print(dataset.get_local_copy())
File "/home/user/.conda/envs/myenv/lib/python3.9/site-packages/clearml/datasets/dataset.py", line 554, in get_local_copy
target_folder = self._merge_datasets(
File "/home/user/.conda/envs/myenv/lib/python3.9/site-p...
By adding the slash I have been able to see that indeed the dataset is stored in output_url. However, when calling finalize, I get the same error. And yes, I have installed the version corresponding to the last commit :/
In fact, the datasets directory does not even exist
Yes, I'm working with the latest commit. Anyway, I have tried to run dataset.get_local_copy() on another machine and it works. I have no idea why this happens. However, on the new machine get_local_copy() does not return the path I expect. If I have this code:
dataset.upload(output_url="/home/user/server_local_storage/mock_storage")
I would expect the dataset to be stored under the path specified in output_url. But what I get with get_local_copy() is the follo...
Well the 'state.json' file is actually removed after the exception is raised
Sure, here is a trivial example:
from clearml import Dataset
dataset = Dataset.create(dataset_name="Dataset_v1.1.3", dataset_project="Mocks")
dataset.finalize()
loaded_dataset = Dataset.get(dataset_id=dataset.id)
Thanks for the background. I now have a big picture of the process ClearML goes through. It was helpful in clarifying some of the questions that I didn't know how to ask properly. So, the idea is that a base task is already stored on the ClearML server for later use in a production environment. This is because such a task will always be created during the model development process.
Going back to my initial question, as far as I understood, if the environment caching option is ena...
Great! Thanks for the heads up!
AgitatedDove14 In the 'status.json' file I could see the 'is_dirty' flag is set to True
I can't figure out what might be going on
Thanks to you for fixing it so quickly!
Indeed it does! But what still puzzles me is why I get the path below when running dataset.get_local_copy() on one of the machines of my cluster:
/home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_61ff8d4335dd4b74bd78c3576fa44131.clearml
Why is it pointing to a .lock file?
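Until the cause is clear, a small guard like the following could at least catch the problem before the path is used. This is only a sketch: the function name `looks_like_dataset_folder` is hypothetical, and it assumes a valid local copy is always an existing directory whose name does not start with ".lock".

```python
import tempfile
from pathlib import Path


def looks_like_dataset_folder(path_str: str) -> bool:
    """Hypothetical check: reject lock-file paths and anything that is not a directory."""
    path = Path(path_str)
    if path.name.startswith(".lock"):
        return False
    return path.is_dir()


# The path reported above is rejected because its last component is a lock file.
lock_path = (
    "/home/user/.clearml/cache/storage_manager/datasets/"
    ".lock.000.ds_61ff8d4335dd4b74bd78c3576fa44131.clearml"
)
lock_is_valid = looks_like_dataset_folder(lock_path)

# A real, existing directory passes the check.
with tempfile.TemporaryDirectory() as tmp_dir:
    dir_is_valid = looks_like_dataset_folder(tmp_dir)
```

It does not explain why the lock path is returned, it only makes the failure explicit instead of letting later file reads break.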
Well I tried several things but none of them have worked. I'm a bit lost
Thanks, I'd appreciate it if you let me know when it's fixed :D
What exactly do you mean by that? From VS Code I execute the following script, and then the agents take care of executing the code remotely:
` import pandas as pd
from clearml import Task, TaskTypes
from clearml.automation.controller import PipelineDecorator
CACHE = False
@PipelineDecorator.component(
    name="Wind data creator",
    return_values=["wind_series"],
    cache=CACHE,
    execution_queue="data_cpu",
    task_type=TaskTypes.data_processing,
)
def generate_wind(start_date: st...
Are you suggesting just taking the read_and_process_file function out of the read_dataset method, or maybe decoupling the read_dataset method from the NetCDFReader class so it is not pickled along with the class instance itself?
As for the second option, you mean create the task in the __init__ method of the NetCDFReader class?
It would be a great idea to make the Task picklable, since at the moment what are the most frequently used options for integrating ...
Yep, I've already unmarked the venv caching setting, but the agent still reinstalls all the requirements.
Maybe it has to do with the fact that I am not working on a Git repository and ClearML is not able to locate the requirements.txt file?
Hi AgitatedDove14 Yes, I think so. When I have more time next week I will take a closer look at it and elaborate an example.
AgitatedDove14 By adding PipelineDecorator.run_locally() everything seems to work perfectly. This is what I expect the experiment listing to look like when the agents are the ones running the code. With this, I'm pretty sure the error search can be narrowed down to the agents' code.
Or perhaps the complementary scenario with a continue_on_failed_steps parameter which may be a list containing only the steps that can be ignored in case of failure.
BTW, I would like to mention another problem related to this I have encountered. It seems that arguments of type 'int', 'float' or 'list' (maybe also happens with other types) are transformed to 'str' when passed to a function decorated with PipelineDecorator.component at the time of calling it in the pipeline itself. Again, is this something intentional?
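In the meantime, this is roughly the workaround I have in mind: a sketch that parses string arguments back into Python literals before the function body runs. The decorator name `restore_literal_args` and the `scale` function are hypothetical, and I have not verified that an extra decorator survives the way ClearML packages a component, so the same parsing may have to live at the top of the function body instead.

```python
import ast
import functools
import inspect


def restore_literal_args(func):
    """Hypothetical helper: parse string arguments back into Python literals."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            annotation = signature.parameters[name].annotation
            # Parameters annotated as str are left alone so real strings survive.
            if isinstance(value, str) and annotation is not str:
                try:
                    bound.arguments[name] = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    # Not a literal, keep the original string.
                    pass
        return func(*bound.args, **bound.kwargs)

    return wrapper


@restore_literal_args
def scale(values: list, factor: float, label: str):
    # Without the decorator, "[1, 2, 3]" * "2.0" would raise a TypeError.
    return [v * factor for v in values], label


# Every argument arrives as a string, mimicking the behaviour described above.
result = scale("[1, 2, 3]", "2.0", "10")
```

The `label` argument stays the string "10" because it is annotated as `str`, while the other two are converted back to a list and a float.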
Mmm I see. So the agent is taking the parameters from the base task registered in the server. Then if I call task.get_parameter_as_dict for a task that has not been executed by an agent, should I get the original types of the values?
AgitatedDove14 After checking, I discovered that apparently it doesn't matter if each pipeline is executed by a different worker, the error persists. Honestly this has me puzzled. I'm really looking forward to getting this functionality right because it's an aspect that would make ClearML shine even more.
Hi Martin,
Actually Task.add_requirements behaves as I expect, since that part of the code is in the preprocessing script and for that task it does install all the specified packages. So, my question could be rephrased as the following: when working with PipelineController, is there any way to avoid creating a new development environment for each step of the pipeline?
According to the https://clear.ml/docs/latest/docs/clearml_agent provided in the official ClearML documentatio...
Brilliant, that worked like a charm!
But this path actually does not exist in my system, so how should I fix that?