
In fact, the datasets directory does not even exist
Well, the 'state.json' file is actually removed after the exception is raised
Indeed it does! But what still puzzles me is why I get the path below when running dataset.get_local_copy() on one of the machines of my cluster: /home/user/.clearml/cache/storage_manager/datasets/.lock.000.ds_61ff8d4335dd4b74bd78c3576fa44131.clearml
Why is it pointing to a .lock file?
I am aware of the option to enable virtual environment caching, but that is still very time consuming.
Oh, I see. In the meantime I will duplicate the function and rename it so I can work with a different configuration. I really appreciate your effort, as well as the continuous feedback that keeps improving this wonderful library!
Mmm what would be the implications of not being part of the DAG? I mean, how could that step be launched if it is not part of the execution graph?
But I was actually asking about accessing the Pipeline task ID, not the tasks corresponding to the components.
I currently deal with that by skipping the first 5 characters of the path, i.e. the 'file:' part. But I'm sure there is a cleaner way to proceed.
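One possibly cleaner alternative to slicing off the first five characters is to parse the value as a URI with the standard library. This is only a minimal sketch, assuming the value looks like file:///home/user/... or file:/home/user/... on a POSIX machine; the helper name file_uri_to_path and the example path are hypothetical and not part of ClearML.

```python
from urllib.parse import unquote, urlparse


def file_uri_to_path(uri: str) -> str:
    """Return the local path for a 'file:' URI; other strings pass through unchanged."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        # Not a file URI: assume the value is already a plain local path.
        return uri
    # unquote() decodes percent-escapes such as %20.
    return unquote(parsed.path)


# Hypothetical example path, for illustration only.
print(file_uri_to_path("file:///home/user/artifacts/data.csv"))
```

Compared with slicing, this does not depend on the exact number of slashes after 'file:' and leaves plain paths untouched.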
I have also tried with type hints and it still casts to string. Very weird...
SuccessfulKoala55 I have not tried yet with argparse, but maybe I will encounter the same problem
My idea is to take advantage of the capability of getting parameters connected to a task from another task to read the path where the artifacts are stored locally, so I don't have to define it again in each script corresponding to a different task.
Where can I find this documentation?
Yeah, but after doing that a message pops up showing a list of artifacts from the task that could not be deleted
So great! It would be a feature that would make the work much easier instead of having to clone the task and launch it with different parameters. It could even be considered more pythonic. Do you have an immediate solution in mind to keep moving forward before the new release is ready? :)
Oh, I see. I guess somehow I can retrieve that information via Task.logger, since it is stored in JSON format? Thanks!
AnxiousSeal95 I see. That's why I was thinking of storing the model inside a task just like with the Dataset class. So that you can either use just the model via InputModel or the model and all its artifacts via Task.get_task by using the ID of the task where the model is located.
I would like my cleanup service to remove all tasks older than two weeks, but not the models. Right now, if I delete all tasks the model does not work (as it needs the training tasks). For now, I ...
I mean that I have a script for a data preprocessing task where I need the following dependencies:
` import sys
from pathlib import Path
from contextlib import contextmanager
import numpy as np
from clearml import Task

with add_temporary_module_search_path("/home/user/myclearML/"):
    from helpers import (
        read_netcdf_dataset,
        write_records,
    ) `
However, the xarray package is a dependency of the helpers module which is required by the read_netcdf_dataset ...
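The definition of add_temporary_module_search_path is not shown in the snippet. Judging by the sys, Path and contextmanager imports, it is presumably a small context manager that adjusts sys.path. The following is only a sketch of what such a helper might look like, under that assumption; it is not the original implementation.

```python
import sys
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def add_temporary_module_search_path(path):
    """Temporarily put `path` at the front of sys.path (assumed behaviour)."""
    resolved = str(Path(path).resolve())
    sys.path.insert(0, resolved)
    try:
        yield
    finally:
        # Remove the first matching entry, even if the import inside the block failed.
        try:
            sys.path.remove(resolved)
        except ValueError:
            pass
```

Modules imported inside the with block stay loaded afterwards; only the search path entry is removed.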
Yes, from archived experiments
Having the ability to clone and modify the same task over and over again, in principle I would no longer need the multi_instance support feature from PipelineDecorator.pipeline. Is this correct, or are they different things?
Well, I need to write boilerplate code to do parsing stuff if I want to use the original values after I connect the dictionary to the task, so it's a bit messy.
Currently I'm using clearml v1.0.5 and clearml-agent v1.0.0
Well, just as you can pass the 'task_type' argument in PipelineDecorator.component, it might be a good option to pass the rest of the 'Task.init' arguments as they are passed in the original method (without using a dictionary)
Hi AgitatedDove14 ,
Any updates on the new ClearML release that fixes the bugs we mentioned in this thread? :)
Sure, just by changing a few things from the previous example:
` from clearml import Task
task = Task.init()
task.connect({"metrics": ["nmae", "bias", "r2"]})
metrics_names = task.get_parameter("General/metrics")
print(metrics_names)
print(type(metrics_names)) `
Hi AgitatedDove14
Using task.get_parameters I get the parameters in a dictionary, but the values are still of type 'string'. The quickest solution I can think of is parsing with the eval built-in. wdyt?
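If the values really do come back as stringified Python literals, ast.literal_eval from the standard library may be a safer option than eval, since it only accepts literals and will not execute arbitrary code. A minimal sketch follows; the helper name parse_parameter and the sample dictionary are hypothetical, and the exact string format returned by the task is an assumption.

```python
import ast


def parse_parameter(value):
    """Convert a stringified Python literal back to its value; leave other strings as-is."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Plain text such as "adam" is not a Python literal; keep it as a string.
        return value


# Hypothetical string values, assumed to resemble what comes back from the task.
raw = {
    "General/metrics": "['nmae', 'bias', 'r2']",
    "General/lr": "0.001",
    "General/optimizer": "adam",
}
parsed = {name: parse_parameter(value) for name, value in raw.items()}
print(parsed["General/metrics"], type(parsed["General/metrics"]))
```

The fallback branch keeps ordinary text parameters untouched, so the same helper can be applied to every entry of the dictionary.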
Makes sense, thanks!
Sure! That definitely makes sense. Where can I specify callbacks in the PipelineDecorator API?