Hi @<1523702000586330112:profile|FierceHamster54> ! This is currently not possible, but I have a workaround in mind. You could use the artifact_serialization_function parameter in your pipeline. The function should return a bytes stream of the zipped content of your data with whichever compression level you have in mind.
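For example, a minimal sketch of such a serialization function (gzip + pickle and the name compress_artifact are just illustrative choices):
import gzip
import pickle

def compress_artifact(obj):
    # return the compressed bytes of the artifact; pick whichever compresslevel suits you
    return gzip.compress(pickle.dumps(obj), compresslevel=9)

# then: PipelineController(..., artifact_serialization_function=compress_artifact)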
If I'm not mistaken, you wouldn't even need to write a deserialization function in your case, because we should be able to unzip your data just fine.
Wdyt?
@<1523703472304689152:profile|UpsetTurkey67> can you please open a GitHub issue as well, so we can better track this one?
Hi PricklyRaven28 ! What dict do you connect? Do you have a small script we could use to reproduce?
Hi BoredBat47 ! What jsonschema version are you using?
Hi @<1523701083040387072:profile|UnevenDolphin73> ! Steps can be cached using the cache=True argument, so when a new step is added, the other steps will not be rerun (only their outputs will be fetched). If a step is modified, ClearML will recognise that and not use the cached result. Of course, the whole pipeline will have to be rerun, but the execution should be quick for the cached steps.
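For example, a small sketch using the decorator syntax (step and argument names are made up):
from clearml import PipelineDecorator

@PipelineDecorator.component(cache=True, return_values=["data"])
def load_data(source_url):
    # with cache=True, rerunning the pipeline with unchanged code and arguments
    # reuses this step's previous output instead of executing it again
    return source_url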
The pipeline can't be modified while it is running.
Does this help your case?
Can you please provide a minimal example that reproduces this?
You should alter the name (or else the model will be overwritten)
@<1702492411105644544:profile|YummyGrasshopper29> you could try adding the directory you are starting the pipeline from to the Python path. Then you would run the pipeline like this:
PYTHONPATH="${PYTHONPATH}:/path/to/pipeline_dir" python my_pipeline.py
Hi @<1811208768843681792:profile|BraveGrasshopper38> ! Consider setting CLEARML_EXTRA_PYTHON_PACKAGES to the right packages if you haven't already. Reference: None
Hi @<1626028578648887296:profile|FreshFly37> ! You can get the version by doing:
from clearml import PipelineController
p = PipelineController.get(...)  # fetch the pipeline run
p._task._get_runtime_properties().get("version")  # note: this relies on a private API
We will make the version more accessible in a future release.
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! That's correct. The job function will run in a separate thread on the machine you are running the scheduler from. That's it. You can, however, create tasks from functions using backend_interface.task.populate.CreateFromFunction.create_task_from_function.
Hi @<1643060801088524288:profile|HarebrainedOstrich43> ! At the moment, we don't support default arguments that are typed via a class implemented in the same module as the function.
The way pipelines work is: we copy the code of the function steps (and their decorators as well, if they are declared in the same file), then we copy all the imports in the module. The problem is, we don't copy classes.
You could have your enum in a separate file and import it; it should work then.
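A rough sketch of what I mean (module, class and step names are all hypothetical):
# my_enums.py -- the enum lives in its own module, next to the pipeline script
from enum import Enum

class Mode(Enum):
    TRAIN = "train"
    EVAL = "eval"

# pipeline.py -- the step imports the enum instead of defining it in the same module
from clearml import PipelineDecorator
from my_enums import Mode

@PipelineDecorator.component(cache=True)
def step_one(mode: Mode = Mode.TRAIN):
    return mode.value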
Hi @<1626028578648887296:profile|FreshFly37> ! Indeed, the pipeline gets tagged once it is running. Actually, it just tags itself. That is why you are encountering this issue. The version is derived in two ways: either you manually add the version using the version argument in the PipelineController, or the pipeline fetches the latest version out of all the pipelines that have run and auto-bumps that.
Please reference this function: https://github.com/allegroai/clearml/blob/05...
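For example, pinning the version explicitly instead of relying on the auto-bump (project and pipeline names are made up):
from clearml import PipelineController

pipe = PipelineController(
    name="my-pipeline",
    project="examples",
    version="1.0.1",  # set manually; otherwise the latest existing version is auto-bumped
)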
Hi @<1545216070686609408:profile|EnthusiasticCow4> ! This is a known bug, we will likely fix it in the next version
Actually, datasets should have an automatic preview...
Hi @<1643060801088524288:profile|HarebrainedOstrich43> ! Thank you for reporting. We will get back to you as soon as we have something
Hi @<1671689458606411776:profile|StormySeaturtle98> ! Do you have a sample snippet that could help us diagnose this problem?
Hi @<1545216070686609408:profile|EnthusiasticCow4> !
So you can inject new command-line args that Hydra will recognize.
This is true.
However, if you enable _allow_omegaconf_edit_: True, I think ClearML will "inject" the OmegaConf saved under the configuration object of the prior run, overwriting the overrides.
This is also true.
Hi PanickyMoth78 ! I ran the script and yes, it does take a lot more memory than it should. There is likely a memory leak somewhere in our code. We will keep you updated.
FierceHamster54 As long as you are not forking, you need to call Task.init such that the libraries you are using get patched in the child process. You don't need to specify the project_name, task_name or output_uri. You could try locally as well with a minimal example to check that everything works after calling Task.init.
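A minimal sketch of what I mean, assuming a spawn-based launcher such as torch.multiprocessing (names are illustrative):
import torch.multiprocessing as mp
from clearml import Task

def worker(rank):
    # re-init inside the spawned child so the frameworks get patched there as well;
    # no project_name / task_name / output_uri needed, the parent task is reused
    Task.init()
    # ... the actual work of this worker goes here ...

if __name__ == "__main__":
    Task.init(project_name="examples", task_name="spawn-demo")
    mp.spawn(worker, nprocs=2)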
@<1526734383564722176:profile|BoredBat47> How would you connect with boto3? ClearML uses boto3 as well; what it basically does is get the key/secret/region from the conf file and then open a Session with those credentials. Have you tried deleting the region altogether from the conf file?
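Roughly what happens under the hood, as a sketch (the values are placeholders for what is read from clearml.conf):
import boto3

session = boto3.session.Session(
    aws_access_key_id="<key from clearml.conf>",
    aws_secret_access_key="<secret from clearml.conf>",
    region_name=None,  # i.e. try leaving the region out entirely
)
s3 = session.client("s3")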
@<1654294828365647872:profile|GorgeousShrimp11> Any chance your queue is actually named megan-testing and not megan_testing?
Hi @<1543766544847212544:profile|SorePelican79> ! You could use the following workaround:
from clearml import Task
from clearml.binding.frameworks import WeightsFileHandler
import torch
def filter_callback(
    callback_type: WeightsFileHandler.CallbackType,
    model_info: WeightsFileHandler.ModelInfo,
):
    print(model_info.__dict__)
    if (
        callback_type == WeightsFileHandler.CallbackType.save
        and "filter_out.pt" in model_info.local_model_path
    ):
        # returning None skips registering/uploading this model
        return None
    return model_info

# register the callback so it runs before every framework save/load
WeightsFileHandler.add_pre_callback(filter_callback)
Hi @<1566596968673710080:profile|QuaintRobin7> ! Sometimes ClearML is not capable of transforming matplotlib plots to plotly, so we report the plot as an image under Debug Samples. Looks like report_interactive=True makes the plot unparsable.
Hi @<1523711002288328704:profile|YummyLion54> ! By default, we don't upload the models to our file server, so in the remote run we will try to pull the file from your local machine, which will fail most of the time. Specify the upload_uri to the api.files_server entry in your clearml.conf if you want to upload it to the ClearML server, or to any s3/gs/azure link if you prefer a cloud provider.
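One way to do that from code is Task.init's output_uri argument, which serves the same purpose (a sketch; project/task names are made up):
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="upload-models",
    output_uri=True,  # True -> use api.files_server from clearml.conf; or pass "s3://...", "gs://...", "azure://..."
)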
UnevenDolphin73 Yes, it makes sense. At the moment, this is not possible. When using use_current_task=True the task gets attached to the dataset and moved under dataset_project/.datasets/dataset_name. Maybe we could make the task not disappear from its original project in the near future.