Can you please attach the code for the pipeline?
Thanks for pointing this out, we will need to update our documentation. Still, if you manually inspect the ~/clearml.conf file, you will see the available configurations.
That is not specific enough. Can you show the code? And ideally also the console log of the pipeline
Hey @<1577468626967990272:profile|PerplexedDolphin99>, yes, this method call will help you limit the number of files you have in your cache, but not the total size of your cache. To be able to control the size, I'd recommend checking the ~/clearml.conf file, in the sdk.storage.cache section.
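For reference, here's roughly what that section looks like in a default clearml.conf (a sketch; key names and defaults can differ between versions, so check your own file):

```
sdk {
    storage {
        cache {
            # folder where cached artifacts / dataset copies are kept
            default_base_dir: "~/.clearml/cache"
        }
    }
}
```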
Hey @<1554275802437128192:profile|CumbersomeBee33>, aborted usually means that someone manually stopped the pipeline or one of its experiments. Can you provide us with the code you used to run it?
About the first question - yes, it will use the destination URI you set.
About the second point - did you archive or properly delete the experiments?
Yes, that is correct. Btw, now it looks more like my clearml.conf
If your git credentials are stored in the agent's clearml.conf, it means these are an HTTPS username/password pair. But you specified that the package should be downloaded via git ssh, for which I assume you don't have credentials in the agent's environment. So it can't authenticate with SSH, and pip doesn't know how to switch from git+ssh to git+https, because the downloading of the package is done by pip, not by clearml.
And there probably are auth errors if you scroll through the entire log ...
Can you update the clearml version to the latest (1.11.1) and see whether the issue is fixed?
Then change from git+ssh to git+https
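For example, if the dependency is pinned in your requirements.txt, the change would look something like this (the repo path is a placeholder):

```
# before - pip needs SSH credentials to fetch this:
git+ssh://git@github.com/your-org/private_repo.git
# after - pip can reuse the HTTPS credentials from the agent's clearml.conf:
git+https://github.com/your-org/private_repo.git
```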
Yes, metrics can be saved in both steps and pipelines. As for project dashboards, I think as of now we don't support them in UI for pipelines. But what you can do instead is to run a special "reporting" Task that will query all the pipeline runs from a specific project, and with it you can then manually plot all the important information yourself.
To get the pipeline runs, please see the documentation here: https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelineco...
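A rough sketch of such a reporting task (the project names and the filter below are assumptions, adapt them to your setup):

```python
from clearml import Task

# dedicated "reporting" task that aggregates pipeline runs
reporter = Task.init(project_name="reports", task_name="pipeline-report")

# pipeline controller runs are regular tasks of type "controller"
runs = Task.get_tasks(
    project_name="my-project",             # placeholder
    task_filter={"type": ["controller"]},  # assumption: filter on task type
)

for run in runs:
    scalars = run.get_reported_scalars()   # all scalar series reported by the run
    # ...aggregate and re-plot via reporter.get_logger().report_scalar(...)...
```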
Hey @<1574207113163444224:profile|ShallowCoyote86>, what exactly do you mean by "depends on private_repo_b"? Another question - after you push the changes, do you re-run script_a.py?
Wait, my config looks a bit different, what clearml package version are you using?
To link a dataset to a task you need to pass the alias= parameter to Dataset.get. See here: https://clear.ml/docs/latest/docs/clearml_data/clearml_data_sdk#accessing-datasets
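Something along these lines (names are placeholders):

```python
from clearml import Dataset

# alias= links the dataset to the currently running task,
# so the task records exactly which dataset version it consumed
ds = Dataset.get(
    dataset_project="my-project",  # placeholder
    dataset_name="my-dataset",     # placeholder
    alias="training_data",
)
local_path = ds.get_local_copy()
```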
Hey @<1644147961996775424:profile|HurtStarfish47>, you can use S3 for debug images specifically, see here: https://clear.ml/docs/latest/docs/references/sdk/logger/#set_default_upload_destination but the metrics (everything you report like scalars, single values, histograms, and other plots) are stored in the backend. The fact that you are almost running out of storage could be because of either t...
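For the debug images part, a minimal sketch (the bucket name is a placeholder):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="s3-debug-images")
# debug samples reported from this point on are uploaded to S3
task.get_logger().set_default_upload_destination("s3://my-bucket/debug-samples/")
```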
Hey @<1639799308809146368:profile|TritePigeon86>, given that you want to retry on connection error, wouldn't it be easier to use retry_on_failure from PipelineController / PipelineDecorator.pipeline?
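A minimal sketch with PipelineController (the retry policy below is a placeholder; retry_on_failure also accepts a plain int):

```python
from clearml import PipelineController

def retry_on_connection_error(pipeline, node, retries):
    # placeholder policy: retry any failed step up to 3 times;
    # you can inspect `node` to only retry on specific failures
    return retries < 3

pipe = PipelineController(
    name="my-pipeline", project="my-project",    # placeholders
    retry_on_failure=retry_on_connection_error,  # int or callback
)
```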
Hey @<1681836314334334976:profile|GrotesqueSeaturtle83>, yes, it is possible to do so, but you must configure the docker --entrypoint argument (as part of the docker_arguments) and the docker image for said task. In general this isn't a recommended approach. Rather than that, prefer a setup where your task code invokes the functionalities defined in other scripts that are pre-baked in the image.
See docker args here: https://clear.ml/docs/latest/docs/references/sdk/task/...
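If you do go down that route, a sketch of wiring the entrypoint override (image and script path are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="custom-entrypoint")
# the entrypoint override is passed as a raw docker argument
task.set_base_docker(
    docker_image="my-registry/my-image:latest",
    docker_arguments="--entrypoint /app/entrypoint.sh",
)
```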
That seems strange. Could you provide a short code snippet that reproduces your issue?
You can create a new dataset and specify the parent datasets as all the previous ones. Is that something that would work for you?
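A minimal sketch (ids and names are placeholders):

```python
from clearml import Dataset

# the new version inherits all files from its parents
new_ds = Dataset.create(
    dataset_name="merged-dataset",
    dataset_project="my-project",
    parent_datasets=["<parent_dataset_id_1>", "<parent_dataset_id_2>"],
)
new_ds.add_files("path/to/new_files")  # add only what changed
new_ds.upload()
new_ds.finalize()
```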
Could you please run the misbehaving example, try to add a breakpoint in clearml/backend_interface/task/task.py in Task.update_output_model on the line with url = output_model.update_weights(, and tell me what the value of model_path is? In case you're using virtual environments, the clearml library should be installed somewhere in <virtual env directory>/lib/python3.10/site-packages/clearml/
The line before the last in your code snippet above: pipe.start_locally.
To my knowledge, no. You'd have to create your own front-end and use the model served with clearml-serving via an API
For on-premise deployment with premium features we have the enterprise plan 🙂
Hey @<1564422650187485184:profile|ScaryDeer25>, we just released clearml==1.11.1rc2, which should solve the compatibility issues for lightning >= 2.0. Can you install it and check whether it solves your problem?
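Assuming a pip-based environment:

```
pip install clearml==1.11.1rc2
```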
clearml-data also supports glob patterns, so if you have your dataset files in the same directory as the experiment code, you can do something like clearml-data add --files *.csv and only add the CSV files.
There's no .gitignore-like functionality because clearml-data is not meant to track everything, and you need to be deliberate in what exactly you're adding. Hope this clarifies things.
I see you want to use the services queue for both the pipeline controller and the pipeline steps, but you have only one worker/agent listening to this queue. In this case you need at least 2 agents listening to the services queue. Try spawning an additional agent that listens to this queue and let me know how it goes.
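Spawning the extra agent could look like this (queue name per your setup):

```
clearml-agent daemon --queue services --detached
```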
What happens if you comment out or remove the pipe.set_default_execution_queue('default') and use run_locally instead of start_locally?
Because in the current setup, you are basically asking to run the pipeline controller task locally, while the rest of the steps need to run on an agent machine. If you do the changes I suggested above, you will be able to run everything on your local machine.
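For reference, if pipe is a PipelineController, a known way to keep the controller and every step in the local process is the run_pipeline_steps_locally flag (an alternative sketch, not necessarily what your setup needs):

```python
# run the controller and all pipeline steps locally, no agents required
pipe.start_locally(run_pipeline_steps_locally=True)
```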
Hey @<1582542029752111104:profile|GorgeousWoodpecker69>, can you please tell us whether you're running this Jupyter notebook as part of a repo or as a standalone file, and what command you ran to launch your clearml-agent?