Basically when there are occasionally extreme values (e.g. most values fall in the [0, 50] range, and one value suddenly falls in the 50e+12 range), the plotting library (matplotlib or ClearML, unsure) hangs for a really long time
Ah I see, if the pipeline controller begins in a Task it does not add the tags to it…
Here's a full description of the layout:
1. Remote agent + the entire ClearML docker suite running on host A. Host A also has a /data/clearml folder mounted to it and to its docker containers (I've edited the docker-compose to add this mount point).
2. Connect to host A, use StorageManager on the /data/clearml folder and hit some early troubles (e.g. a long .list call).
3. Use the same connection to run a task with execute_remotely and download_folder, and see it crash 😞
But it does work on Linux 🤔 I'm using it right now, and the environment variables are not defined in the terminal, only in the .env
🤔
I'm saying it's a bug
Anyway sounds good! 🙂
I understand, but then the toml file needs to be parsed to ensure poetry is used. It's just a tool entry in the pyproject.toml.
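For reference, a minimal pyproject.toml with the relevant tool entry would look something like this (contents are illustrative):
```
[tool.poetry]
name = "my-package"        # placeholder
version = "0.1.0"
description = ""
authors = ["Someone <someone@example.com>"]

[tool.poetry.dependencies]
python = "^3.8"
```
So detecting poetry means parsing the file and checking for the [tool.poetry] table.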
I know, that should indeed be the default behaviour, but at least from my tests the use of `--python ...` was consistent, whereas for some reason this old virtualenv decided to use python2.7 otherwise 🤨
It is. Let me see what else I have set up for MinIO in configs, one moment
It seems that the agent uses the remote repository's lock file. We've removed and renamed the file locally (caught under local changes), but it still installs from the remote lock file 🤔
Btw TimelyPenguin76 this should also be a good starting point:
First create the target directory and add some files:
```
sudo mkdir /data/clearml
sudo chmod -R 777 /data/clearml
touch /data/clearml/foo
touch /data/clearml/bar
touch /data/clearml/baz
```
Then list the files using the StorageManager. It shouldn't take more than a few milliseconds.
```
from clearml import StorageManager

%%timeit
StorageManager.list("/data/clearml")
# -> 21.2 s ± 328 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
StaleButterfly40 what use case are you looking for? I've used environment variables in the config file, and then I can overwrite them in `os.environ` before ClearML loads the config
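For example, in clearml.conf you can reference an environment variable like this (the key and variable names are just illustrations):
```
sdk {
    aws {
        s3 {
            # Resolved from the environment when ClearML loads the config
            region: ${CLEARML_S3_REGION}
        }
    }
}
```
Then whatever you put in os.environ["CLEARML_S3_REGION"] before the first ClearML call is what the config resolves to.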
Following up on that (I don't think the K8s helm chart for 1.7.0 is out yet, SlimyDove85, is it?) - but what's the recommended way to back up the MongoDB before upgrading on K8s?
That doesn't make sense? 🤔
Maybe I was not clear, but it's a simple part of the config file.
Never mind! Found and answered (solution in the issue linked above)
Heh, my bad, the term "user" is very much ingrained in our internal way of working. You can think of it as basically any technically-inclined person in your team or company.
Indeed the options in the WebUI are too limited for our use case, so we've developed "apps" that take a YAML configuration file and build a matching pipeline (see the sketch below).
With that, our users do not need to code directly, and we can offer much finer control over the pipeline.
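For illustration, a minimal sketch of the idea (the function name and YAML keys here are made up, not our actual code):
```
import yaml
from clearml import PipelineController

def build_pipeline(config_path):
    # Hypothetical: declarative YAML spec -> ClearML pipeline
    with open(config_path) as f:
        cfg = yaml.safe_load(f)

    pipe = PipelineController(
        name=cfg["name"],
        project=cfg["project"],
        version=cfg.get("version", "1.0.0"),
    )
    for step in cfg["steps"]:
        # Each step clones an existing task and overrides its parameters
        pipe.add_step(
            name=step["name"],
            base_task_project=step["task_project"],
            base_task_name=step["task_name"],
            parents=step.get("parents", []),
            parameter_override=step.get("parameters", {}),
        )
    return pipe
```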
As for the imports, what I meant is that I encounter...
- The `api.files_server` is set to the MinIO endpoint s3://ip:9000/clearml (both locally and remotely)
- The `sdk.development.default_output_uri` is set to the MinIO endpoint (both locally and remotely)
- When we call `Task.init` I do not set the `output_uri` at all
- I get the logger directly with `task.get_logger()`
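For context, the relevant clearml.conf entries would look roughly like this (IP and credentials are placeholders):
```
api {
    files_server: "s3://ip:9000/clearml"
}
sdk {
    development {
        default_output_uri: "s3://ip:9000/clearml"
    }
    aws {
        s3 {
            credentials: [
                {
                    host: "ip:9000"          # MinIO endpoint (non-AWS, so host is set)
                    key: "minio-key"         # placeholder
                    secret: "minio-secret"   # placeholder
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```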
We have an internal mono-repo and some of the packages are required - they're all available correctly for the controller, and only some are required for the individual tasks, but the "magic" doesn't happen 😞
That is, the controller does not identify them as a requirement, so they're not installed in the tasks' environment.
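A workaround sketch (my assumption of what could help, not something confirmed in this thread): force the packages onto the task before Task.init, so the agent installs them even when the automatic analysis misses them. The package name below is hypothetical:
```
from clearml import Task

# Must be called before Task.init for the agent to pick it up
Task.add_requirements("our-internal-package")  # hypothetical package
task = Task.init(project_name="demo", task_name="mono-repo-task")
```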
I know ClearML Enterprise offers a vault.
If these are static-ish, you can set them directly in the agent's config file.
If not, what we did was: before executing remotely, we uploaded the environment variables of interest as parameters, and then loaded them back in the remote task.
These can then be overwritten with *** after loading them.
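Roughly what I mean, as a sketch (the variable and queue names are examples):
```
import os
from clearml import Task

task = Task.init(project_name="demo", task_name="env-passthrough")

# Locally this uploads the values as parameters; on the remote run,
# connect() overwrites the dict with the stored values instead
env_of_interest = {"MY_VAR": os.environ.get("MY_VAR", "")}
task.connect(env_of_interest, name="Environment")

# Push the (possibly updated) values back into the process environment
os.environ.update(env_of_interest)

task.execute_remotely(queue_name="default")
```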
Generally the StorageManager seems a bit slow; even a simple `StorageManager.list(...)` on a local path seems to take a long time
Exactly; the cloud instances (that are run with `clearml-agent`) should have that clearml.conf + any changes specified in `extra_clearml_configuration` for the scaler
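For instance, the extra_clearml_configuration field could carry something like this (values are placeholders):
```
sdk.aws.s3.credentials = [
    {host: "ip:9000", key: "minio-key", secret: "minio-secret", secure: false}
]
```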
I think the environment variables path might work for you then?
You'd set your config with `use_credentials_chain: ${CREDENTIALS_CHAIN}`
Then in Python you could set `os.environ['CREDENTIALS_CHAIN'] = "true"` (or `"false"` - environment variable values must be strings) before you make any calls to ClearML?
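Something like this, as a minimal sketch:
```
import os

# Environment variable values must be strings; the config parser
# substitutes the value when it resolves ${CREDENTIALS_CHAIN}
os.environ["CREDENTIALS_CHAIN"] = "false"

# Import/use ClearML only after the variable is set,
# so the config substitution picks it up
from clearml import StorageManager
```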