Also I can't select any tasks from the dashboard search results 😞
Using an on-prem ClearML server, latest published version
SuccessfulKoala55 help me out here 🙂
It seems all the changes I make in the AWS autoscaler apply directly to the virtual environment set for the autoscaler, but nothing from that propagates down to the launched instances.
So e.g. the autoscaler environment has poetry installed, but then the instance fails because it does not have it available?
This could be relevant SuccessfulKoala55 ; might entail some serious bug in ClearML multiprocessing too - https://stackoverflow.com/questions/45665991/multiprocessing-returns-too-many-open-files-but-using-with-as-fixes-it-wh
Is there a way to specify that flag within the config file, SuccessfulKoala55 ?
We're not using the docker setup though. The CLI run by the autoscaler is python -m clearml_agent --config-file /root/clearml.conf daemon --queue aws_small, so no docker
Should this be under the clearml or clearml-agent repo?
I'm not too worried about the dataset appearing (or not) in the Datasets tab. I would like it (the original task) to not disappear from the original project I assigned it to
Not necessarily on the same branch, no
Another side effect btw is that some of our log files (we add a file handler to the logger) end up at 0 bytes. This specifically happens with Ray and ClearML and does not reproduce locally
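A quick local baseline for the 0-byte symptom, as a minimal sketch: it attaches a file handler, writes one record, flushes and closes the handler, then reports the file size. The logger name and helper names are hypothetical, and this does not involve Ray or ClearML; it only checks that the handler setup itself writes bytes when nothing else interferes.

```python
import logging
import os


def attach_file_handler(logger: logging.Logger, path: str) -> logging.FileHandler:
    """Attach a FileHandler writing to `path` and return it."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    return handler


def log_and_measure(path: str) -> int:
    """Write one record to `path` and return the resulting file size in bytes."""
    logger = logging.getLogger("zero_byte_check")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the record out of any root handlers
    handler = attach_file_handler(logger, path)
    try:
        logger.info("file handler smoke test")
        handler.flush()  # push buffered data to the file before measuring
    finally:
        logger.removeHandler(handler)
        handler.close()
    return os.path.getsize(path)
```

If this returns a positive size locally but the same setup yields empty files remotely, the handler was probably replaced, or the process exited before flushing. That is a guess, not something the logs here confirm.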
From the traceback (backend_interface/task/task.py, line 178, in __init__), notice it's not Task.init
TimelyPenguin76 here's the full log (took a moment to anonymize completely):
`
Using environment access key CLEARML_API_ACCESS_KEY=xxx
Using environment secret key CLEARML_API_SECRET_KEY=********
Current configuration (clearml_agent v1.3.0, location: /tmp/.clearml_agent.zs4e7egs.cfg):
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.m...
PricklyRaven28 That would be my fallback, it would make development much slower (having to build containers with every small change)
Since this is a single process, most of these are only needed once when our "initializer" task starts and loads.
Yeah that works too. So one can override the queue ID but not the worker 🤔
Not that I recall
Odd; switching to virtual environment results in fatal: could not read Username for '': terminal prompts disabled, even though it does earlier show that agent.git_user = xxx
That's enabled; I was asking whether there are flags to add to the pip install CLI, such as --no-use-pep517
Thanks! That's what I thought, but then I get 2021-12-21 22:08:35,376 - clearml.storage - ERROR - Failed uploading: Parameter validation failed: Invalid bucket name "": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
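The error above says the bucket name was the empty string. A minimal sketch for checking a destination URI before uploading: it applies the plain (non-ARN) bucket-name regex copied from the error message to the host part of the URI. The helper names are hypothetical, and the idea that the empty name came from a URI with no host part is an assumption, not something the log confirms.

```python
import re
from urllib.parse import urlparse

# Non-ARN bucket-name pattern, copied from the error message above.
BUCKET_NAME_RE = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")


def is_valid_bucket_name(name: str) -> bool:
    """True if `name` matches the simple bucket-name regex from the error."""
    return BUCKET_NAME_RE.match(name) is not None


def bucket_from_uri(uri: str) -> str:
    """Return the host (bucket) part of an s3://bucket/key style URI."""
    return urlparse(uri).netloc


def check_destination(uri: str) -> bool:
    """True if the URI's bucket part would pass the name validation."""
    return is_valid_bucket_name(bucket_from_uri(uri))
```

For example, check_destination("s3:///some/key") is False because the bucket part is empty, which would trigger the same validation error.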
In the Profile section, yes, they are well defined (bucket, secret, key, and endpoint)
I also tried adding agent.package_manager.system_site_packages = true to ensure these virtual environments have access btw, still to no avail
Maybe this is part of the paid version, but would be cool if each user (in the web UI) could define their own secrets, and a task could then be assigned to some user and use those secrets during boot?
Thanks, that's what I thought - so I'm missing something else in the installation. I'll dig further 🙂
Or do you mean the contents of the configuration, probably 🤦 ... one moment
I'm guessing that's not on pypi yet?
That's fine for the current use-case I believe.
Once the team is happy with the logging functionality, we'll move on to remote execution and things will update.
That's what I found as well, but it did not like it after all (boto is fine with it, but the underlying urllib and requests were not?)
It's fine -- I see the added benefit in making sure the users set up their clearml.conf and I've made a script to edit it to our needs as part of the installation process 🙂 Thanks Martin!
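A rough sketch of what such an edit script could look like (not the actual script): it appends dotted-key overrides to the end of the config text. It relies on the assumption that clearml.conf is parsed as HOCON, where a later definition of a key overrides an earlier one. The function name and the example key choice are illustrative only.

```python
import json
from typing import Any, Dict


def append_overrides(conf_text: str, overrides: Dict[str, Any]) -> str:
    """Return conf_text with one `dotted.key = value` line appended per override.

    Values are rendered with json.dumps, since HOCON accepts JSON values.
    """
    lines = [conf_text.rstrip("\n")] if conf_text.strip() else []
    for key, value in overrides.items():
        lines.append(f"{key} = {json.dumps(value)}")
    return "\n".join(lines) + "\n"


def patch_conf_file(path: str, overrides: Dict[str, Any]) -> None:
    """Read the config at `path`, append the overrides, and write it back."""
    with open(path, "r", encoding="utf-8") as f:
        original = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(append_overrides(original, overrides))
```

For instance, append_overrides(text, {"agent.package_manager.system_site_packages": True}) adds that setting as the last line. Appending rather than rewriting in place leaves the user's existing file contents untouched.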