We just inherit from logging.Handler and use that in our logging.config.dictConfig; the weird thing is that it still logs most of the tasks, just not the last one?
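For reference, a minimal sketch of that pattern, assuming a hypothetical handler called TaskLogHandler; the "()" key lets dictConfig instantiate the custom handler class directly:

```python
import logging
import logging.config


class TaskLogHandler(logging.Handler):
    """Custom handler inheriting from logging.Handler, as described above."""

    def emit(self, record):
        # Placeholder sink: forward the formatted record wherever it needs to go.
        print(self.format(record))


LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"plain": {"format": "%(asctime)s %(levelname)s %(message)s"}},
    "handlers": {
        # The "()" key tells dictConfig to instantiate this handler via the callable.
        "task": {"()": TaskLogHandler, "formatter": "plain", "level": "INFO"},
    },
    "root": {"handlers": ["task"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger(__name__).info("handler wired up via dictConfig")
```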
I'll try with 1.1.5 first, then 1.1.6rc0
TimelyPenguin76 I added pip install --upgrade clearml-agent to the extra_vm_bash_script for the autoscaler; that should at least guarantee the latest clearml-agent is used on the instance, right?
Odd; switching to virtual environment results in fatal: could not read Username for '': terminal prompts disabled, even though it does earlier show that agent.git_user = xxx
That's enabled; I was asking if there are flags to add to the pip install CLI, such as --no-use-pep517
I'm trying, let's see; our infra person is away on holidays :X Thanks! Uh, which configuration exactly would you like to see? We're running using the helm charts on K8s, so I don't think I have direct access to the agent configuration or a way to update it separately?
Nope, no .netrc defined anywhere, really (+ I've abandoned the use of docker for the autoscaler as it complicates things, at least for now)
Sounds like a nice idea
Follow-up: any ideas how to avoid PEP 517 with the autoscaler? Building the wheels takes a long time
That was a good idea; unfortunately it did not help too much, but I think I may have found a workaround, thanks!
I'm using some old agent I fear, since our infra person decided to use chart 3.3.0
I'll try with the env var too. Do you personally recommend docker over the simple AMI + virtual environment?
A more complete log does not add much information: Cloning into '/root/.clearml/venvs-builds/3.10/task_repository/xxx/xxx'... fatal: could not read Username for '': terminal prompts disabled fatal: clone of '' into submodule path '/root/.clearml/venvs-builds/3.10/task_repository/...
Then the username and password would be visible in the autoscaler task
But it should work out of the box; it does work like that regardless of ClearML. The user and personal access token are used as-is and propagate down to the submodules, since those are simply another git repository.
I've run further checks on a different machine and it works there as well
We have a read-only user with a personal access token for these things; it works seamlessly throughout our current on-premise servers... So perhaps something is missing in the autoscaler definitions?
Hurrah! Added git config --system credential.helper 'store --file /root/.git-credentials' to the extra_vm_bash_script and now it works (it stores the given git credentials in the store file, which can then be used immediately for the recursive submodule clones)
A different AMI image / installing older Python versions that don't enforce this...
For future reference though, the environment variable should be PIP_USE_PEP517=false
I just set the git credentials in the clearml.conf and it works out of the box
TimelyPenguin76 here's the full log (took a moment to anonymize completely):
Using environment access key CLEARML_API_ACCESS_KEY=xxx
Using environment secret key CLEARML_API_SECRET_KEY=********
Current configuration (clearml_agent v1.3.0, location: /tmp/.clearml_agent.zs4e7egs.cfg):
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.m...
AgitatedDove14 The keys are there, and there is no specifically defined user in .gitmodules: [submodule "xxx"] path = xxx url =
I believe this has to do with how ClearML sets up the git credentials perhaps?
https://github.com/allegroai/clearml-agent/pull/98 AgitatedDove14
Running a self-hosted server indeed. It's part of code that simply adds or uploads an artifact
1.8.3; what about when calling task.close()? We suddenly have a need to set up our logging again after every task.close() call
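Roughly what we mean is something like the sketch below; setup_logging() is just a stand-in for our own dictConfig-based setup (shown further up), re-applied once the task is closed:

```python
import logging
import logging.config

from clearml import Task


def setup_logging():
    # Stand-in for our real logging setup (the dictConfig-based one shown earlier).
    logging.config.dictConfig({"version": 1, "root": {"level": "INFO"}})


def run_experiment(project_name, task_name):
    task = Task.init(project_name=project_name, task_name=task_name)
    try:
        ...  # the actual experiment code goes here
    finally:
        task.close()
        setup_logging()  # restore our handlers once the task is closed
```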
FWIW, we prefer to set it in the agent's configuration file, then it's all automatic
I guess I'll have to rerun the experiment without tags for this?
I'm also getting the following warning, I guess it's from some ClearML dependency? IPython could not be loaded!
I know ClearML Enterprise offers a vault.
If these are static-ish, you can set them directly in the agent's config file.
If not, what we did was, before executing remotely, upload the environment variables of interest as parameters, and then load them back in the remote task.
These can then be overwritten with *** after loading them.
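The pattern looked roughly like this (variable names and the section name are made up; task.connect returns whatever values are stored on the task when it is executed by an agent):

```python
import os

from clearml import Task

# Hypothetical list of environment variables to carry over to the remote run.
ENV_KEYS = ["MY_SERVICE_URL", "MY_SERVICE_TOKEN"]

task = Task.init(project_name="examples", task_name="env-var-passing")

# Locally this records the current values as task parameters; when run by an
# agent, connect() returns the values stored on the task instead.
env_params = task.connect(
    {key: os.environ.get(key, "") for key in ENV_KEYS}, name="environment"
)

if not Task.running_locally():
    # On the remote machine, push the values back into the environment.
    os.environ.update({key: value for key, value in env_params.items() if value})
    # Optionally overwrite the stored values (e.g. with ***) after loading them,
    # so they are not left readable, as mentioned above.
    for key in ENV_KEYS:
        task.set_parameter("environment/{}".format(key), "***")
```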
Well, you could start by setting the output_uri to True in Task.init.
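i.e. something along these lines (project/task names are placeholders):

```python
from clearml import Task

# With output_uri=True, model snapshots and artifacts are uploaded to the default
# files server rather than being referenced from the local machine; an explicit
# destination such as "s3://my-bucket/clearml" can be passed instead.
task = Task.init(
    project_name="examples",
    task_name="artifact-upload",
    output_uri=True,
)
```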
So basically I'm wondering if it's possible to add some kind of small hierarchy in the artifacts, be it sections, groupings, tabs, folders, whatever.
We're still working these quirks out, but one issue after we changed the AMI was that the VPC (SubnetId?) was missing from the instance, so it could not reach the ClearML API server.
I think maybe the autoscaler service is missing some additional settings...