yes i actually have been able to turn on caching after rc2 of the agent! been working much better.
I can see agent.vcs_cache.enabled = true
as a printout in the Console, but cannot find docs on how to set this via environment variable, since I'm trying to keep these containers from needing a clearml.conf
file (though I can generate one in the entrypoint script if need be with <EOF>)
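A minimal sketch of the "generate it in the entrypoint" fallback mentioned above, assuming a heredoc is what was meant. The `/tmp/clearml.conf` default and the `write_clearml_conf` helper are hypothetical names for illustration; only the `agent.vcs_cache.enabled` key comes from the thread.

```shell
#!/bin/sh
# Hypothetical entrypoint fragment: write a minimal clearml.conf at startup
# so the image itself does not need to ship one.

# Assumption: the agent is pointed at this file via CLEARML_CONFIG_FILE.
export CLEARML_CONFIG_FILE="${CLEARML_CONFIG_FILE:-/tmp/clearml.conf}"

write_clearml_conf() {
    # $1: destination path. The quoted 'EOF' keeps the body literal.
    cat > "$1" <<'EOF'
agent {
    vcs_cache {
        enabled: true
    }
}
EOF
}

write_clearml_conf "$CLEARML_CONFIG_FILE"
```

The same fragment with `enabled: false` would switch the cache off instead.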
@<1689446563463565312:profile|SmallTurkey79> did you solve this issue with fatal: could not read Username?
Okay thank you so much
But I think I solved the problem with credentials by using clearml_agent v1.8.1rc2
But now I get an issue with local python modules
Even when I set
agent.skip_pip_venv_install = 1
agent.skip_python_env_install = /usr/bin/python
In worker logs I see:
Environment setup completed successfully
Starting Task Execution:
yeah i ended up figuring it out. i think we are in similar situations (private git repo w token). i'll take a look at my config tomorrow but from memory, you have to set your env variables and have an option in your config to force https protocol if you're using a token.
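A rough sketch of the env var half of that setup, assuming the agent reads git credentials from `CLEARML_AGENT_GIT_USER` and `CLEARML_AGENT_GIT_PASS`. `MY_GIT_USER` and `MY_GIT_TOKEN` are hypothetical names for secrets injected into the container. The config option that forces https is not named in this thread, so it is left out here.

```shell
#!/bin/sh
# Hypothetical entrypoint fragment: hand a private-repo token to the agent.
# MY_GIT_USER / MY_GIT_TOKEN are placeholder names for injected secrets.
export CLEARML_AGENT_GIT_USER="${MY_GIT_USER:-}"
export CLEARML_AGENT_GIT_PASS="${MY_GIT_TOKEN:-}"

# Warn early instead of failing later at clone time.
if [ -z "$CLEARML_AGENT_GIT_PASS" ]; then
    echo "warning: no git token set, private repos will likely fail to clone" >&2
fi
```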
By the way, which agent version are you using? Can you include the complete task log?
so, i got around this with env vars
in my worker entrypoint script, I do
export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=$(which python)
so far it seems that turning off cache like this is my "best option"
Hi @<1689446563463565312:profile|SmallTurkey79> , indeed, you can turn it off by passing this configuration in the config file (agent.vcs_cache.enabled: false
will also work). By using dynamic env vars, you can also use this env var to set the same value: CLEARML_AGENT__AGENT__VCS_CACHE__ENABLED=false
(see here for more details)
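To make the naming pattern concrete, here is a small sketch. `conf_key_to_env` is a hypothetical helper, not part of the agent; it only generalizes the single example above (prefix `CLEARML_AGENT__`, uppercase, dots replaced by double underscores), so treat the mapping as an assumption for other keys.

```shell
#!/bin/sh
# Hypothetical helper: derive a dynamic env var name from a config key,
# following the pattern of CLEARML_AGENT__AGENT__VCS_CACHE__ENABLED.
conf_key_to_env() {
    printf 'CLEARML_AGENT__%s\n' \
        "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/\./__/g')"
}

# Prints the env var name for the key discussed in this thread.
conf_key_to_env agent.vcs_cache.enabled

# Setting it in the entrypoint, before the agent starts, disables the cache.
export CLEARML_AGENT__AGENT__VCS_CACHE__ENABLED=false
```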
The clone is the default used by git (you can actually see the command in the log)
BTW a new agent version has been released, I'd recommend trying it out
and for what it's worth it seems I don't have anything special for agent cloning
i did find agent.vcs_cache.clone_on_pull_fail to be helpful. but yah, updating the agent was the biggest fix
update: ever since turning off git caching, i've had much more stability. i cannot tell whether it's causing a slowdown in task execution though - is the clone a shallow one by default?
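One way to check this directly on a worker, as a sketch: git itself can report whether a checkout is shallow. `is_shallow_clone` and `REPO_DIR` are hypothetical names; point `REPO_DIR` at wherever the agent cloned the repo. This assumes git 2.15 or newer for `--is-shallow-repository`.

```shell
#!/bin/sh
# Hypothetical helper: report whether a git checkout is a shallow clone.
# Prints "true" or "false" as returned by git.
is_shallow_clone() {
    git -C "$1" rev-parse --is-shallow-repository
}

# REPO_DIR is an assumption: the directory the agent cloned into.
REPO_DIR="${REPO_DIR:-.}"
if git -C "$REPO_DIR" rev-parse --git-dir >/dev/null 2>&1; then
    echo "shallow: $(is_shallow_clone "$REPO_DIR")"
else
    echo "not a git repository: $REPO_DIR" >&2
fi
```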
i ended up pinning the Dockerfile instruction to 1.18 but before that was letting the entrypoint script do the install (so, latest).
much appreciate the env var tip. that's more elegant than what i did.
since I've turned off caching I've had much better luck. is what I'm experiencing a bug? (neither bitbucket nor github private repositories work on the second task per worker)