The logs show that the key is mounted into the docker container
How are you mounting the credentials?
What version of ClearML-Agent are you using?
REMOTE MACHINE:
- git ssh key is located at ~/.ssh/id_rsa
Is this also mounted into the docker itself?
EnviousPanda91
in your clearml.conf I think you are missing a section: ` agent.git_user="" agent.git_pass="" agent.git_host="" agent.force_git_ssh_protocol: true `
Hi CostlyOstrich36
How are you mounting the credentials?
Is this also mounted into the docker itself?
As I wrote above, it is mounted automatically: '-v', '/tmp/clearml_agent.ssh.kqzj9sky:/root/.ssh'
What version of ClearML-Agent are you using?
1.3.0
AgitatedDove14 sorry, no, in fact my configuration looks like:
` ...
agent.git_user=""
agent.git_pass=""
agent.git_host=""
agent.package_manager.extra_index_url= [
]
agent {
worker_id: ""
worker_name: ""
force_git_ssh_protocol: true
... `
I’ve found the answer:
https://github.com/allegroai/clearml-agent/issues/42#issuecomment-887331420
When I restart the agent, the first task works fine, but on the second launch docker no longer mounts the ssh keys folder, i.e. this argument is missing: '-v', '/tmp/clearml_agent.ssh.rbw8o0t7:/root/.ssh'
I don’t understand why. AgitatedDove14 JitteryCoyote63 could you explain the logic behind that? The CLEARML_AGENT_DISABLE_SSH_MOUNT variable is not set.
So it fails with this log message:
` ...
Using cached repository in "/root/.clearml/vcs-cache/<MY_REPO>.git.893c8c47c9813c27eb1fe8d0aeb77a11/<MY_REPO>.git"
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
error: Could not fetch origin
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
clearml_agent: ERROR: Failed cloning repository.
- Make sure you pushed the requested commit:
(repository='git@github.com:<GIT_USER>/<MY_REPO>.git', branch='master', commit_id='46c86354e58e50a811e870c7b163ea5734499a67', tag='', docker_cmd='nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04', entry_point='test.py', working_dir='bench')
- Check if remote-worker has valid credentials [see worker configuration file] `
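One way to confirm whether a given launch included the ssh mount is to scan the docker arguments printed in the agent log for a `-v` entry that targets /root/.ssh. The sketch below assumes the arguments are available as a Python list of strings; `find_ssh_mount` is a hypothetical helper name and is not part of clearml-agent.

```python
def find_ssh_mount(docker_args, container_ssh_dir="/root/.ssh"):
    """Return the host path mounted onto container_ssh_dir, or None.

    Hypothetical diagnostic helper (not a clearml-agent API).
    """
    # Look at each (flag, value) pair of consecutive arguments.
    for flag, value in zip(docker_args, docker_args[1:]):
        if flag not in ("-v", "--volume"):
            continue
        # A volume spec looks like host_path:container_path[:options]
        parts = value.split(":")
        if len(parts) >= 2 and parts[1].rstrip("/") == container_ssh_dir:
            return parts[0]
    return None
```

A `None` result for the second task's arguments would match the behavior described above, where the mount only appears on the first launch after the agent restarts.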
EnviousPanda91 the agent checks whether you have a .ssh folder on the host machine. If you do, it copies it and mounts the copy into the container, then deletes the copy when the container is down.
Specifically, /tmp/clearml_agent.ssh.rbw8o0t7 is the copy of the .ssh that the agent created, and now it is mounting it into the container
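The copy, mount, and delete cycle described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual clearml-agent code: `temporary_ssh_copy` is a hypothetical helper, and the `clearml_agent.ssh.` prefix is taken from the paths shown in the logs above.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_ssh_copy(ssh_dir, container_ssh_dir="/root/.ssh"):
    """Yield docker '-v' arguments for a throwaway copy of ssh_dir.

    Hypothetical helper for illustration; not part of clearml-agent.
    """
    ssh_dir = Path(ssh_dir).expanduser()
    if not ssh_dir.is_dir():
        # No .ssh folder on the host: nothing to mount.
        yield []
        return
    # A fresh directory on every call, e.g. /tmp/clearml_agent.ssh.xxxxxxxx
    tmp_copy = Path(tempfile.mkdtemp(prefix="clearml_agent.ssh."))
    try:
        shutil.copytree(ssh_dir, tmp_copy, dirs_exist_ok=True)
        yield ["-v", f"{tmp_copy}:{container_ssh_dir}"]
    finally:
        # Delete the copy once the container is down.
        shutil.rmtree(tmp_copy, ignore_errors=True)
```

In this sketch every task gets its own temporary directory inside a `with temporary_ssh_copy("~/.ssh") as mount_args:` block, so a mount that appears only for the first task would point to the copy step failing on later runs.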
AgitatedDove14
Specifically, /tmp/clearml_agent.ssh.rbw8o0t7 is the copy of the .ssh that the agent created, and now it is mounting it into the container
But why is it mounted only once? The second and following containers do not mount the folder.
but why is it mounted only once?
Are you saying the second time this line is missing? This is very strange...
Can you send the full Task log?
AgitatedDove14
Are you saying the second time this line is missing?
Yes.
Can you send the full Task log?
I will send the log in direct messages.
EnviousPanda91 Hi! Did you manage to solve the issue? I've encountered the same behavior: the agent can't create a temp copy of the .ssh folder for the second and all following tasks, and logs ` Failed creating temporary copy of ~/.ssh for git credential `
AgitatedDove14 Sorry for the mention, but I wanted to ask the same question. Did you get to the bottom of the issue above? In my case the .ssh folder is copied only for the first task after the agent daemon starts, but not for the following ones (it complains about not being able to create a temp copy until I restart the agent).
Hi BurlyRaccoon64
Yes, we did. The latest clearml-agent solves the issue, please try:
` pip3 install -U --pre clearml-agent `
Thanks! That worked. If you don't mind, could you point me to where I can find the commit that resolved it?