I am struggling with configuring ssh authentication in docker mode
GentleSwallow91 Basically, the agent will automatically mount the .ssh folder into the container; just make sure you set the following in clearml.conf:
force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L30
Hi AgitatedDove14
Either I am doing something wrong or it works only in theory 😃
Here are my mounts in the agent's clearml.conf:
sdk_cache: "/clearml_agent_cache"
ssh_folder: "/home/testuser/.ssh"
pip_cache: "/home/testuser/.cache/pip"
vcs_cache: "/home/testuser/.clearml/vcs-cache"
venv_build: "/home/testuser/.clearml/venvs-builds"
pip_download: "/home/testuser/.clearml/pip-download-cache"
testuser is the user inside the Docker container.
I run the agent as the local user on the host, and I would expect those settings to produce this mount: -v /home/localuser/.ssh:/home/testuser/.ssh
but that does not happen...
Please advise
Hi Martin. Sorry - missed your reply.
Yeap I am aware that docker_internal_mounts is inside agent section.
Here is the actual docker command from the log:
INFO Executing: ['docker', 'run', '-t', '--gpus', '"device=0"', '-v', '/tmp/ssh-XXXXXXnfYTo5/agent.8946:/tmp/ssh-XXXXXXnfYTo5/agent.8946', '-e', 'SSH_AUTH_SOCK=/tmp/ssh-XXXXXXnfYTo5/agent.8946', '-l', 'clearml-worker-id=agent-gpu:gpu0', '-l', 'clearml-parent-worker-id=agent-gpu:gpu0', '-e', 'CLEARML_WORKER_ID=agent-gpu:gpu0', '-e', 'CLEARML_DOCKER_IMAGE=torch2022', '-e', 'CLEARML_TASK_ID=089d2a5d2b59443db53dfb5a884eb7f3', '-v', '/home/localuser/.gitconfig:/root/.gitconfig', '-v', '/tmp/.clearml_agent.twer6v9b.cfg:/tmp/clearml.conf', '-e', 'CLEARML_CONFIG_FILE=/tmp/clearml.conf', '-v', '/home/localuser/.clearml/apt-cache:/var/cache/apt/archives', '-v', '/home/localuser/.clearml/pip-cache:/home/testuser/.cache/pip', '-v', '/home/localuser/.clearml/pip-download-cache:/home/testuser/.clearml/pip-download-cache', '-v', '/home/localuser/.clearml/cache:/clearml_agent_cache', '-v', '/home/localuser/.clearml/vcs-cache:/home/testuser/.clearml/vcs-cache', '-v', '/home/localuser/.clearml/venvs-cache:/root/.clearml/venvs-cache', '--rm', 'torch2022', 'bash', '-c', 'echo \'Binary::apt::APT::Keep-Downloaded-Packages "true";\' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; [ ! -z $LOCAL_PYTHON ] || for i in {15..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && break ; done ; [ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL python3-pip" ; [ -z "$CLEARML_APT_INSTALL" ] || (apt-get update -y ; apt-get install -y $CLEARML_APT_INSTALL) ; [ ! -z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3 ; $LOCAL_PYTHON -m pip install -U "pip==21.2.4" ; $LOCAL_PYTHON -m pip install -U clearml-agent ; cp /tmp/clearml.conf ~/default_clearml.conf ; NVIDIA_VISIBLE_DEVICES=all $LOCAL_PYTHON -u -m clearml_agent execute --disable-monitoring --id 089d2a5d2b59443db53dfb5a884eb7f3']
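A command list this long is hard to scan by eye, so a small helper can pull out the volume mounts and check whether any of them targets a .ssh folder. This is only a sketch for reading the log: the function names are made up for illustration, and the sample list is a shortened copy of the logged command above.

```python
from typing import List, Tuple


def volume_mounts(docker_cmd: List[str]) -> List[Tuple[str, str]]:
    """Return (host_path, container_path) for every '-v' / '--volume' argument."""
    mounts = []
    for flag, value in zip(docker_cmd, docker_cmd[1:]):
        if flag in ("-v", "--volume"):
            parts = value.split(":")
            if len(parts) < 2:
                continue  # anonymous volume, no host path to report
            mounts.append((parts[0], parts[1]))
    return mounts


def mounts_ssh_folder(docker_cmd: List[str]) -> bool:
    """True if any mount lands on a path ending in '/.ssh' inside the container."""
    return any(
        container.rstrip("/").endswith("/.ssh")
        for _, container in volume_mounts(docker_cmd)
    )


# Shortened copy of the command from the log (most arguments omitted).
logged_cmd = [
    "docker", "run", "-t",
    "-v", "/tmp/ssh-XXXXXXnfYTo5/agent.8946:/tmp/ssh-XXXXXXnfYTo5/agent.8946",
    "-e", "SSH_AUTH_SOCK=/tmp/ssh-XXXXXXnfYTo5/agent.8946",
    "-v", "/home/localuser/.gitconfig:/root/.gitconfig",
    "-v", "/home/localuser/.clearml/pip-cache:/home/testuser/.cache/pip",
    "torch2022",
]

for host, container in volume_mounts(logged_cmd):
    print(f"{host} -> {container}")
print("mounts .ssh:", mounts_ssh_folder(logged_cmd))
```

On the logged command, none of the mounts ends in /.ssh, which matches what is reported here.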
Regarding "I would expect that settings to have effect -v /home/localuser/.ssh:/home/testuser/.ssh":
It does not map it directly. It creates a temp copy of the entire ".ssh" folder in the host's /tmp folder, then maps that copy into the container:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/clearml_agent/commands/worker.py#L3422
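The copy-then-mount idea can be sketched in a few lines of Python. This is not the agent's code (the linked worker.py is the reference); the function name and the temp directory prefix are assumptions made for illustration, and the demo runs on a throwaway folder rather than a real ~/.ssh.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple


def stage_ssh_folder(host_ssh: Path, container_ssh: str) -> Tuple[Path, List[str]]:
    """Copy host_ssh into a fresh temp dir and return (copy_path, docker_args).

    The returned docker_args mount the copy, not the original folder,
    at container_ssh inside the container.
    """
    staging_root = Path(tempfile.mkdtemp(prefix="ssh_copy_"))  # hypothetical prefix
    copy_path = staging_root / ".ssh"
    shutil.copytree(host_ssh, copy_path)
    return copy_path, ["-v", f"{copy_path}:{container_ssh}"]


# Demo on a throwaway folder standing in for the host's .ssh directory.
fake_ssh = Path(tempfile.mkdtemp()) / ".ssh"
fake_ssh.mkdir()
(fake_ssh / "config").write_text("Host *\n")

copy_path, docker_args = stage_ssh_folder(fake_ssh, "/home/testuser/.ssh")
print(docker_args)
```

With this approach the host path in the -v argument is a temp path, so a mount of /home/localuser/.ssh itself would not be expected in the log even when the folder is mapped.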
Notice that the "docker_internal_mounts" section is nested inside the "agent" section in the clearml.conf file.
Can you share the top Task log of a task your agent is running? (the first few lines contain the exact docker run command the agent is using)
GentleSwallow91 notice this part:
'-v', '/tmp/ssh-XXXXXXnfYTo5/agent.8946:/tmp/ssh-XXXXXXnfYTo5/agent.8946', '-e', 'SSH_AUTH_SOCK=/tmp/ssh-XXXXXXnfYTo5/agent.8946',
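Here the agent picked up SSH_AUTH_SOCK from the host environment (judging by the path, the ssh-agent socket), so it mounts that path and sets the SSH_AUTH_SOCK env to it instead of mapping the ssh folder. You can have the entire ssh folder mapped automatically by un-setting SSH_AUTH_SOCK before running the agent:
SSH_AUTH_SOCK= clearml-agent ...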
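If the agent is started from a Python wrapper rather than a shell, the same idea is to hand the child process an environment without SSH_AUTH_SOCK. A minimal sketch follows; the helper name is hypothetical, and note that the shell form sets the variable to an empty string while this removes it entirely (the assumption is that the agent treats both the same way).

```python
import os
from typing import Dict, Mapping


def env_without_ssh_agent(env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of env with SSH_AUTH_SOCK removed; env itself is not modified."""
    return {key: value for key, value in env.items() if key != "SSH_AUTH_SOCK"}


child_env = env_without_ssh_agent(os.environ)
print("SSH_AUTH_SOCK" in child_env)  # False

# The copy would then be passed to whatever agent command you normally run,
# for example: subprocess.run(agent_cmd, env=child_env)
# (agent_cmd is a placeholder for your own command line).
```

Only the child process is affected, so the ssh-agent in the launching shell keeps working.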
Yeap. It is configured this way:
force_git_ssh_protocol: true
But I don't see the mount of .ssh
One thing though: my container runs as a non-root user. Here are my docker mounts:
docker_internal_mounts {
    sdk_cache = /clearml_agent_cache
    # apt_cache = /var/cache/apt/archives
    ssh_folder = /home/testuser/.ssh
    pip_cache = /home/testuser/.cache/pip
    poetry_cache = /home/testuser/.cache/pypoetry
    vcs_cache = /home/testuser/.clearml/vcs-cache
    venv_build = /home/testuser/.clearml/venvs-builds
    pip_download = /home/testuser/.clearml/pip-download-cache
}
Host user has a different name from testuser
Here are the extra_docker_arguments that make it work:
extra_docker_arguments: ["-v","/home/nino/.ssh:/home/testuser/.ssh", "--privileged"]
GentleSwallow91 Nice!
BTW: in theory there should be no need to add the specific "-v","/home/nino/.ssh:/home/testuser/.ssh"; the agent should do that automatically.