Answered
Hey There! I'm Having A Problem With Clearml-Sessions, Maybe Someone Had A Similar Problem Already: I'm Running An Agent In Docker Mode On A Remote Machine. When I Run

Hey there! I'm having a problem with clearml-session, maybe someone had a similar problem already:
I'm running an agent in docker mode on a remote machine. When I run clearml-session on my local machine, a JupyterLab, a VSCode-Server and SSH on port 10022 get spun up. Unfortunately I'm getting the following output (10.10.10.10 serves as an example IP):
` Starting SSH tunnel
Warning: Permanently added '[10.10.10.10]:10022' (ED25519) to the list of known hosts.

SSH tunneling failed, retrying in 3 seconds `
However, I can connect to the remote machine via `ssh root@10.10.10.10 -p 10022` without a problem, and I can access the JupyterLab and VSCode-Server via 127.0.0.1:8888 / 127.0.0.1:9000 when I manually SSH tunnel, e.g. via `ssh -N -L 8888:127.0.0.1:8888 root@10.10.10.10 -v -v -p 10022`.
Has anybody experienced something similar and got it to work without manually SSH tunneling to the clearml-agent?
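For reference, the manual workaround can forward both service ports with a single `ssh` call, since `-L` may be repeated. The sketch below only prints the command rather than running it; `tunnel_cmd` is a hypothetical helper name, and the host, SSH port and service ports are the example values from this post.

```shell
# Hypothetical helper (not part of clearml-session): prints, but does not run,
# one ssh command that forwards every listed port to the same port on localhost.
tunnel_cmd() {
  host="$1"
  ssh_port="$2"
  shift 2
  cmd="ssh -N"
  for p in "$@"; do
    # one -L forward per service port
    cmd="$cmd -L $p:127.0.0.1:$p"
  done
  printf '%s\n' "$cmd root@$host -p $ssh_port"
}

# Example values taken from the post above
tunnel_cmd 10.10.10.10 10022 8888 9000
# prints: ssh -N -L 8888:127.0.0.1:8888 -L 9000:127.0.0.1:9000 root@10.10.10.10 -p 10022
```

Printing the command first makes it easy to compare against what was typed by hand before actually opening the tunnel.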

  
  
Posted 2 years ago

Answers 9


Hi BitingKangaroo95
Are you running the agent in docker mode or venv mode?
Basically, clearml-session will work on clearml-agents that are running in docker mode
(I think we already have a fix for the documentation; it will probably be deployed soon)

  
  
Posted 2 years ago

Hi Martin! I'm running the agent in docker-mode 🙂

  
  
Posted 2 years ago

and both work when I ignore the `SSH tunneling failed, retrying in 3 seconds` warning and tunnel manually

  
  
Posted 2 years ago

Thank you for saying!

  
  
Posted 2 years ago

BitingKangaroo95 nice work 🎊
I think that what did it was:
changing the sshd_config so that it allows port forwarding, agent forwarding and X11 forwarding.
But just in case, it might be that there was a pre-existing SSH identifier on your machine, and hence the error.
Clearing known_hosts under ~/.ssh is also something I would try 🙂
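A narrower alternative to clearing the whole known_hosts file is to drop only the entry for the session's SSH server. This is a minimal sketch: `known_hosts_name` and `forget_host` are hypothetical helper names, and it assumes the stale entry is stored in the `[host]:port` form shown in the "Permanently added" warning.

```shell
# known_hosts stores hosts on a non-default port as "[host]:port"
known_hosts_name() {
  if [ "$2" = "22" ]; then
    printf '%s\n' "$1"
  else
    printf '[%s]:%s\n' "$1" "$2"
  fi
}

# Remove only that entry from ~/.ssh/known_hosts;
# ssh-keygen -R keeps the previous file as known_hosts.old
forget_host() {
  ssh-keygen -R "$(known_hosts_name "$1" "$2")"
}

# Usage with the example address from this thread (uncomment to run):
# forget_host 10.10.10.10 10022
```

Whether a stale host key was actually the cause here is not confirmed in the thread; this only makes that suggestion easy to try.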

  
  
Posted 2 years ago

BitingKangaroo95 can you post here the entire console output of clearml-session (including full command line) ?

  
  
Posted 2 years ago

sure AgitatedDove14
clearml-session:
` $ clearml-session --queue session-test
clearml-session - CLI for launching JupyterLab / VSCode on a remote machine
Verifying credentials
Use previous queue (resource) 'session-test' [Y]/n? y

Interactive session config:
{
"base_task_id": null,
"git_credentials": false,
"jupyter_lab": true,
"keepalive": false,
"password": "263d75e1e32c855893740d23ec000c5abfb9d3c4d50506ab87229d058ae740d5",
"queue": "session-test",
"vscode_server": true
}

Launch interactive session [Y]/n? y
Removing stale interactive sessions
Creating new session
Configuring new session
New session created [id=ac61d1b0327d45e4aab065df5e84f8c6]
Waiting for remote machine allocation [id=ac61d1b0327d45e4aab065df5e84f8c6]
.Status [queued]
.Status [in_progress] - queued pulled by agent
Remote machine allocated
Setting remote environment [Task id=ac61d1b0327d45e4aab065df5e84f8c6]
Setup process details:
Waiting for environment setup to complete [usually about 20-30 seconds, see last log line/s below]

Executing: ['docker', 'run', '-t', '--gpus', '"device=0"', '--network', 'host', '-l', 'clearml-worker-id=clearml-agent:gpu0', '-l', 'clearml-parent-worker-id=clearml-agent:gpu0', '-e', 'CLEARML_WORKER_ID=clearml-agent:gpu0', '-e', 'CLEARML_DOCKER_IMAGE=nvidia/cuda:10.1-runtime-ubuntu18.04 --network host', '-e', 'CLEARML_TASK_ID=ac61d1b0327d45e4aab065df5e84f8c6', '-v', '/tmp/.clearml_agent.sfr5_g31.cfg:/tmp/clearml.conf', '-e', 'CLEARML_CONFIG_FILE=/tmp/clearml.conf', '-v', '/tmp/clearml_agent.ssh.8h4rsdwk:/root/.ssh', '-v', '/root/.clearml/apt-cache:/var/cache/apt/archives', '-v', '/root/.clearml/pip-cache:/root/.cache/pip', '-v', '/root/.clearml/pip-download-cache:/root/.clearml/pip-download-cache', '-v', '/root/.clearml/cache:/clearml_agent_cache', '-v', '/root/.clearml/vcs-cache:/root/.clearml/vcs-cache', '--rm', 'nvidia/cuda:10.1-runtime-ubuntu18.04', 'bash', '-c', 'echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; [ ! -z $LOCAL_PYTHON ] || for i in {15..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && break ; done ; [ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL python3-pip" ; [ -z "$CLEARML_APT_INSTALL" ] || (apt-get update -y ; apt-get install -y $CLEARML_APT_INSTALL) ; [ ! 
-z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3 ; $LOCAL_PYTHON -m pip install -U "pip<20.2" ; $LOCAL_PYTHON -m pip install -U clearml-agent ; cp /tmp/clearml.conf ~/default_clearml.conf ; NVIDIA_VISIBLE_DEVICES=all $LOCAL_PYTHON -u -m clearml_agent execute --disable-monitoring --id> Installing collected packages: wcwidth, prompt-toolkit, decorator, ipython-genutils, traitlets, nest-asyncio, entrypoints, pyzmq, jupyter-core, tornado, jupyter-client, parso, jedi, backcall, pickleshare, ptyprocess, pexpect, pygments, ipython, ipykernel, jupyter-console, nbformat, MarkupSafe, jinja2, mistune, defusedxml, jupyterlab-pygments, webencodings, packaging, bleach, pandocfilters, async-generator, nbclient, testpath, nbconvert, terminado, prometheus-client, pycparser, cffi, argon2-cffi-bindings, dataclasses, argon2-cffi, Send2Trash, notebook, qtpy, qtconsole, widgetsnbextension, jupyterlab-widgets, ipywidgets, jupyter, websocket-client, immutables, contextvars, sniffio, anyio, jupyter-server, pytz, babel, json5, jupyterlab-server, nbclassic, jupyterlab, jupyter-server-mathjax, colorama, smmap, gitdb, GitPython, nbdime, jupyterlab-git, tomli, wrapt, lazy-object-proxy, typed-ast, astroid, mccabe, dill, isort, pylint, nump acking code-server (3.12.0) ...gf4JFzvRAqSTbfkyWQ84 root@clearml-agent (ED25519)

Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel
Warning: Permanently added '[10.10.10.10]:10022' (ED25519) to the list of known hosts.

SSH tunneling failed, retrying in 3 seconds
Starting SSH tunnel
Warning: Permanently added '[10.10.10.10]:10022' (ED25519) to the list of known hosts.

SSH tunneling failed, retrying in 3 seconds `
Output from clearml-agent in docker-mode:
`

SSH Server running on clearml-agent [10.10.10.10] port 10022

LOGIN u:root p:263d75e1e32c855893740d23ec000c5abfb9d3c4d50506ab87229d058ae740d5

Selecting previously unselected package code-server.
(Reading database ... 21572 files and directories currently installed.)
Preparing to unpack .../1f4afee1eafd22d8658b62e8646d4e5f.code-server_3.12.0_amd64.deb ...
Unpacking code-server (3.12.0) ...
Setting up code-server (3.12.0) ...
Running VSCode Server on clearml-agent [10.10.10.10] port 9000 at /root/
VSCode Server available:

Running Jupyter Notebook Server on clearml-agent [10.10.10.10] port 8888 at /root/
/root/.clearml/venvs-builds/3.6/lib/python3.6/site-packages/jupyter_server_mathjax/app.py:40: FutureWarning: The alias _() will be deprecated. Use _i18n() instead.
help=_("""The MathJax.js configuration file that is to be used."""),
[I 2022-07-22 10:28:59.092 ServerApp] jupyter_server_mathjax | extension was successfully linked.
[I 2022-07-22 10:28:59.100 ServerApp] jupyterlab | extension was successfully linked.
[I 2022-07-22 10:28:59.101 ServerApp] jupyterlab_git | extension was successfully linked.
[I 2022-07-22 10:28:59.108 ServerApp] Writing Jupyter server cookie secret to /root/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2022-07-22 10:28:59.285 ServerApp] nbclassic | extension was successfully linked.
[I 2022-07-22 10:28:59.285 ServerApp] nbdime | extension was successfully linked.
[I 2022-07-22 10:28:59.306 ServerApp] nbclassic | extension was successfully loaded.
[I 2022-07-22 10:28:59.306 ServerApp] jupyter_server_mathjax | extension was successfully loaded.
[I 2022-07-22 10:28:59.307 LabApp] JupyterLab extension loaded from /root/.clearml/venvs-builds/3.6/lib/python3.6/site-packages/jupyterlab
[I 2022-07-22 10:28:59.307 LabApp] JupyterLab application directory is /root/.clearml/venvs-builds/3.6/share/jupyter/lab
[I 2022-07-22 10:28:59.310 ServerApp] jupyterlab | extension was successfully loaded.
[I 2022-07-22 10:28:59.315 ServerApp] jupyterlab_git | extension was successfully loaded.
[I 2022-07-22 10:28:59.367 ServerApp] nbdime | extension was successfully loaded.
[I 2022-07-22 10:28:59.368 ServerApp] Serving notebooks from local directory: /root
[I 2022-07-22 10:28:59.368 ServerApp] Jupyter Server 1.13.1 is running at:
[I 2022-07-22 10:28:59.368 ServerApp]
[I 2022-07-22 10:28:59.368 ServerApp] or
[I 2022-07-22 10:28:59.368 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2022-07-22 10:28:59.372 ServerApp]

To access the server, open this file in a browser:
    file:///root/.local/share/jupyter/runtime/jpserver-3341-open.html
Or copy and paste one of these URLs:
     ` ` 
 or  ` ` 

Jupyter Lab URL: `

  
  
Posted 2 years ago

Thanks Martin; I'm really enjoying ClearML so far. Especially the function to execute code remotely makes it a really great product 🎉

  
  
Posted 2 years ago

Thanks for your time, Martin! It is now working under the premise that I
- start the clearml-agent in docker-mode as root user
- change the sshd_config so that it allows port forwarding, agent forwarding and X11 forwarding
- start clearml-session on my local machine as root
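To double-check the sshd_config part of this, the three forwarding options can be read back from the file. This is a rough sketch, assuming the standard OpenSSH keywords `AllowTcpForwarding`, `AllowAgentForwarding` and `X11Forwarding` are the ones meant here; `forwarding_status` is a hypothetical helper, it ignores `Match` blocks, and the thread does not say which machine's sshd_config was edited, so the path below is only the usual default.

```shell
# Hypothetical helper: report the first uncommented value of each forwarding
# option in an sshd_config file (sshd uses the first value it finds per keyword).
forwarding_status() {
  cfg="$1"
  for opt in AllowTcpForwarding AllowAgentForwarding X11Forwarding; do
    # keywords are case-insensitive; lines starting with '#' never match
    val=$(awk -v o="$opt" 'tolower($1) == tolower(o) { print tolower($2); exit }' "$cfg")
    printf '%s=%s\n' "$opt" "${val:-unset}"
  done
}

# Assumed default location; adjust to wherever the config was changed
if [ -r /etc/ssh/sshd_config ]; then
  forwarding_status /etc/ssh/sshd_config
fi
```

An option reported as `unset` falls back to the sshd default, and the SSH daemon has to be restarted or reloaded before edits to the file take effect.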

  
  
Posted 2 years ago