Answered
Hi, I am giving another try to clearml-session and I am blocked at the current error shown when the CLI tries to establish the tunneling

Hi, I am giving clearml-session another try and I am blocked by the following error, shown when the CLI tries to establish the tunnel:
Starting SSH tunnel
Warning: Permanently added '[10.xx.xx.xx]:xxxx' (ED25519) to the list of known hosts.
root@10.xx.xx.xx: Permission denied (publickey).
How can I solve that? (I am using the latest version of clearml-session)

  
  
Posted 2 years ago

Answers 30


These are prerequisites of the Docker service installed on the host machine (where the agent is running).
Basically follow: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
https://docs.docker.com/compose/gpu-support/

  
  
Posted 2 years ago

JitteryCoyote63 this is the standard way to remove a stale SSH known-host entry:
https://superuser.com/a/30089
specifically you can try:
ssh-keygen -R 10.105.1.77
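To illustrate what that removal amounts to, here is a minimal Python sketch that drops every line listing a given host from known_hosts content. It is only an illustration, not how ssh-keygen is implemented: the function name drop_host_entries is hypothetical, it matches host names exactly as written, and it does not handle hashed entries (lines starting with |1|). Entries for a non-default port are stored as [host]:port, so that full form would have to be passed in. In practice ssh-keygen -R remains the tool to use.

```python
def drop_host_entries(known_hosts_text: str, host: str) -> str:
    """Return known_hosts content without plain-text entries for `host`.

    Hypothetical helper for illustration; hashed entries are left untouched.
    """
    kept = []
    for line in known_hosts_text.splitlines():
        stripped = line.strip()
        # Keep blank lines and comments as they are.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        fields = stripped.split()
        # A line may start with a marker such as @cert-authority;
        # in that case the host list is the second field.
        if fields[0].startswith("@") and len(fields) > 1:
            hosts_field = fields[1]
        else:
            hosts_field = fields[0]
        # The host list is comma-separated; compare names exactly.
        if host not in hosts_field.split(","):
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    sample = (
        "10.105.1.77 ssh-ed25519 AAAA\n"
        "example.org,10.0.0.5 ssh-rsa BBBB\n"
    )
    print(drop_host_entries(sample, "10.105.1.77"), end="")
```

Once the stale entry is gone, the next ssh connection will ask to confirm the new host key instead of refusing to connect.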

  
  
Posted 2 years ago

AgitatedDove14 https://clear.ml/docs/latest/docs/apps/clearml_session/#running-in-docker in the docs there is a --docker option, that’s what confuses me, since the agent should always run in docker mode

  
  
Posted 2 years ago

Is it being used to ssh to the instance?

It is used by the SSH client so that it "knows" the SSH server (does that make sense?)

  
  
Posted 2 years ago

sorry, the clearml-session. The error is the one I shared at the beginning of this thread

  
  
Posted 2 years ago

Hi JitteryCoyote63, can I assume you can ssh into the machine directly?

  
  
Posted 2 years ago

I understand, but then why is docker mode an option of the CLI if we always have to use it for this to work?

  
  
Posted 2 years ago

No

  
  
Posted 2 years ago

(BTW: it will work with elevated credentials, but probably not recommended)

What does that mean? I'm not sure I understand

  
  
Posted 2 years ago

But after that you're connected to the machine and can work on it?

  
  
Posted 2 years ago

Well not really

  
  
Posted 2 years ago

So I cannot ssh anymore to the agent after starting clearml-session on it

  
  
Posted 2 years ago

Yes!

  
  
Posted 2 years ago

I'm not sure, will check 🙂

  
  
Posted 2 years ago

Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion

This is a good point! I'll make sure we stress it (BTW: it will work with elevated credentials, but probably not recommended)

  
  
Posted 2 years ago

CostlyOstrich36 How is clearml-session setting the ssh config?

  
  
Posted 2 years ago

but clearml-agent still raises the same error

which one?

  
  
Posted 2 years ago

AgitatedDove14 Yes, with the command you shared I can now ssh manually to the agent again, but clearml-agent still raises the same error

  
  
Posted 2 years ago

Well not really

Please elaborate 🙂

  
  
Posted 2 years ago

So this message appears when I try to ssh directly into the instance

  
  
Posted 2 years ago

Does the agent install the nvidia-container toolkit, so that GPUs of the instance can be accessed from inside the docker running jupyterlab?

  
  
Posted 2 years ago

AgitatedDove14 I see https://github.com/allegroai/clearml-session/blob/main/clearml_session/interactive_session_task.py#L21= that a key pair is hardcoded in the repo. Is it being used to ssh to the instance?

  
  
Posted 2 years ago

This is the reason you are getting an error 🙂
Basically the session asks the agent to set up a new SSH server with credentials on the remote machine. This is not an issue inside a container, as it is an isolated environment. But when running in venv mode, the user running the agent is not root, hence it cannot spin up or configure an SSH server.
Does that make sense?

  
  
Posted 2 years ago

Yes, the agent's mode is global, i.e. all tasks run either inside docker or in a venv. In theory you can have two agents on the same machine, one in venv mode and one in docker mode, listening to two different queues
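As a rough sketch of that two-agent setup, the snippet below assembles the two clearml-agent daemon command lines, one in venv mode and one in docker mode. The queue names venv_queue and docker_queue and the helper agent_daemon_cmd are hypothetical; it only prints the commands and does not start anything.

```python
import shlex
from typing import List, Optional


def agent_daemon_cmd(
    queue: str, docker: bool = False, image: Optional[str] = None
) -> List[str]:
    """Build the argv for one agent; hypothetical helper for illustration."""
    cmd = ["clearml-agent", "daemon", "--queue", queue]
    if docker:
        # --docker switches the agent to docker mode; an image may follow it.
        cmd.append("--docker")
        if image:
            cmd.append(image)
    return cmd


if __name__ == "__main__":
    # One agent per mode, each listening to its own (assumed) queue name.
    for cmd in (
        agent_daemon_cmd("venv_queue"),
        agent_daemon_cmd("docker_queue", docker=True),
    ):
        print(shlex.join(cmd))
```

A clearml-session would then be sent to the queue served by the docker-mode agent, since that is the agent able to set up the SSH server.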

  
  
Posted 2 years ago

That’s why I said “not really” 😄

  
  
Posted 2 years ago

After I started clearml-session

  
  
Posted 2 years ago

JitteryCoyote63 are you running the agent in docker mode?

  
  
Posted 2 years ago

Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion

  
  
Posted 2 years ago

If I don’t start clearml-session, I can easily connect to the agent, so clearml-session is doing something that messes up the ssh config and prevents me from SSHing into the agent afterwards

  
  
Posted 2 years ago

ssh my-instance
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:O2++ST5lAGVoredT1hqlAyTowgNwlnNRJrwE8cbMLo0.
Please contact your system administrator.
Add correct host key in /Users/H4dr1en/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /Users/H4dr1en/.ssh/known_hosts:81
Host key for 10.105.1.77 has changed and you have requested strict checking.
Host key verification failed.

  
  
Posted 2 years ago