Answered

Hi fam! I’m trying to get clearml-session working but getting stuck on something probably basic. I’ve got an EC2 instance configured and running clearml-agent, and I can see it in the ClearML server. When I run clearml-session from a client, the agent discovers the task and starts up VS Code/Jupyter successfully. However, on the client side, I just get this printed repeatedly:
SSH tunneling failed, retrying in 3 seconds
The instance is open to all IP addresses on all ports for outbound traffic, and open to all IP addresses on ports 22, 8080 and 9000 for inbound traffic. Is there anything else I might be missing?

  
  
Posted 2 years ago

Answers 8


Hi AgitatedDove14, thanks for your help and sorry I missed this! I’ve had this on hold for the last few days, but I’m going to try firing up a new ClearML server running Version 1.02 (I’ve been using the slightly older Allegro Trains image from the AWS marketplace) and have another try from there. Thanks for your help on GitHub too ❤ I’m so blown away by the quality of everything you folks are doing, and have been championing it hard at my workplace.

  
  
Posted 2 years ago

Oh that’s cool, I assumed the DevOps project was just examples!

There’s a jupyter_url property there that is http://{instance's_private_ip_address}:8888?token={jupyter_token}

There’s also:
external_address {instance_public_ip_address}
internal_ssh_port 10022
internal_stable_ssh_port 10023
jupyter_port 8888
jupyter_token {jupyter_token}
vscode_port 9000
Maybe this is something stupid to do with VPCs that I should understand better!

  
  
Posted 2 years ago

Update: I see that by default it uses 10022 as the remote SSH port, so I’ve opened that as well (still getting the “tunneling failed” message though).

I’ve also noticed this log in the agent machine:
2021-07-09 05:38:37,766 - clearml - WARNING - Could not retrieve remote configuration named 'SSH' Using default configuration: {'ssh_host_ecdsa_key': '-----BEGIN EC PRIVATE KEY-----\{private key here}

  
  
Posted 2 years ago

Totally! Thanks so much AgitatedDove14, I’ll try that out now.

  
  
Posted 2 years ago

Hi QuaintPelican38, can you manually access the machine using the IP it registered?
(Look under the DevOps project: you'll see a running Task called "interactive session". Under its Configuration tab, in User Properties, you should find the IP.)

  
  
Posted 2 years ago

Hi QuaintPelican38
Can you ssh to {instance_public_ip_address}:10022 (something like ssh -p 10022 user@IP_HERE )?
Basically just getting the password prompt means you are okay.
I suspect that you have some AWS security definition (firewall) that prevents direct access to the instance. Could that be?
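
If an interactive ssh attempt is awkward to run, the same reachability question can be checked with a plain TCP connection. The following is a minimal sketch using only the Python standard library; the function name port_is_reachable is made up for illustration, and it only tells you whether the port accepts connections, not whether SSH login would succeed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper, not part of clearml-session. A False result
    usually points at a firewall / security group rule or a service
    that is not listening on that port.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Calling port_is_reachable with the instance's public IP and 10022 from the client machine should return True before clearml-session has any chance of opening its tunnel.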

  
  
Posted 2 years ago

Unfortunately no dice 😕 I’ve opened every port from 0-11000, and am using the command clearml-session --public-ip true on the client, but still getting the timeout message, only now it says:
` Setting up connection to remote session
Starting SSH tunnel
Warning: Permanently added '[<IP address>]:10022' (ECDSA) to the list of known hosts.

SSH tunneling failed, retrying in 3 seconds
Starting SSH tunnel
Warning: Permanently added '[<IP address>]:10022' (ECDSA) to the list of known hosts. `
And it repeats that second paragraph every 3 seconds.

  
  
Posted 2 years ago

Hi QuaintPelican38
Assuming you have opened the default SSH port 10022 on the EC2 instance (and assuming the AWS permissions are set so that you can access it), you need to use the --public-ip flag when running clearml-session. Otherwise it "thinks" it is running on a local network and registers itself with the local IP. With the flag on, it gets the public IP of the machine, and then the clearml-session running on your machine can connect to it.
Make sense?
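
To see which kind of address a session registered, one rough check is whether the IP shown in the task's properties falls in a private range. This is only a sketch with the standard ipaddress module; the function name and the sample addresses are assumptions for illustration, not values from this thread.

```python
import ipaddress


def is_private_address(address: str) -> bool:
    """Return True if the address is in a private (non-routable) range.

    Hypothetical helper: a private address here suggests the session
    registered its local/VPC IP rather than its public one.
    """
    return ipaddress.ip_address(address).is_private


# Sample addresses (assumed for illustration only)
print(is_private_address("172.31.14.7"))  # True: inside the 172.16.0.0/12 private block
print(is_private_address("8.8.8.8"))      # False: publicly routable
```

If the registered address comes back as private, a client outside that network will not be able to reach it directly.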

  
  
Posted 2 years ago