Answered
Clearml-session question: I’m using the tool with an on-prem machine. Normal tasks are being executed normally - but when using

Clearml-session question:
I’m using the tool with an on-prem machine. Normal tasks are executed without problems, but when using clearml-session I intermittently get an SSH connection error.
Sometimes it works fine, but sometimes I get this error message: SSH tunneling failed, retrying in 3 seconds
This is the config that I used:

 clearml-session --project examples --queue default --jupyter-lab true --vscode-server false \
--remote-gateway <internal-ip> \
--skip-docker-network \
--docker gcr.io/deeplearning-platform-release/pytorch-gpu \
--username <ssh-username> \
--password <ssh-password> \
--verbose  

Clearml-session version 0.4.0

  
  
Posted one year ago

Answers 13


I see now an interesting warning

2023-02-15 12:49:22,813 - clearml - WARNING - Could not retrieve remote configuration named 'SSH'
  
  
Posted one year ago

Hmm, any suggestions on making it more visible, for example in the interface? (I mean, deleting the cache file is always a solution, but it sounded quite painful to debug, hence the question.)

  
  
Posted one year ago

The thing is - when I try to connect with normal SSH there are no issues

ssh user@ip 

I’m trying to connect from Mac to Linux @<1523701070390366208:profile|CostlyOstrich36>

Clearml-agent is installed on another machine in the internal network @<1523701205467926528:profile|AgitatedDove14>

  
  
Posted one year ago

Yes sure - this is what I see in the logs

> Setting up openssh-sftp-server (1:8.2p1-4ubuntu0.5) ...
> Setting up python3-distro (1.4.0-1) ...

Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel
  
  
Posted one year ago

It cached my SSH parameters and finally after removing all of them it worked

  
  
Posted one year ago

It seems like the configuration is cached in a way even when you change the CLI parameters.

@<1523704461418041344:profile|EnormousCormorant39> nice!
Yes, the configuration is cached, so that after you set it once you can just call clearml-session again without all the arguments.
What was the actual issue? Should we add something to the printout?
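
For anyone hitting the same thing, here is a rough sketch of clearing individual cached values instead of deleting the whole cache file. The thread does not name the cache file or its keys, so the path and key names in the usage note are hypothetical placeholders; check where your clearml-session version stores its state before running anything like this.

```python
import json
import shutil
from pathlib import Path


def drop_cached_keys(state_path, keys):
    """Back up a JSON state file, then remove the given top-level keys.

    Returns the list of keys that were actually removed.
    NOTE: the file location and key names are assumptions supplied by
    the caller; nothing here is specific to clearml-session.
    """
    path = Path(state_path).expanduser()
    if not path.is_file():
        return []
    # Keep a copy next to the original so the old state can be restored.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    state = json.loads(path.read_text())
    removed = [key for key in keys if key in state]
    for key in removed:
        del state[key]
    path.write_text(json.dumps(state, indent=2))
    return removed
```

A call would look something like drop_cached_keys("<path-to-cache-file>", ["username", "password"]), with both arguments adjusted to whatever the real file contains.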

  
  
Posted one year ago

I mean SSH through the terminal works fine.
The issue is with Clearml-session.

I tried to remove the username/password and remote-host yesterday but it ended up asking me for the password when connecting and not accepting it.

  
  
Posted one year ago

Sometimes it is working fine, but sometimes I get this error message

@<1523704461418041344:profile|EnormousCormorant39> can I assume there is a gateway at --remote-gateway <internal-ip>?
Could it be that this gateway has a network firewall blocking some of the traffic?
If this is all on the local network, why do you need to pass --remote-gateway?
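
One way to check whether the failures are network related is to probe the gateway repeatedly and see how often a plain TCP connection fails. This is only a minimal sketch using the Python standard library; the function names are made up for illustration, and the port to probe is an assumption (22 is the usual SSH port, but the port clearml-session actually tunnels to may differ).

```python
import socket
import time


def probe_tcp(host, port, attempts=5, timeout=3.0, delay=0.0):
    """Try to open a TCP connection several times; return one bool per attempt."""
    results = []
    for _ in range(attempts):
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(True)
        except OSError:
            results.append(False)
        time.sleep(delay)
    return results


def failure_rate(results):
    """Fraction of failed attempts; 0.0 if there were no attempts."""
    return results.count(False) / len(results) if results else 0.0
```

Running probe_tcp("<internal-ip>", 22, attempts=50, delay=1.0) from the client machine and passing the result to failure_rate gives a rough idea of whether connections drop intermittently even outside clearml-session.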

  
  
Posted one year ago

I finally figured out the issue.
It seems like the configuration is cached in a way even when you change the CLI parameters.
After adding explicit JSON with configuration I managed to run it

  
  
Posted one year ago

image

  
  
Posted one year ago

@<1523701205467926528:profile|AgitatedDove14> @<1523701070390366208:profile|CostlyOstrich36> Thanks for the help

  
  
Posted one year ago

2023-02-15 12:49:22,813 - clearml - WARNING - Could not retrieve remote configuration named 'SSH'

This is fine; it means it uses the default identity keys.

The thing is - when I try to connect with normal SSH there are no issues

Now I'm lost, so when exactly do you see the issue?

  
  
Posted one year ago

Hi @<1523704461418041344:profile|EnormousCormorant39> , is there any chance this could indeed be network related, given that it does manage to work sometimes?

Can you add a larger portion of the log with errors?

Also, what type of machines are these? Linux to Linux?

  
  
Posted one year ago