Hi, I have run k8s_glue_example.py on my on-prem K8s, and have preconfigured NodePort services. I succeeded in using clearml-session to create pods, but the SSH tunneling failed. It tried to connect to the ClusterIP of the pod and port 10020 instead of the node IP and
Hi @<1523701087100473344:profile|SuccessfulKoala55> , even when I run clearml-session with the command-line options --remote-ssh-port and --remote-gateway, the SSH tunneling still fails.
These are my complete steps:
- set up the k8s service with the following yml:
kind: Service
apiVersion: v1
metadata:
  name: clearml-agent-1-nodeprot
  namespace: clearml
spec:
  ports:
    - name: clearml-agent-ssh
      port: 10022
      targetPort: 10022
      nodePort: 31919
  type: NodePort
  selector:
    ai.allegro.agent.serial: pod-1
- run
python k8s_glue_example.py --queue gpu-1 --ports-mode --template-yaml gpu-1.yml
on the k8s node.
- run
clearml-session --docker nvidia/cuda:11.0.3-runtime-ubuntu20.04 --remote-gateway 10.190.253.18 --remote-ssh-port 31919
on my PC. 10.190.253.18 is the IP of the node where the session pod is running.
- the clearml-session log on my PC:
Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel to root@10.190.253.18, port 31919
SSH tunneling failed, retrying in 3 seconds
Starting SSH tunnel to root@10.190.253.18, port 31919
.......
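To narrow down where the tunnel fails, a quick TCP reachability check against the gateway and NodePort from the log above may help. The following is only a minimal sketch using the Python standard library: the helper name port_is_reachable is hypothetical, and the host and port are the values from this post. A successful connection only shows that something accepts TCP connections on that port; it does not prove that an SSH server in the session pod is answering.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only tests the TCP handshake,
    not whether an SSH server is behind the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Values taken from the clearml-session command in this post.
    gateway = "10.190.253.18"
    ssh_node_port = 31919

    if port_is_reachable(gateway, ssh_node_port):
        print(f"{gateway}:{ssh_node_port} accepts TCP connections")
    else:
        print(f"{gateway}:{ssh_node_port} is not reachable from this machine")
```

If the port turns out to be unreachable, the problem is likely on the network or Service side (firewall, selector, or target port) rather than in clearml-session itself. One thing that might be worth comparing, though it is not verified here, is the Service targetPort (10022) against the port the SSH server actually listens on inside the pod, since the connection attempt described above mentions port 10020.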
Could you provide a complete example or tutorial?
one year ago