Unanswered
Hi, I have run k8s_glue_example.py on my on-prem K8s and have preconfigured NodePort services. I succeeded in using clearml-session to create pods, but the SSH tunneling failed. It tried to connect to the ClusterIP of the pod and port 10020 instead of the node IP and
Hi SuccessfulKoala55, even when I run clearml-session with the command-line options --remote-ssh-port and --remote-gateway, the SSH tunneling still fails.
Here are my complete steps:
- Set up a k8s Service with the following YAML:
  kind: Service
  apiVersion: v1
  metadata:
    name: clearml-agent-1-nodeprot
    namespace: clearml
  spec:
    ports:
      - name: clearml-agent-ssh
        port: 10022
        targetPort: 10022
        nodePort: 31919
    type: NodePort
    selector:
      ai.allegro.agent.serial: pod-1
- Run
  python k8s_glue_example.py --queue gpu-1 --ports-mode --template-yaml gpu-1.yml
  on the k8s node.
- Run
  clearml-session --docker nvidia/cuda:11.0.3-runtime-ubuntu20.04 --remote-gateway 10.190.253.18 --remote-ssh-port 31919
  on my PC. 10.190.253.18 is the IP of the node where the session pod is running.
- The clearml-session log on my PC:
Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel to root@10.190.253.18, port 31919
SSH tunneling failed, retrying in 3 seconds
Starting SSH tunnel to root@10.190.253.18, port 31919
.......
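To narrow down where the tunnel breaks, it may help to first check whether the NodePort accepts plain TCP connections from the PC at all. The following is a minimal sketch using only the Python standard library; it is not part of ClearML, the helper name tcp_port_open is hypothetical, and the IP and port are simply the values from the steps above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Node IP and NodePort taken from the steps above.
    node_ip, node_port = "10.190.253.18", 31919
    state = "reachable" if tcp_port_open(node_ip, node_port) else "NOT reachable"
    print(f"{node_ip}:{node_port} is {state}")
```

If the port is not reachable, the problem is likely on the Kubernetes side (Service selector, pod labels, or a firewall) rather than in clearml-session. If it is reachable, the failure is more likely in the SSH handshake itself. Either reading is only a rough guide, since this check cannot tell whether an SSH server is actually behind the port.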
Could you provide a complete example or tutorial?
one year ago