
Answered

Hi there! I am using a custom ClearML installation in k8s, deployed with the official Helm chart (with some modifications). I am trying to set up training that is launched from an engineer’s local laptop and runs in the k8s cluster using clearml-task. The single-file variant (e.g. train.py) works great, but when I try to use code stored in a Git (Bitbucket) repo I get a repository cloning error, specifically: Host key verification failed.
Could you please advise where and how I can place the SSH key, and where it should be specified in the agent’s configuration so that the command git clone None works? Unfortunately, I couldn’t find this in the documentation.
I mounted the SSH key at /etc/ssh-key/id_rsa (via Kubernetes secrets), but the system doesn’t see it.
Thanks in advance for your help!

P.S. Here is the command I am executing:
clearml-task --project CML_TEST --name hdd_test --repo None --branch master --script hdd/train.py --args trainer.max_epochs=5 trainer=ddp trainer.devices=1 trainer.num_nodes=3 --queue default

And here is the error:
cloning: None
Host key verification failed. fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.
Thanks!

  				
Posted 
	one year ago

					HighKitten20


Answers 5

Hi @HighKitten20

but when I try to use code stored in a GIT (Bitbucket) repo I got a repository cloning error, specifically

Did you configure the git repo application user/password here: None

  				
Posted 
	one year ago

					AgitatedDove14

Hi @AgitatedDove14,
I set force_git_ssh_protocol: true (leaving git_user and git_pass empty) because I want to use SSH-key-based authentication.
The problem is that even when I mount the SSH key into the root home directory (e.g., /root/.ssh/id_rsa with the correct permissions set to 400) I still encounter the same error. I suspect that I might be doing something wrong, or perhaps the open-source version does not support this configuration. Could this be the case?
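For context, "Host key verification failed" is what OpenSSH reports when the remote host's key is missing from (or conflicts with) known_hosts; a rejected private key usually shows up as "Permission denied (publickey)" instead. The sketch below is only an illustration of how a known_hosts file can be checked for a host, covering plain entries and the hashed form written by ssh-keyscan -H. The function names are hypothetical helpers, and wildcard patterns and [host]:port entries are deliberately not handled.

```python
import base64
import hashlib
import hmac
from typing import Iterable


def hashed_entry(host: str, salt: bytes) -> str:
    """Build a hashed known_hosts host field (demo helper, hypothetical name)."""
    digest = hmac.new(salt, host.encode(), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode(), base64.b64encode(digest).decode()
    )


def _hashed_match(field: str, host: str) -> bool:
    # Hashed fields look like "|1|<base64 salt>|<base64 HMAC-SHA1 of host>".
    try:
        _, version, salt_b64, digest_b64 = field.split("|")
        salt = base64.b64decode(salt_b64)
        expected = base64.b64decode(digest_b64)
    except ValueError:
        return False
    if version != "1":
        return False
    actual = hmac.new(salt, host.encode(), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


def host_in_known_hosts(lines: Iterable[str], host: str) -> bool:
    """Return True if any known_hosts line names `host` (exact match only)."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0].startswith("@"):  # skip markers such as @cert-authority
            fields = fields[1:]
        if not fields:
            continue
        hosts_field = fields[0]
        if hosts_field.startswith("|"):
            if _hashed_match(hosts_field, host):
                return True
        elif host in hosts_field.split(","):
            return True
    return False
```

Running this against the known_hosts file inside the task container (for example /root/.ssh/known_hosts) would show whether the Bitbucket host is actually known there, independently of whether the private key is mounted correctly.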

  				
Posted 
	one year ago

					HighKitten20

OK, I managed to do this via the extra_docker_shell_script option in values.yaml:

clearmlConfig: |-
    sdk {
    }
    agent {
        force_git_ssh_protocol: true
        git_user: ""
        git_pass: ""
        extra_docker_shell_script: ["mkdir -p /root/.ssh", "cp /etc/ssh-key/..data/id_ed25519 /root/.ssh/", "ssh-keyscan -H bitbucket.mycompany.com >> /root/.ssh/known_hosts"]
    }
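For anyone adapting this to a different key path or Git host, here is a rough sketch of generating the same command list programmatically, so the quoting stays consistent. The function names are hypothetical, and the chmod 600 step is an addition that is not in the configuration above: OpenSSH refuses private keys whose permissions are too open, so it may be needed if the secret is mounted with a permissive mode.

```python
import json
import shlex
from typing import List


def ssh_bootstrap_commands(
    key_source: str, git_host: str, ssh_dir: str = "/root/.ssh"
) -> List[str]:
    """Shell commands that install a mounted key and trust the Git host."""
    key_name = key_source.rsplit("/", 1)[-1]
    key_target = f"{ssh_dir}/{key_name}"
    known_hosts = f"{ssh_dir}/known_hosts"
    return [
        f"mkdir -p {shlex.quote(ssh_dir)}",
        f"cp {shlex.quote(key_source)} {shlex.quote(key_target)}",
        # Assumption: tighten permissions in case the mounted secret is too open.
        f"chmod 600 {shlex.quote(key_target)}",
        f"ssh-keyscan -H {shlex.quote(git_host)} >> {shlex.quote(known_hosts)}",
    ]


def as_config_list(commands: List[str]) -> str:
    """Render the commands as a JSON list, which HOCON also accepts."""
    return json.dumps(commands)


if __name__ == "__main__":
    cmds = ssh_bootstrap_commands(
        "/etc/ssh-key/..data/id_ed25519", "bitbucket.mycompany.com"
    )
    print("extra_docker_shell_script: " + as_config_list(cmds))
```

The printed line can be pasted into the agent section of clearmlConfig in place of the hand-written list.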

And now I have another issue: how can I run the pip install -e . command? Is it possible to install packages from a local project path?

  				
Posted 
	one year ago

					HighKitten20

The problem is that even when I mount the SSH key into the root home directory (e.g., /root/.ssh/id_rsa with the correct permissions set to 400) I still encounter the same error.

The agent automatically mounts the .ssh folder from the host into the container, making sure all the permissions are set.

how can I run pip install -e .

In general, the agent adds the "working" dir to PYTHONPATH, so you should not have to manually do "-e .".
That said, if you must have it, just add -e . to your "installed packages" section; it should work 🤞
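To illustrate why PYTHONPATH alone is often enough, here is a small self-contained sketch (not ClearML code) showing that a package in a project directory becomes importable once that directory is on PYTHONPATH, without any pip install -e . step. The package name hdd_demo_pkg and the helper names are made up for the demo.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def can_import(module: str, extra_path: Optional[str], cwd: str) -> bool:
    """Try `import module` in a fresh interpreter with an optional PYTHONPATH."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    if extra_path is not None:
        env["PYTHONPATH"] = extra_path
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        env=env,
        cwd=cwd,  # run from an unrelated directory so cwd does not help
        capture_output=True,
    )
    return result.returncode == 0


def demo() -> Tuple[bool, bool]:
    """Return (importable without PYTHONPATH, importable with PYTHONPATH)."""
    with tempfile.TemporaryDirectory() as repo, tempfile.TemporaryDirectory() as elsewhere:
        pkg = Path(repo) / "hdd_demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("VALUE = 1\n")
        without_path = can_import("hdd_demo_pkg", None, elsewhere)
        with_path = can_import("hdd_demo_pkg", repo, elsewhere)
    return without_path, with_path


if __name__ == "__main__":
    print(demo())
```

An editable install is still useful when the project relies on things PYTHONPATH cannot provide, such as console entry points or package metadata.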

  				
Posted 
	one year ago

					AgitatedDove14

Hi Martin! About pip install -e . : I got it working through the agent.package_manager.extra_pip_install_flags setting, as shown below:

agent {
    package_manager: {
        extra_pip_install_flags: ["-e", "."],
    }
}

Thanks for your help!

  				
Posted 
	one year ago

					HighKitten20
