
After a restart, that seems to have helped, thanks! 🙂
Now we just need to solve the compilation of the git source installation... it would have been nice to have the find-links
for the wheels, but I understand it's unstable in terms of reproducibility.
I've only set force_git_ssh_protocol
to false, but kept the force_git_ssh_port
/ force_git_ssh_user
- which are set simply to some port and 'git'. It didn't work, unfortunately... could not connect to GitHub with the port (connection timed out).
If I change or remove the port, I can't clone our own project at all, so I don't even reach the installation of the detectron part.
A public one is easy:
https://github.com/facebookresearch/detectron2
The internal one is something like:
ssh://repo.our-domain.com:1234
We use that ssh so that we can easily access it, without needing to store a username/password, and without requiring everyone who uses the code to set up their credentials in ENV vars and such in advance...
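Just to make the mix concrete - a rough sketch of how the two could appear in a requirements file; the internal repo path (/team/our-lib.git) is a made-up placeholder, only the host and port come from above:

# public dependency: detectron2 compiled from GitHub over HTTPS
git+https://github.com/facebookresearch/detectron2.git
# internal dependency: cloned over SSH on the custom port
# (the /team/our-lib.git path is hypothetical)
git+ssh://git@repo.our-domain.com:1234/team/our-lib.git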
Sorry - my bad.
It did not work.
When I didn't set anything in the clearml.conf
- it took the repo from the cache. When I deleted the cache, it couldn't get the repo any longer.
Yes - I think, however, that it's our overly strict security policy that doesn't let anyone access the repo. We're lucky they let the developers see their own code...
We have defined an SSH key for clearml, and it is also set in /clearml-agent/.ssh/config
, and it still can't clone it. So it must be some internal security issue...
Exactly - we need a mixed behavior!
We host on a private (self-hosted) GitLab, which can only be cloned over SSH, and we would like to import packages by compiling them from a public GitHub (or by installing them as wheels with find-links).
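I'm wondering if that mixed behavior could be pushed into the agent's SSH config instead of the global force_git_ssh_port - a minimal sketch of what I have in mind, using the host and port from above (the key file name is made up):

# /clearml-agent/.ssh/config - per-host settings, so the custom port
# only applies to the internal GitLab, not to github.com
Host repo.our-domain.com
    Port 1234
    User git
    IdentityFile ~/.ssh/id_clearml   # hypothetical key name

Host github.com
    User git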
Hi AgitatedDove14 , thanks for the quick response.
I didn't set git_host
, only:
force_git_ssh_protocol: true
force_git_ssh_port: ...
force_git_ssh_user: ...
If I understand correctly, git_host
/ git_user
/ git_pass
are all for HTTP, and we're using SSH to clone the project through the agent.
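For completeness, the agent section of our clearml.conf looks roughly like this (the actual port and user are redacted; the commented HTTP keys are just there for contrast):

agent {
    # SSH cloning: these apply to every repo the agent clones,
    # which matches the timeout we see when pulling from github.com
    force_git_ssh_protocol: true
    force_git_ssh_port: ...
    force_git_ssh_user: ...

    # HTTP cloning alternative - not set in our case
    # git_host: ""
    # git_user: ""
    # git_pass: ""
}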
Thank you, SuccessfulKoala55 !
yes, we're "fighting" here to set up the ES on our local K8s through Rancher ( https://rancher.com/ ). Is it mandatory, for example, to label the node app=clearml?
and the storage class name (I hope that's what you meant, SuccessfulKoala55 ) is ceph-c2-prod-rz01-cephfs
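In case that app=clearml label is indeed required by the chart's nodeSelector, I guess applying it would be something like this (node name is a placeholder):

# label a node so pods with nodeSelector app=clearml can schedule on it
kubectl label nodes <node-name> app=clearml
# verify:
kubectl get nodes --show-labels | grep clearml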
The version we're using is: WebApp 1.1.1-135 • Server 1.1.1 • API 2.14
Hi SuccessfulKoala55 , thanks for assisting. Yes, we used Helm to install it. It isn't the latest version though - we installed it a month or two ago.
hmm... the volume is already attached - already used by clearml-fileserver ... so it fails on this
Thank you both so much for the efforts to fix it 🙂
One of my colleagues once ran some training with tons of data in the git folder that was not .gitignored - so I suspect it's related to this.
I'll continue reporting if it happens again
I wonder what we did to reach it, though... Could be we flooded it at some point.
I also have no idea how it happened.
I managed to redeploy it and it seems to be accessible now
Sure, with pleasure. However, we're using a self-hosted (on-premises) version of ClearML...
I'm also currently in a similar process, and giving a shot to Dagster ( http://dagster.io )
Yup, these were the values I was missing! 🙏 Thank you so much!
I am very new to all these things, and I didn't even use the Helm chart (stupid me...). Only after you asked did I check it out, and saw I could have made use of it, and that it could have saved me so much time 😛
Well, we always get smarter by learning from these experiences. So, next time... 🙂
Hi SuccessfulKoala55 , thank you for the clarification. So how can I change it to point the API at our k8s service? If it's nginx, I guess I'll have to configure it manually...?
Ingress is enabled - I can't control the ports in Ingress, so I had to use the subdomain method.
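To sketch what I mean by the subdomain method - the hostnames here are placeholders on our domain, and the service names/ports are my assumption from the usual chart defaults (only clearml-fileserver is confirmed above):

# hypothetical Ingress: one subdomain per ClearML service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: clearml-ingress
spec:
  rules:
    - host: app.our-domain.com        # web UI
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: { name: clearml-webserver, port: { number: 80 } }
    - host: api.our-domain.com        # API server
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: { name: clearml-apiserver, port: { number: 8008 } }
    - host: files.our-domain.com      # file server
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: { name: clearml-fileserver, port: { number: 8081 } }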
Thanks! And how can I validate it - that it properly connects?
clearml.conf is the file that clearml-init
is supposed to create, right?
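For reference, the api section that clearml-init writes into clearml.conf should look roughly like this - the hostnames below are placeholders following the subdomain scheme, and the credentials are elided. As a smoke test, I guess running any script with Task.init() and seeing the task appear in the UI would validate the connection:

api {
    web_server: http://app.our-domain.com
    api_server: http://api.our-domain.com
    files_server: http://files.our-domain.com
    credentials {
        "access_key" = "..."
        "secret_key" = "..."
    }
}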
I used neither a Chart nor Helm.
We just don't have enough credentials here to use those - I had to install everything manually, service by service.
For us it is both - having the process/pipeline presented in a clear UI, and the ability to trigger it, e.g. every evening.
In addition, tools like Dagster offer code organization and a separation of the code itself from the data and the configuration, so that we can use the same data/ML pipeline for different use cases.
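To illustrate the kind of separation I mean (a toy sketch - the op names and config keys are hypothetical, not our actual pipeline):

from dagster import job, op

@op(config_schema={"data_path": str})
def load_data(context) -> str:
    # the data location comes from run config, not from code
    return context.op_config["data_path"]

@op
def train(context, data_path: str) -> None:
    # the training logic stays identical across use cases
    context.log.info(f"training on {data_path}")

@job
def ml_pipeline():
    train(load_data())

# the same job runs against different data per use case:
# ml_pipeline.execute_in_process(
#     run_config={"ops": {"load_data": {"config": {"data_path": "s3://..."}}}}
# )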
The manual clearml.conf worked 🙂 thanks for this!!