Thanks! And how can I validate that it properly connects?
For us it is both - having the process/pipeline presented in a clear UI, and the ability to trigger it, e.g. every evening.
In addition, tools like Dagster offer code organization and a separation of the code itself from the data and the configuration, so that we can use the same data/ML pipeline for different use-cases.
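To illustrate the separation idea in plain Python (this is a hedged sketch, not Dagster's actual API - all names here are made up for illustration): the pipeline code only ever sees a config object, so reusing it for a different use-case means swapping the config, not the code.

```python
# Illustrative sketch of separating pipeline code from data/config.
# PipelineConfig and run_pipeline are hypothetical names, not a real API.
from dataclasses import dataclass

@dataclass
class PipelineConfig:
    data_path: str        # data location is injected, never hard-coded
    learning_rate: float
    epochs: int

def run_pipeline(config: PipelineConfig) -> dict:
    # The pipeline body depends only on the config it receives,
    # so the same code serves any use-case that supplies a config.
    return {
        "data": config.data_path,
        "steps": config.epochs,
        "lr": config.learning_rate,
    }

# Two use-cases, one pipeline:
nightly = PipelineConfig(data_path="/data/nightly", learning_rate=1e-3, epochs=5)
adhoc = PipelineConfig(data_path="/data/adhoc", learning_rate=1e-4, epochs=1)
```

The nightly config is the one a scheduler (e.g. an evening trigger) would pass in; the ad-hoc one is for one-off runs.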
The manual clearml.conf worked 🙂 thanks for this!!
I wonder what we did to reach it, though... We may have flooded it at some point.
I'll continue reporting if it happens again
Hi SuccessfulKoala55 , thanks for assisting. Yes, we used Helm to install it. It isn't the latest version, though - we installed it a month or two ago.
The version we're using is: 1.1.1-135 • 1.1.1 • 2.14
Hi SuccessfulKoala55 , thank you for the clarification. So how can I change it to locate the API on our k8s service? If it's nginx, I guess I'll have to configure it manually...?
clearml.conf is the file that clearml-init is supposed to create, right?
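For reference, a sketch of the api section that clearml-init writes into clearml.conf (all hostnames, ports, and keys below are placeholders - substitute your own server addresses and credentials):

```
api {
    # Placeholder URLs - point these at your self-hosted services
    web_server: http://clearml.example.com:8080
    api_server: http://clearml.example.com:8008
    files_server: http://clearml.example.com:8081
    credentials {
        "access_key" = "YOUR_ACCESS_KEY"
        "secret_key" = "YOUR_SECRET_KEY"
    }
}
```

With the subdomain/Ingress setup, the three URLs would typically become separate subdomains instead of ports on one host.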
Ingress is enabled - I can't control the ports in Ingress, so I had to use the subdomain method.
Thank you both so much for the efforts to fix it 🙂
One of my colleagues once ran some training with tons of data in the git folder that was not .gitignored - so I suspect it's related to that.
I'm also currently in a similar process, and giving http://dagster.io a shot
and the storage class name (I hope that's what you meant, SuccessfulKoala55 ) is
I have also no idea how it happened.
I managed to redeploy it and it seems to be accessible now
Sure, with pleasure. However, we're using a self-hosted (on premise) version of ClearML...
Thanks! I'll check it 🙂
hmm... the volume is already attached - already used by clearml-fileserver ... so it fails on this
Yup, these were the values I was missing! 🙏 Thank you so much!
I am very new to all these things, and I didn't even use the Helm chart (stupid me...). Only after you asked did I look into it and see that I could have used it, and that it would have saved me so much time 😛
Well, we always get smarter by learning from these experiences. So, next time... 🙂
After a restart, that seems to have helped, thanks! 🙂
Now we just need to solve the compilation of the git source installation... It would have been nice to have the find-links for the wheels, but I understand it's unstable in terms of reproducibility.
I've only set force_git_ssh_protocol to false, but kept the other settings - which are simply set to some port and 'git'. It didn't work, unfortunately... I could not connect to GitHub with that port (connection timed out).
If I change or remove the port, I can't clone the whole project, so I don't even reach the installation of the detectron part.
Hi AgitatedDove14 , thanks for the quick response.
I didn't set git_host, only... If I understand correctly, git_password and the like are all for HTTP, and we're using SSH to clone the project through the agent.
Exactly - we need a mixed behavior!
We host in a private (self-hosted) GitLab, which can only be cloned through SSH, and we would like to import packages by compiling them from a public GitHub (or by installing them as wheels with find-links).
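If I understand the agent configuration correctly, the mixed setup would live in the agent section of clearml.conf - a hedged sketch (the values are placeholders; I'm assuming the standard agent.* keys):

```
agent {
    # SSH is used for cloning the private GitLab repo
    force_git_ssh_protocol: true
    # git_user / git_pass only apply to HTTP(S) clones,
    # so they stay empty in an SSH-only setup
    git_user: ""
    git_pass: ""
}
```

The open question is whether pip's compile-from-GitHub step can still go over HTTPS while the repo clone itself stays on SSH.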
Yes, I think - however - that it is our overly strict security policy that doesn't let anyone into the repo. We're lucky that they let the developers see their own code...
We have defined an SSH key for ClearML, and it is also set in /clearml-agent/.ssh/config , and it still can't clone it. So it must be some internal security issue...
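For reference, a minimal sketch of what such an entry in /clearml-agent/.ssh/config might look like (the host, port, and key path are placeholders for our actual values):

```
# Placeholder host entry for the self-hosted GitLab
Host gitlab.example.com
    User git
    Port 22
    IdentityFile /clearml-agent/.ssh/id_rsa
    IdentitiesOnly yes
```

A quick way to test it outside the agent is `ssh -T git@gitlab.example.com`, which should greet you without asking for a password if the key is accepted.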
We use SSH so that we can easily access it without storing a name/password, and without everyone who uses the code having to set up their credentials in env vars and such in advance...
Sorry - my bad.
It did not work.
When I don't set anything in clearml.conf , it takes the repo from the cache. When I delete the cache, it can't get the repo any longer.
I did not use Chart nor Helm.
We just don't have enough credentials here to use those. I had to install everything manually, service by service.