btw a good practice is to keep infrastructure decoupled from applications. What about using https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner ? After applying that chart you can simply use the storage class it generates; wdyt?
you will probably see it’s not capable of doing it, and that should be related to the k8s config
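To make the storage class idea a bit more concrete, here is a minimal sketch that builds a PersistentVolumeClaim manifest pointing at a storage class. The claim name `clearml-data`, the size and the class name `nfs-client` are assumptions for illustration only; check `kubectl get storageclass` for the name your install actually created.

```python
import json


def build_pvc(name, storage_class, size="1Gi", access_mode="ReadWriteMany"):
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Name of the storage class created by the provisioner (assumed here).
            "storageClassName": storage_class,
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical claim name and class name; adjust to your cluster.
    print(json.dumps(build_pvc("clearml-data", "nfs-client", size="5Gi"), indent=2))
```

The JSON it prints can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML.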
yes, exactly, agent creates and manages task pod lifecycle
ok got it, are you able to access the system bypassing nginx with http://<Server Address>:8080 ?
it would be great to get logs from the apiserver and fileserver pods when deleting a file from the UI, so we can see what is going on. I’m saying this because, at first glance, I don’t see any issue in your config
Today I’m OOO but I can give an initial suggestion: when dealing with resource usage issues, logs are important but metrics can help a lot more. If you don’t have one, install a Grafana stack so we can see the resource metric history before the OOM. This helps to understand whether we are really using a lot of RAM or the problem is somewhere else.
if you do a kubectl get svc in the namespace you should see the svcs of apiserver, webserver and fileserver
accessing apiserver from a pod doesn’t require kubeconfig
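A rough sketch of why no kubeconfig is needed, assuming "apiserver" here means the Kubernetes API server: inside a pod the API address comes from environment variables and the credentials from the mounted service account. The helper name `in_cluster_settings` is made up for illustration; the env variable names and the mount path are the standard Kubernetes ones.

```python
import os

# Standard location where Kubernetes mounts the pod's service account.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def in_cluster_settings(env=None, sa_dir=SA_DIR):
    """Collect what a client needs to reach the API server from inside a pod."""
    env = os.environ if env is None else env
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env.get("KUBERNETES_SERVICE_PORT", "443")
    if ":" in host:
        # IPv6 addresses need brackets inside a URL.
        host = f"[{host}]"
    return {
        "server": f"https://{host}:{port}",
        "token_file": f"{sa_dir}/token",   # bearer token for the service account
        "ca_file": f"{sa_dir}/ca.crt",     # CA bundle to verify the API server
    }


if __name__ == "__main__":
    # Fake values so the sketch also runs outside a cluster.
    fake_env = {"KUBERNETES_SERVICE_HOST": "10.96.0.1", "KUBERNETES_SERVICE_PORT": "443"}
    print(in_cluster_settings(fake_env))
```

What the pod is then allowed to do still depends on the RBAC rules bound to its service account.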
there are workarounds tbh but they are tricks that require a lot of k8s expertise and they are risky
if it will not be updated and CI passed, I will have to create a new one when possible but I don’t have a timeframe for now
you will need to upgrade clearml helm chart
Ty, I have other stuff that I'd like to send but it's better to get these merged first so I can proceed with shiny new PRs in the near future 😄
Hi everyone, I just fixed releases so new charts containing this fix are published. ty!
our data engineers write code directly in PyCharm and test it on the fly with breakpoints. when it's good we simply commit to git and set a "prod ready" tag
can you also show the output of kubectl get po for the namespace where you installed clearml?
I can add some configurable value then ASAP 👍 will do in the next few days
Hi BeefyHippopotamus73 , on EKS it’s preferable to use ALB but you can also work with your nginx. You need DNS records for the hostnames you set up, pointing to that External IP. If you just need to test, you can simply add entries to your client machine's /etc/hosts file (if you are on *nix)
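For the /etc/hosts test route, something like the sketch below can generate the lines to paste. The IP `203.0.113.10` and the three hostnames are placeholders I made up; use the External IP of your nginx service and the hostnames you configured in the chart.

```python
def hosts_entries(external_ip, hostnames):
    """Return one /etc/hosts line per hostname, all pointing at the same IP."""
    return [f"{external_ip}\t{name}" for name in hostnames]


if __name__ == "__main__":
    # Placeholder IP and hostnames, for illustration only.
    names = [
        "app.clearml.example.com",
        "api.clearml.example.com",
        "files.clearml.example.com",
    ]
    for line in hosts_entries("203.0.113.10", names):
        print(line)
```

Append the printed lines to /etc/hosts on the client machine (needs root), and remove them again once real DNS records are in place.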
ok, will try to find a solution then, ty
Interesting use case, maybe we can create multiple k8s agents for different queues
can you post the output of kubectl get po -A -n clearml pls?
If I'm not wrong you can simply label the namespace to keep istio from getting in there
elastic is not being scheduled
this is the PR https://github.com/allegroai/clearml-helm-charts/pull/66 ; once CI passes I’m going to release the new chart