you can create a specific config like one in https://clear.ml/docs/latest/docs/integrations/storage/
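For reference, a minimal sketch of what such a storage section can look like in clearml.conf (all key/secret/region values here are placeholders, assuming an S3-style bucket as in that docs page):
```
# clearml.conf — illustrative sketch only, values are placeholders
sdk {
    aws {
        s3 {
            # default credentials used for buckets not listed explicitly
            key: "AWS_ACCESS_KEY"
            secret: "AWS_SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```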
I finally found where the issue is, I opened an issue on gh so it's more manageable:
https://github.com/allegroai/clearml/issues/273
I also found a not so good (at least for me) behaviour:
https://github.com/allegroai/clearml/issues/272
ya sure, I was referring to creating a new PVC just for the test
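Just for illustration, a throwaway PVC manifest for such a test could look like this (name, size and storage class are placeholders):
```yaml
# test-pvc.yaml — hypothetical PVC just for the test
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clearml-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName: <your-storage-class>   # set this if you don't want the default class
```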
some suggestions:
- start working just with ClearML (no agent or serving, these will go in after ClearML is working)
- try a first deploy without any override
- if it works, start adding values to the override file (without reporting everything or it will be very difficult to debug; you should not put in the override file what is not overridden)
- do helm upgrade
- check problems one by one
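A rough sketch of that workflow, assuming the allegroai chart repo and a `clearml` release name/namespace (both placeholders):
```bash
# 1. first deploy without any override
helm repo add allegroai https://allegroai.github.io/clearml-helm-charts
helm repo update
helm install clearml allegroai/clearml -n clearml --create-namespace

# 2. once it works, keep only the values you actually override in override.yaml
helm upgrade clearml allegroai/clearml -n clearml -f override.yaml
```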
this one should not be needed for asyncdelete, what is the error you are getting?
(and any queue has its own basePodTemplate)
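As an illustration only (the exact key names depend on your clearml-agent chart version, so treat them as assumptions): the idea is that each queue gets its own pod template overrides on top of the global base, roughly like:
```yaml
# clearml-agent chart values — illustrative sketch, key names may differ per chart version
agentk8sglue:
  basePodTemplate:            # defaults applied to pods from every queue
    resources:
      limits:
        cpu: "1"
  queues:
    gpuQueue:
      templateOverrides:      # settings specific to this queue only
        resources:
          limits:
            nvidia.com/gpu: 1
    cpuQueue:
      templateOverrides: {}
```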
From a k8s perspective a pod is ephemeral, so if it’s gone for any reason it’s gone. Obviously there are structures that can ensure running state (like Deployments or StatefulSets) so if a pod dies, another one takes its place. We didn’t go in this direction because pods are not idempotent, so it’s not straightforward to simply replace them. Btw this looks like an interesting topic to me so I’d like to include SuccessfulKoala55 on this, also because I’m involved more in the infra side of the equation and I ma...
then I enqueue it and it's created but obv empty
“You can’t write on readonly replica” is about MongoDB. I guess you are using a multiple-replica setup. In this case the mongodb dependency chart has a lot of parameters to tweak the system and maybe an arbiter is also good for you. But this is a huge topic regarding MongoDB-specific k8s setups.
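If it helps, a sketch of the relevant knobs assuming the mongodb dependency is the bitnami chart (adapt the top-level key to how the dependency is named in your values file):
```yaml
mongodb:
  architecture: replicaset   # multi-replica setup; writes only go to the primary
  replicaCount: 3
  arbiter:
    enabled: true            # arbiter only votes in elections, it holds no data
```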
It happened to me when trying many installations; can you log in using the http://app.clearml.home.ai/login URL directly?
Just a quick suggestion since I have some more insight on the situation. Maybe you can look at Velero, it should be able to migrate data. If not, you can simply create a new fresh install, scale everything to zero, then create a debug pod mounting the old and new PVCs and copy data between the two. It’s more complex to say than to do.
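For the second option, a sketch of such a debug pod (the claim names clearml-data-old / clearml-data-new are placeholders for your actual PVCs):
```yaml
# debug-pod.yaml — mounts the old and new PVCs so data can be copied between them
apiVersion: v1
kind: Pod
metadata:
  name: pvc-copy
spec:
  containers:
    - name: copy
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - { name: old, mountPath: /old }
        - { name: new, mountPath: /new }
  volumes:
    - name: old
      persistentVolumeClaim:
        claimName: clearml-data-old
    - name: new
      persistentVolumeClaim:
        claimName: clearml-data-new
```
then something like `kubectl exec -it pvc-copy -- cp -a /old/. /new/` to copy the data, and delete the pod when done.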
if you do a kubectl get svc in the namespace you should see the svc of the apiserver, webserver and fileserver
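e.g. something like this (the namespace name is a placeholder):
```bash
kubectl get svc -n clearml
```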
I don’t think you need to pass these env vars in extraEnvs; the references are automatically generated by the chart. After removing them, pls post the webserver pod logs here and let’s see if we can spot the issue, ty.
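a sketch of grabbing those logs, assuming everything lives in a `clearml` namespace (placeholder):
```bash
kubectl -n clearml get pods                      # find the webserver pod name
kubectl -n clearml logs <webserver-pod-name>     # paste this output here
```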
if mounts are already there on every node you can also mount a specific folder directly on the nodes and then use the Rancher local-path provisioner
btw a good practice is to keep infrastructural stuff decoupled from applications. What about using https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner ? After applying that chart you can simply use the generated storage class; wdyt?
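A sketch of installing it, following the pattern from that repo’s README (the NFS server address and export path are placeholders):
```bash
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=10.0.0.5 \
  --set nfs.path=/exported/path
```
after that you just reference the generated storage class in the PVCs instead of wiring NFS into the application charts.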
otherwise yes, if this is not an option, you can also mount what is already existing, so pls open an issue in the new helm chart repo and we can find a solution
Hi BurlySeagull48, I’m interested in your use case and I think we can find a solution. Do the NFS mounts have the same path on every node?
if you already have data over there you may import it
I think we can find a solution pretty quickly after some checks. Can you pls open an issue on the new helm chart repo so I can take care of it in the coming days?
Or do you want to dynamically mount an nfs endpoint directly? (I understood you need this one)
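For the “mount what is already existing” route, a sketch of a static NFS PV/PVC pair (server, path, size and names are placeholders):
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: clearml-nfs-pv
spec:
  capacity:
    storage: 50Gi
  accessModes: [ReadWriteMany]
  nfs:
    server: 10.0.0.5
    path: /exports/clearml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clearml-nfs-pvc
spec:
  accessModes: [ReadWriteMany]
  storageClassName: ""        # bind to the static PV above, not a dynamic provisioner
  resources:
    requests:
      storage: 50Gi
```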
ok, will try to find a solution then, ty
Ok, I’d like to test it more with you; credentials exposed in chart values are system ones and it’s better not to change them; let’s forget about them for now. If you create a new access key/secret key pair in the UI, you should use those in your agents and they should not get overwritten in any way; can you confirm it works without touching the credentials section?
you can simply generate the keys in the UI
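Once generated in the UI, a sketch of where those credentials would go for an agent (server URLs and keys here are placeholders); alternatively the same values can be passed via the CLEARML_API_ACCESS_KEY / CLEARML_API_SECRET_KEY env vars:
```
# clearml.conf — illustrative sketch, values are placeholders
api {
    web_server: http://app.example.com
    api_server: http://api.example.com
    files_server: http://files.example.com
    credentials {
        "access_key" = "GENERATED_ACCESS_KEY"
        "secret_key" = "GENERATED_SECRET_KEY"
    }
}
```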