Just a quick suggestion, since I have some more insight into the situation. Maybe you can look at Velero; it should be able to migrate the data. If not, you can simply create a fresh install, scale everything to zero, then create a debug pod that mounts both the old and the new PVC and copy the data between the two. It's more complex to describe than to do.
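The debug pod idea could be sketched roughly like this. It is only an illustration: the PVC names (`old-data-pvc`, `new-data-pvc`), the pod name, the `clearml` namespace default and the `busybox` image are assumptions, not values taken from the chart. The script just prints a Pod manifest as JSON, which kubectl accepts the same way as YAML.

```python
import json


def build_debug_pod(old_pvc, new_pvc, namespace="clearml"):
    """Return a Pod manifest (as a dict) that mounts both PVCs side by side."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "pvc-copy-debug", "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "copy",
                    # Assumption: any small image that ships `cp` will do.
                    "image": "busybox:1.36",
                    # Keep the pod alive so you can exec in and copy by hand.
                    "command": ["sleep", "3600"],
                    "volumeMounts": [
                        # Old data is mounted read-only to avoid accidents.
                        {"name": "old", "mountPath": "/old", "readOnly": True},
                        {"name": "new", "mountPath": "/new"},
                    ],
                }
            ],
            "volumes": [
                {"name": "old", "persistentVolumeClaim": {"claimName": old_pvc}},
                {"name": "new", "persistentVolumeClaim": {"claimName": new_pvc}},
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical PVC names; replace them with the ones in your cluster.
    print(json.dumps(build_debug_pod("old-data-pvc", "new-data-pvc"), indent=2))
```

If you save it as, say, make_debug_pod.py, the output can be piped into `kubectl apply -f -`, and once the pod is running something like `kubectl exec -n clearml pvc-copy-debug -- cp -a /old/. /new/` should copy the data. Both PVCs must be mountable by the same pod, which is why everything else is scaled to zero first.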
I suggest you exec into the pod and run the command kubectl delete pod -l=CLEARML=agent-74b23a8f --namespace=clearml --field-selector=status.phase!=Pending,status.phase!=Running --output name
so you can see the output from inside the pod. This should help you understand what is going on with the command.
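If you prefer to script that check instead of typing it by hand, a minimal sketch could look like the one below. The kubectl flags are exactly the ones from the command above; the function names `build_cleanup_command` and `run_cleanup` are made up for this example, and it assumes `kubectl` is on the PATH inside the pod.

```python
import subprocess
from typing import List, Tuple


def build_cleanup_command(agent_label: str, namespace: str) -> List[str]:
    """Build the same kubectl invocation as an argument list (no shell quoting needed)."""
    return [
        "kubectl", "delete", "pod",
        f"-l=CLEARML={agent_label}",
        f"--namespace={namespace}",
        # Only pods that are neither Pending nor Running are selected.
        "--field-selector=status.phase!=Pending,status.phase!=Running",
        "--output", "name",
    ]


def run_cleanup(agent_label: str, namespace: str) -> Tuple[int, str, str]:
    """Run the command and return (exit code, stdout, stderr) for inspection."""
    result = subprocess.run(
        build_cleanup_command(agent_label, namespace),
        capture_output=True,
        text=True,
        check=False,  # do not raise: a failing exit code is what we want to look at
    )
    return result.returncode, result.stdout, result.stderr
```

Calling `run_cleanup("agent-74b23a8f", "clearml")` from inside the pod returns the exit code together with whatever kubectl printed, so an authorization or connection error shows up in stderr instead of being lost.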
ok so they are executed as expected
When a task starts, do you see a clearml-id-* pod starting?
You will probably see that it's not able to do it, and that should be related to the k8s config.
Can the k8s cluster access the Ubuntu archive?
additionalConfigs:
  auth.conf: |
    auth {
      # Fixed users login credentials
      # No other user will be able to login
      fixed_users {
        enabled: true
        pass_hashed: false
        users: [
          {
            username: "jane"
            password: "12345678"
            name: "Jane Doe"
          },
          {
            username: "john"
            password: "12345678"
            name: "John Doe"
          },
        ]
      }
    }
so you should be able to pass additional configuration in this field directly when applying the Helm chart
Later today I will also push a new clearml chart that no longer contains the k8s glue, since it's now in the clearml-agent chart; this is why I was suggesting you use that chart :)
I don't think it's related to how the agent talks to the apiserver or fileserver. It's more related to the fact that the kubectl inside the agent pod cannot contact the Kubernetes API server
but it's just a quick guess, not sure if I'm right
because kubectl inside the pod uses the in-pod method
in values.yaml I guess apiServerUrlReference is wrong
if it returns 503, it's not the network but something on top of it
just my two cents
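To tell "network problem" apart from "something on top of it", one option is to probe the Kubernetes API server from inside the agent pod with the same in-pod credentials kubectl would use. This is only a rough sketch: the service account path and the `KUBERNETES_SERVICE_HOST` / `KUBERNETES_SERVICE_PORT` variables are the standard ones Kubernetes injects into pods, while the helper names and the choice of the `/version` endpoint are picked just for this example.

```python
import os
import ssl
import urllib.error
import urllib.request

# Standard location where Kubernetes mounts the pod's service account files.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def in_pod_api_url(env=None, path="/version"):
    """Build the API server URL from the env vars injected into every pod."""
    env = os.environ if env is None else env
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env.get("KUBERNETES_SERVICE_PORT", "443")
    return f"https://{host}:{port}{path}"


def probe(url, sa_dir=SA_DIR, timeout=5.0):
    """Return a short description of what happened when calling the API server."""
    with open(os.path.join(sa_dir, "token")) as fh:
        token = fh.read().strip()
    ctx = ssl.create_default_context(cafile=os.path.join(sa_dir, "ca.crt"))
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
            return f"reachable, HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # An HTTP error (401, 403, 503, ...) still means the network path works.
        return f"reachable, HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, timeout, connection refused, TLS failure and so on.
        return f"not reachable: {exc}"
```

Running `print(probe(in_pod_api_url()))` inside the pod gives a quick answer: any HTTP status, even an error one, means the connection itself is fine and the problem sits higher up (auth, RBAC, a proxy in between), while "not reachable" points at networking or DNS.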
ok, the issue must be there: after the first creation nothing is there
In fact, it's the same thing we are applying to the Helm charts for k8s
Just to be sure we are in sync 😁
❯ clearml-task --version
ClearML launch - launch any codebase on remote machine running clearml-agent
usage: clearml-task [-h] [--version] [--project PROJECT] --name NAME [--repo REPO] [--branch BRANCH]
[--commit COMMIT] [--folder FOLDER] [--script SCRIPT] [--cwd CWD] [--args [ARGS [ARGS ...]]]
[--queue QUEUE] [--requirements REQUIREMENTS] [--packages [PACKAGES [PACKAGES ...]]]
[--docker DOCKER] [--docker_args DOCKER_ARGS]
...
from /
to /debug.ping
I'm going to ask for an update to the docs