
do I need something else in the clearml.conf?
Thanks Valeriano, so by copying the .kube/config file from a node where kubectl already works, I could run kubectl commands correctly
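For reference, this is roughly what that looks like (just a sketch; `my-k8s-node` and the paths are examples):
```bash
# Copy the kubeconfig from a node that can already reach the cluster
scp my-k8s-node:~/.kube/config ~/.kube/config

# Verify that kubectl now talks to the cluster
kubectl cluster-info
kubectl get nodes
```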
I am not aware of how clearml-dataset works, but I'll have a look 🙂
but I can confirm that adding the requirement with Task.add_requirements() does the trick
but I was a bit set off track seeing errors in the logs
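In case it is useful to someone else, this is the kind of call I mean (a minimal sketch; the package name and version are just examples):
```python
from clearml import Task

# Tell the agent to install the extra package when it reproduces the task;
# this has to be called before Task.init()
Task.add_requirements("pandas", "1.4.2")

task = Task.init(project_name="examples", task_name="requirements demo")
```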
I see in bitnami's gh-pages branch a file https://github.com/bitnami-labs/sealed-secrets/blob/gh-pages/index.html to do the redirect that contains:
```html
<html>
<head> <meta http-equiv="refresh" content="0; url=..."> </head>
<p><a href="...">Redirect to repo index.yaml</a></p>
</html>
```
A similar file is missing in the `clearml-helm-chart` `gh-pages` branch.
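Something along these lines in the `clearml-helm-chart` `gh-pages` branch would presumably do the same job (just a sketch; the actual index.yaml URL of the chart repo has to be filled in):
```html
<html>
<head> <meta http-equiv="refresh" content="0; url=<index.yaml URL of the chart repo>"> </head>
<p><a href="<index.yaml URL of the chart repo>">Redirect to repo index.yaml</a></p>
</html>
```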
great! thanks a lot!
so I assume clearml moves them from one queue to the other?
Ah sorry, I thought you were asking what the names of the queues I created were (in case I used some weird character or stuff like that)
In the ClearML UI it stays in a Pending state
Hi Alon, thanks, I actually watched those videos. But they don't help with settings things up 🙂
From your explanation, I understand that Agents are indeed needed for ClearML to work.
but I don't understand the comment on GPUs, as the documentation makes a lot of references to GPU configurations for agents
but I set up only the apiserver, fileserver and webserver hosts, and the access keys... the rest is what is produced by clearml-init
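For context, the `api` section of my clearml.conf looks roughly like this (hosts and keys are placeholders; the structure is the one clearml-init generates):
```
api {
    web_server: http://clearml.example.com:8080
    api_server: http://clearml.example.com:8008
    files_server: http://clearml.example.com:8081
    credentials {
        "access_key" = "<ACCESS_KEY>"
        "secret_key" = "<SECRET_KEY>"
    }
}
```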
AgitatedDove14 I used the default configuration from the helm chart for the k8s glue.
The way I understand it is that the K8s glue agent is enabled by default (and I do see a Deployment for clearml-k8sagent)
especially for datasets (for the models and other files we were thinking of using the fileserver anyway)
But as Gaspard was saying, with the default settings there is no agent listening to the "k8s_scheduler" queue
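To make the point concrete: for a plain (non-glue) agent, something has to be started explicitly against that queue, e.g. (a sketch; the queue name is just the one discussed here):
```bash
clearml-agent daemon --queue k8s_scheduler --detached
```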
Thanks Martin! If I end up having some time I'll dig into the code and check if I can bake something!
because while I can run kubectl commands from within the agent pod, clearml doesn't seem to pick up the right value:
```
2022-08-05 12:09:47
task 29f1645fbe1a4bb29898b1e71a8b1489 pulled from 51f5309bfb1940acb514d64931ffddb9 by worker k8s-agent-cpu
2022-08-05 12:12:59
Running kubectl encountered an error: Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2022-08-05 15:15:07
task 29f1645fbe1a4bb29898b1e71a8b1489...
```
I can see the outputs from argo, so I know if some resource has been created but I can't inspect the full logs,
the ones I have available are all records similar to `No tasks in queue 80247f703053470fa60718b4dff7a576`