it uses the default of epoch
okay makes sense now, thanks!
SuccessfulKoala55 Figured it out, I needed to use 4.2.0
Good question. The repo I'm using requires an NVIDIA GPU, and I can't get the code to run locally
Per my question above, what is the base path where the git repo is cloned?
hmm, how would I add that to PYTHONPATH? Can that be done in the SETUP SHELL SCRIPT window?
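For reference, one alternative that does not depend on the shell environment is to extend the import path from inside the entry script itself. This is only a minimal sketch: `add_to_python_path` and the `my_repo` directory are hypothetical names, and the real clone location depends on how the agent is set up, which this thread does not settle.

```python
import os
import sys


def add_to_python_path(directory: str) -> str:
    """Prepend `directory` to sys.path if it is not already there.

    Returns the absolute path that was added (or was already present).
    """
    path = os.path.abspath(directory)
    if path not in sys.path:
        # Index 0 gives this directory priority over installed packages.
        sys.path.insert(0, path)
    return path


# Hypothetical location: replace with wherever the repo is actually cloned.
repo_root = add_to_python_path(os.path.join(os.getcwd(), "my_repo"))
```

Unlike exporting PYTHONPATH, this only affects the current Python process, so it has to run before the imports that need it.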
I'm not familiar enough with Helm to clone this, fix it, and then test it
Just curious, if https://github.com/allegroai/clearml-helm-charts/blob/19a6785a03b780c2d22da1e79bcd69ac9ffcd839/charts/clearml-agent/values.yaml#L50 is a value I can set, where is it used? It would be great if it overrode the embedded URL that Dataset.get parses from my clearml.conf file
maybe a cors issue?
curl --insecure -sw %{http_code} -o /dev/null
init-k8s-glue waiting for apiserver ...
those are the init container logs from the k8s glue agent
I think the issue is that pod-to-pod communication can't resolve my Route 53 DNS records
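A quick way to test that suspicion from inside a pod is to ask the resolver directly. The sketch below uses only the standard library; `can_resolve` is a hypothetical helper, and `files.example.invalid` is a placeholder hostname (the `.invalid` TLD is reserved and never resolves), so swap in the real API and file server names.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver can map `hostname` to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when name resolution fails (unknown host, no DNS server, ...).
        return False


# Placeholder hostnames: replace the second one with the record under test.
for host in ("localhost", "files.example.invalid"):
    status = "resolves" if can_resolve(host) else "does NOT resolve"
    print(f"{host}: {status}")
```

If the public name fails here while the in-cluster service name resolves, the problem is DNS rather than the API itself.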
perhaps I need to use localhost
thank you for the help!
Then it tries to curl the files API and gets a 405
the API url works fine, returns 200
perhaps the 405 is from nginx
I don't see any requests
I used the values from the dashboard/configuration/api keys
I verified in the pod yaml it is set correctly
Yep I updated those as well
Okay, so I just tried this and immediately I'm getting "Failed to establish a new connection" errors, because the file server URL in my clearml.conf is the k8s DNS name. So I'm sort of stuck: if I revert it to the public DNS name, then on Dataset.get I will get the same failure.
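The override being asked for amounts to a prefix rewrite on stored URLs: files registered under one host get fetched through another. The sketch below only illustrates that idea in plain Python. `remap_url`, `PREFIX_MAP`, and both hostnames are hypothetical, and nothing here is a ClearML API; whether ClearML offers an equivalent setting is not established in this thread.

```python
from typing import Mapping


def remap_url(url: str, prefix_map: Mapping[str, str]) -> str:
    """Swap the first matching prefix of `url`; return it unchanged otherwise."""
    for old_prefix, new_prefix in prefix_map.items():
        if url.startswith(old_prefix):
            return new_prefix + url[len(old_prefix):]
    return url


# Hypothetical mapping: public DNS name -> in-cluster service name.
PREFIX_MAP = {
    "https://files.clearml.example.com": "http://clearml-fileserver:8081",
}

print(remap_url("https://files.clearml.example.com/proj/data.zip", PREFIX_MAP))
```

URLs that match no prefix (for example, an `s3://` path) pass through untouched, so a mapping like this would only affect files stored on the file server.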
Also, how do I give the k8s glue agent permission to spin EC2 nodes up and down?
I think if I use the local service URL this problem is fixed
Gotcha, and the agent's default runtime mode is docker, correct? So I could install all my system dependencies in my own docker image?