SuccessfulKoala55 It looks like it should eval to True?
Hey, triggering the tasks from the CLI resolved the Python pathing issues!
Also, how do I give the k8s glue agent permissions to spin up/down EC2 nodes?
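For context, this is roughly the shape of EC2 access I was imagining; the action list and role name are just my guesses, not something from the docs:
```
# hypothetical inline policy for whatever IAM role the agent pods run under
cat > ec2-scaling-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances",
        "ec2:TerminateInstances",
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}
EOF

# attach it to the role the agent assumes (role/policy names are placeholders)
aws iam put-role-policy \
  --role-name clearml-agent-role \
  --policy-name clearml-ec2-scaling \
  --policy-document file://ec2-scaling-policy.json
```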
I got the EFS volume mounted. Curious what the advantage of using the StorageManager would be.
It will then parse the above information from my local workstation?
okay makes sense now, thanks!
I verified in the pod yaml it is set correctly
the API url works fine, returns 200
Then it tries to curl the files API and gets a 405
I don't see any requests
I can see this log message in the nginx controller: "GET / HTTP/1.1" 405 178 "-" "curl/7.79.1" 95 0.003 [clearml-clearml-fileserver-8081] [] 10.36.1.61:8081 178 0.004 405 b4f5caf7665ffa1e8823a195ae41ec26
perhaps the 405 is from nginx
I just opened a shell with the api and tried to curl my files URL, and the curl just hangs. no response
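To narrow down where the 405 is coming from, my plan is to compare a direct in-cluster request with one that goes through the ingress; a rough sketch (the pod name and public hostname are placeholders, the service name/port is taken from the nginx log line above):
```
# hit the fileserver service directly from inside the cluster
kubectl exec -it <any-running-pod> -- curl -sv http://clearml-clearml-fileserver:8081/ -o /dev/null

# then the same request through the public ingress URL
curl -sv https://files.<my-domain>/ -o /dev/null

# if the direct request answers but the ingress one returns 405,
# the method is being rejected at the nginx layer, not by the fileserver
```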
Yep I updated those as well
the worker is now in the dashboard
thank you for the help!
ok yes, this is the problem
Okay, so basically the DL framework manages the master/worker relationship. I just need to use pod replicas for my k8s agents.
I think the best change would be to respect the value set at https://github.com/allegroai/clearml-helm-charts/blob/19a6785a03b780c2d22da1e79bcd69ac9ffcd839/charts/clearml-agent/values.yaml#L50 so you could change it down the road if infra/hosting changes. Also, in this case I'm uploading the data to the public file server URL, but my k8s pod can't reach that for security reasons.
IMO, the dataset shouldn't be tied to the clearml.conf URLs it was uploaded with, as those URLs could change. It should respect the file server URL the agent has.
Are there any workarounds to this issue? Our team is evaluating this product to potentially buy an enterprise license. If we can't fetch data, this is a problem.
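As a stopgap for new runs, I'm thinking of pointing the agent pods at the in-cluster fileserver via environment overrides; a rough sketch, with service names and ports that are just my assumptions for our helm release:
```
# override the server URLs the SDK/agent resolves at runtime,
# instead of the public ones baked into clearml.conf
export CLEARML_API_HOST=http://clearml-clearml-apiserver:8008
export CLEARML_FILES_HOST=http://clearml-clearml-fileserver:8081
export CLEARML_WEB_HOST=http://clearml-clearml-webserver:8080
```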
AgitatedDove14 Will I need sudo permissions if I add this script to extra_docker_shell_script:
echo "192.241.xx.xx venus.example.com venus" >> /etc/hosts
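If that script doesn't run as root, I assume the plain `>>` won't be enough (the redirect happens in the calling shell, not under sudo), so I'd probably need something like:
```
# append via sudo tee so the write itself is elevated
echo "192.241.xx.xx venus.example.com venus" | sudo tee -a /etc/hosts
```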
Hmm, how would I add that to PYTHONPATH? Can that be done in the SETUP SHELL SCRIPT window?
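Something like this is what I'd try in that window (the repo path is a placeholder for wherever my code lands in the task container):
```
# prepend the code checkout to PYTHONPATH so imports resolve inside the task container
export PYTHONPATH="/opt/my_repo:${PYTHONPATH}"
```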
I suppose a short-term hack would be to just edit the /etc/hosts file and redirect the public URL to the k8s DNS URL?
When I exec into the pod, it says I need sudo, but I'm wondering if extra_docker_shell_script is executed as sudo already?
Is this a config file on your side, or something I could change if we had the enterprise version?
As they are singular, not plural
perhaps I need to use localhost
For example, in my agent helm yaml, I have
```
queue: default
podTemplate:
  nodeSelector:
    purpose: gpu-nvidia-t4-c8-m32-g1-od
```