However, the subprocess calls are somewhat important to our code base, thus the problem
Yea I did similar. I think the crux of the issue is the subprocess calls I removed.
Okay, makes sense. So there is no copying of the data to the pod, it is simply referenced via the EFS?
yea let me unwind some changes so I can pinpoint the issue
Can you fix this, or should I open a PR? I'm blocked by this.
Okay, so I just tried this and I'm immediately getting "Failed to establish a new connection" errors, because the file server URL in my clearml.conf is the k8 DNS name. So I'm sort of stuck: if I revert it to the public DNS name, then Dataset.get will hit the same failure.
I suppose a short-term hack would be to just edit the /etc/hosts file and redirect the public URL to the k8 DNS URL?
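For what it's worth, here is a minimal sketch of what that hack would look like. One caveat: /etc/hosts maps hostnames to IP addresses, not hostnames to other hostnames, so the entry has to point the public name at the IP of the in-cluster service. The hostname and IP below are made up for illustration, and `hosts_entry` is a hypothetical helper, not part of ClearML.

```python
import ipaddress


def hosts_entry(public_host: str, internal_ip: str) -> str:
    """Build one /etc/hosts line that points public_host at internal_ip."""
    # Raises ValueError if internal_ip is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(internal_ip)
    return f"{internal_ip}\t{public_host}"


# Hypothetical values: substitute the real public file server name and the
# IP of the file server service inside the cluster.
print(hosts_entry("files.clearml.example.com", "10.0.12.34"))
```

The printed line would then be appended to /etc/hosts inside the pod. Since a service IP can change, this really is only a short-term workaround.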
I don't know how to get past this? My k8 pods shouldn't need to reach out to the public file server URL.
If I do both, everything works, except then I lose ClearML tracking (scalars, outputs, etc.)
"additionalInfo": { "inBytes": "438", "localPort": "9134", "outBytes": "401", "unusual": "80", "value": "{\"inBytes\":\"438\",\"localPort\":\"9134\",\"outBytes\":\"401\",\"unusual\":\"80\"}", "type": "default" },
` "ipAddressV4": "165.160.15.20",
"organization": {
  "asn": "19574",
  "asnOrg": "CSC",
  "isp": "Corporation Service Company",
  "org": "Corporation Service Company"
},
"country": {
  "countryName": "United States"
},
"city": {
...
yea, does the enterprise version have more functionality like this?
Yes! Thanks so much for the quick turnaround
For example, in my agent helm yaml, I have
` queue: default
podTemplate:
  nodeSelector:
    purpose: gpu-nvidia-t4-c8-m32-g1-od `
Sure. My git repo myProject.git does not have file.json checked into VCS. I'd like to add this file at experiment runtime or equivalent.
It seems like https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearml-agent/values.yaml#L72-L80 doesn't actually do anything as the values set here aren't applied in the agent template
Just curious, if https://github.com/allegroai/clearml-helm-charts/blob/19a6785a03b780c2d22da1e79bcd69ac9ffcd839/charts/clearml-agent/values.yaml#L50 is a value I can set, where is it used? It would be great if it overrode the Dataset.get embedded URL parsed from my clearml.conf file
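To make the ask concrete, this is roughly the rewrite I have in mind: swap the public base URL for the in-cluster one and leave the rest of the path alone. It is only a sketch using the Python standard library. `swap_base` is a hypothetical helper, not a ClearML function, and the URLs are invented.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_base(url: str, public_base: str, internal_base: str) -> str:
    """Return url with the public scheme/host replaced by the internal one.

    URLs that do not start at public_base are returned unchanged.
    """
    u = urlsplit(url)
    pub = urlsplit(public_base)
    internal = urlsplit(internal_base)
    if (u.scheme, u.netloc) != (pub.scheme, pub.netloc):
        return url
    # Keep path, query and fragment; replace only scheme and host:port.
    return urlunsplit(u._replace(scheme=internal.scheme, netloc=internal.netloc))


# Hypothetical example URLs
print(swap_base(
    "https://files.example.com/proj/data.zip",
    "https://files.example.com",
    "http://fileserver.clearml.svc:8081",
))
```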
Okay, seems like there are ways to do it, just need to be a bit clever
Would I copy and paste this block to produce another queue and k8 glue agent?
yes, I see in the UI how to create a new queue. How do I associate that queue with a nodeSelector though?
err maybe not, I don't know where it's being fetched