Hey, how can we keep the PipelineController pod from using the GPU? According to the documentation, the
@CooperativeKitten94 Running the following conf:
queue:
  services-tasks:
    templateOverrides:
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
  services:
    templateOverrides:
      resources:
        requests:
          nvidia.com/gpu: "0"
        limits:
          nvidia.com/gpu: "0"
apiServerUrlReference: "
"
fileServerUrlReference: "
"
webServerUrlReference: "
"
basePodTemplate:
  resources:
    requests:
      nvidia.com/gpu: "2"
    limits:
      nvidia.com/gpu: "2"
causes the agent pod to enter a crash loop (CrashLoopBackOff):
python3 k8s_glue_example.py --queue 'map[services:map[templateOverrides:map[resources:map[limits:map[nvidia.com/gpu:0]' 'requests:map[nvidia.com/gpu:0]]]]' 'services-tasks:map[templateOverrides:map[resources:map[limits:map[nvidia.com/gpu:1]' 'requests:map[nvidia.com/gpu:1]]]]]' --max-pods 2 --namespace clearml --template-yaml /root/template/template.yaml
/usr/local/lib/python3.6/dist-packages/jwt/utils.py:7: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
usage: k8s_glue_example.py [-h] [--queue QUEUE] [--ports-mode]
                           [--num-of-services NUM_OF_SERVICES]
                           [--base-port BASE_PORT]
                           [--base-pod-num BASE_POD_NUM]
                           [--gateway-address GATEWAY_ADDRESS]
                           [--pod-clearml-conf POD_CLEARML_CONF]
                           [--overrides-yaml OVERRIDES_YAML]
                           [--template-yaml TEMPLATE_YAML]
                           [--ssh-server-port SSH_SERVER_PORT]
                           [--namespace NAMESPACE] [--max-pods MAX_PODS]
                           [--use-owner-token] [--standalone-mode]
                           [--child-report-tags CHILD_REPORT_TAGS [CHILD_REPORT_TAGS ...]]
k8s_glue_example.py: error: unrecognized arguments: requests:map[nvidia.com/gpu:0]]]] services-tasks:map[templateOverrides:map[resources:map[limits:map[nvidia.com/gpu:1] requests:map[nvidia.com/gpu:1]]]]]
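
From the log, it looks like `--queue` is being handed the whole rendered map (`map[...]`) instead of a queue name, and the rendered text arrives as several separate arguments. Below is a minimal Python sketch that reproduces the same kind of failure. The three-option parser is a stand-in I wrote for illustration, not the real `k8s_glue_example.py` parser, and the command string is a shortened copy of the one in the log.

```python
import argparse
import shlex

# Arguments copied from the crash log above, shortened to the ones that matter.
command = (
    "--queue 'map[services:map[templateOverrides:map[resources:map[limits:map[nvidia.com/gpu:0]' "
    "'requests:map[nvidia.com/gpu:0]]]]' "
    "'services-tasks:map[templateOverrides:map[resources:map[limits:map[nvidia.com/gpu:1]' "
    "'requests:map[nvidia.com/gpu:1]]]]]' "
    "--max-pods 2 --namespace clearml"
)

# Hypothetical stand-in parser: only the options needed for the demonstration.
parser = argparse.ArgumentParser(prog="k8s_glue_example.py")
parser.add_argument("--queue", type=str)
parser.add_argument("--max-pods", type=int)
parser.add_argument("--namespace", type=str)

# parse_known_args collects what parse_args would reject as "unrecognized arguments".
args, leftover = parser.parse_known_args(shlex.split(command))

print("queue value:", args.queue)
print("unrecognized:", leftover)
```

`--queue` consumes only the first quoted fragment, and the remaining three fragments are left over, which matches the "unrecognized arguments" line in the log. If that reading is right, the map under `queue:` is being turned into a string on the command line, so that key probably expects a plain queue name. Where the per-queue `templateOverrides` should go instead is something I have not confirmed.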