I ran into this issue too. For k8s_glue_example.py I added the argument --overrides-yaml=<yaml file>, where the file contains:
spec:
  containers:
  - resources:
      limits:
        nvidia.com/gpu: 1
With that, the pod gets a GPU allocated.
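For anyone who wants to generate that overrides file instead of hand-editing it, here is a minimal sketch. The helper names (`build_gpu_overrides`, `write_overrides`) and the file name `gpu_overrides.yaml` are made up for illustration; only the YAML structure and the --overrides-yaml argument come from the message above. It builds the text by hand so it needs nothing beyond the standard library.

```python
from pathlib import Path


def build_gpu_overrides(gpu_count: int = 1) -> str:
    """Return pod-spec overrides text requesting `gpu_count` NVIDIA GPUs.

    Hypothetical helper: it mirrors the YAML shown above, nothing more.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return (
        "spec:\n"
        "  containers:\n"
        "  - resources:\n"
        "      limits:\n"
        f"        nvidia.com/gpu: {gpu_count}\n"
    )


def write_overrides(path: str, gpu_count: int = 1) -> Path:
    """Write the overrides text to `path` and return the path."""
    target = Path(path)
    target.write_text(build_gpu_overrides(gpu_count))
    return target


if __name__ == "__main__":
    # The file name is an assumption; any path works.
    out = write_overrides("gpu_overrides.yaml")
    # Print the command line to use, without running anything.
    print(f"python k8s_glue_example.py --overrides-yaml={out}")
```

The indentation matters here: `limits` has to sit under `resources`, and `nvidia.com/gpu` under `limits`, otherwise the limit is not picked up as part of the container spec.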
I had this issue too. I realised that the k8s glue wasn't using the GPU resource, unlike when running it as clearml-agent. TimelyPenguin76 suggested using the latest CUDA 11.0 images, but that didn't work either.
Hi, it's a preference from my developers. They prefer to install the Python libraries into the images and push those to the registry. In other words, they prefer to have libraries installed at image build time.
SubstantialElk6 I am having a bit of a monday morning (on a wednesday, not good)
Since Python is running inside docker/cri-o/containerd in k8s anyway, what would you gain from using the globally installed Python libraries? Any libs would have to be installed at container time anyway, so.. urm. yeah.
feel free to treat me like an idiot and use small words to explain, I honestly don't mind 🙂 I could be missing something in your use case (more than likely)
I would assume, from the sounds of it, that you are using the Dockerfile to pip install Python libs, in which case a pip install clearml can also be done at image creation time. I don't know what other methods you would be using to install Python deps.. easy_install?!?
clearml-agent does have a build flag, --install-globally. That may get you where you want to go.
I think the default behaviour of the clearml-agent k8s glue when running a task is to create a virtual env and install the dependencies. So I'm just checking how to change that behaviour to use the global packages instead.
Any comments on using the global python libraries without the need to 'pip install' anything?
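While that question is open, one way to see which behaviour a task actually got is to check from inside the running task whether the interpreter is in a virtual env and where a given library was imported from. This is only a diagnostic sketch using the Python standard library; it does not change how clearml-agent sets up the environment, and the function names and the package list are placeholders.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when running inside a venv (or a recent virtualenv).

    In those environments sys.prefix points at the env, while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def package_location(name: str):
    """Return the file a top-level package would load from, or None if missing."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return None if spec is None else spec.origin


def describe_environment(packages):
    """Collect interpreter path, venv status and package locations."""
    return {
        "executable": sys.executable,
        "in_virtualenv": in_virtualenv(),
        "packages": {name: package_location(name) for name in packages},
    }


if __name__ == "__main__":
    # Placeholder list: replace with the libraries baked into your image.
    report = describe_environment(["numpy", "torch"])
    print("python:", report["executable"])
    print("virtual env:", report["in_virtualenv"])
    for name, origin in report["packages"].items():
        print(f"{name}: {origin or 'not found'}")
```

If the reported paths sit under the image's global site-packages and `in_virtualenv` is False, the task is using the libraries installed at image build time; paths under a freshly created env mean the agent installed them again.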