The one I posted 🙂
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
I'm checking if I can find a way to circumvent this 🙂
btw: I am pretty sure this used to work, but then stopped working some time ago.
Unfortunately, I do not know that. Must be before October 2021 at least. I know I asked here how to use the preinstalled version and AgitatedDove14 helped me get it to work. But I cannot find the old thread 😕
ReassuredTiger98
How can I make clearml-agent use pre-installed version from the nvidia/pytorch
If the same version is required, the agent will not try to reinstall it (the new venv the agent creates inside the container inherits from the preinstalled system packages)
Comes with PyTorch Version 1.12 based on a commit. I tried torch >= 1.11, torch == 1.12
If in your installed packages you have torch==1.12, the agent should not try to reinstall the package.
I am going to try it again and send you the relevant part of the logs in a minute. Maybe I am interpreting something wrong.
` =============
== PyTorch ==
NVIDIA Release 22.03 (build 33569136)
PyTorch Version 1.12.0a0+2c916ef ...
Looking in indexes: ,
Requirement already satisfied: pip in /root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages (22.0.4)
2022-04-07 16:40:57
Looking in indexes: ,
Requirement already satisfied: Cython in /opt/conda/lib/python3.8/site-packages (0.29.28)
Looking in indexes: ,
Requirement already satisfied: numpy==1.22.3 in /opt/conda/lib/python3.8/site-packages (1.22.3)
Looking in indexes: ,
Collecting setuptools==59.5.0
Downloading setuptools-59.5.0-py3-none-any.whl (952 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 952.4/952.4 KB 21.5 MB/s eta 0:00:00
Installing collected packages: setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 61.0.0
Uninstalling setuptools-61.0.0:
Successfully uninstalled setuptools-61.0.0
2022-04-07 16:41:02
Successfully installed setuptools-59.5.0
Torch CUDA 115 download page found
Torch nightly CUDA 115 download page found
Warning, could not locate PyTorch torch==1.12 matching CUDA version 115, best candidate 1.12.0.dev20220407
2022-04-07 16:41:07
Torch CUDA 113 download page found
Trying PyTorch CUDA version 113 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 113, best candidate 1.12.0.dev20220407
Torch CUDA 111 download page found
Trying PyTorch CUDA version 111 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 111, best candidate 1.10.0
Trying PyTorch CUDA version 110 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 110, best candidate 1.10.0
2022-04-07 16:41:12
Torch CUDA 102 download page found
Trying PyTorch CUDA version 102 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 102, best candidate 1.7.0
Trying PyTorch CUDA version 101 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 101, best candidate 1.10.0
Trying PyTorch CUDA version 100 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 100, best candidate 1.4.0
2022-04-07 16:41:17
Torch CUDA 92 download page found
Trying PyTorch CUDA version 92 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 92, best candidate 1.4.0
2022-04-07 16:41:22
Torch CUDA 91 download page found
Trying PyTorch CUDA version 91 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 91, best candidate 1.4.0
Trying PyTorch CUDA version 90 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 90, best candidate None
2022-04-07 16:41:27
Torch CUDA 80 download page found
Trying PyTorch CUDA version 80 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 80, best candidate None
Torch CUDA 75 download page found
Trying PyTorch CUDA version 75 support
Warning, could not locate PyTorch torch==1.12 matching CUDA version 75, best candidate None
2022-04-07 16:41:37
Error! Could not locate PyTorch version torch==1.12 matching CUDA version 75
clearml_agent: Warning: could not resolve python wheel replacement for torch==1.12
clearml_agent: ERROR: Could not install task requirements!
Exception when trying to resolve python wheel: Could not find pytorch wheel URL for: torch==1.12 with cuda 116 support
2022-04-07 16:41:37
Process failed, exit code 1 `
Maybe the difference is that I am using pip now and I used to use conda! The NVIDIA PyTorch container uses conda. Could that be a reason?
Nvm, that does not seem to be a problem. I added a part to the logs in the post above. It shows that some packages are found from conda.
hm ReassuredTiger98 can you send the full log? I think it should have worked (but as you mentioned it might be conda/pip mix?!)
I just manually went into the docker container, ran python -m venv env --system-site-packages and activated the virtual env. When I then run pip list, it correctly shows the preinstalled packages including torch 1.12.0a0+2c916ef
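For anyone who wants to repeat that manual check from Python instead of the shell, here is a minimal sketch using the standard-library venv module. It is not what clearml-agent runs internally; the helper name create_inheriting_venv and the temporary directory are made up for illustration.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_inheriting_venv(target: Path) -> dict:
    """Create a venv that can see system site-packages; return its pyvenv.cfg as a dict.

    Hypothetical helper for illustration only, not part of clearml-agent.
    """
    builder = venv.EnvBuilder(
        system_site_packages=True,   # equivalent of --system-site-packages
        with_pip=False,              # skip ensurepip; we only inspect the config
        symlinks=(os.name != "nt"),  # mirror the CLI default on POSIX
    )
    builder.create(target)

    # pyvenv.cfg is a flat list of "key = value" lines
    cfg = {}
    for line in (target / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_inheriting_venv(Path(tmp) / "env")
    # "true" means packages from the base interpreter stay importable in the venv
    print(cfg["include-system-site-packages"])
```

If the flag is set, anything preinstalled in the container's base interpreter stays visible inside the new env, which matches what pip list showed.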
ReassuredTiger98 yes this is odd:
also: Warning, could not locate PyTorch torch==1.12 matching CUDA version 115, best candidate 1.12.0.dev20220407
Seems like it found a matching version and did not use it...
Let me check that
ReassuredTiger98 quick update, the issue was located, next RC will already contain a fix.
In the meantime, you can avoid it by limiting the pip version:
https://github.com/allegroai/clearml-agent/blob/715f102f6d98a44131d5bee909ee779b456c6229/docs/clearml.conf#L67
pip_version: "<20.2"
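To see what that bound means for the pip version in the log above (22.0.4 in the agent's venv), here is a rough sketch of a "<X.Y" check. The helpers release_tuple and is_below are hypothetical and deliberately simplified: they only compare the numeric release part and ignore pre-release and local version tags, so they are not how pip or the agent evaluates the constraint.

```python
import re


def release_tuple(version: str) -> tuple:
    """Leading numeric release segment, e.g. '22.0.4' -> (22, 0, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"no numeric release in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_below(version: str, upper_bound: str) -> bool:
    """True if the release of `version` is strictly below `upper_bound`."""
    a, b = release_tuple(version), release_tuple(upper_bound)
    width = max(len(a), len(b))
    # pad with zeros so that (20, 2) and (20, 2, 0) compare as equal
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


print(is_below("22.0.4", "20.2"))  # False: the pip from the log is outside the bound
print(is_below("20.1.1", "20.2"))  # True: this one would satisfy "<20.2"
```

So with the setting applied, the agent should end up with an older pip than the 22.0.4 shown in the log.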
Thank you very much for the fast work!
One last question: Is it possible to set the pip_version task-dependent?
No... but why would it matter on a Task basis? (Meaning, what would be a use case for changing the pip version per Task?)
I think sometimes there can be dependencies that require a newer pip version or something like that. I am not sure though. Why can we even change the pip version in the clearml.conf?
Why can we even change the pip version in the clearml.conf?
LOL mistakes learned the hard way 🙂
Basically, too many times in the past pip versions were a bit broken, which is fine if they are used manually and users can reinstall a different version, but horrible when you have an automated process like the agent, so we added a "freeze version" option, only with greater control. Make sense?
Yea, but doesn't this feature make sense on a task level? If I remember correctly, some dependencies will sometimes require different pip versions. And dependencies are on a per-task basis.
some dependencies will sometimes require different pip versions.
none 🙂 maybe setuptools, but not pip version (pip is just a utility to install packages, it will not be a dependency of one)
That I understand. But I think (old) pip versions will sometimes not resolve a package. Probably not the case the other way around.
Probably not the case the other way around.
Actually it's the other way around: new pip versions use a new package dependency resolver that can conclude that a previous package setup is not supported (because of version conflicts) even though it worked...
It is tricky: pip is trying to get better at resolving package dependencies, but that means old resolutions might not work, which would mean old environments cannot be restored (a "broken" env). This is the main reason not to move to pip v21+ ...
Oh, interesting!
So pip version on per task basis makes sense ;D?
LOL, if this is important we probably could add some support (meaning you will be able to specify it in the "installed packages" section, per Task).
If you find an actual scenario where it is needed, I'll make sure we support it 🙂