i am having this same issue: installing pytorch via pip. i am not specifying a version, and the agent is not able to install pytorch.
even if i specify a version (e.g. torch<2.0), it fails.
i guess this is a pip problem, is there a known pip version that works correctly?
seems like pip 20.1.1 has the issue, but versions >= 22.2.2 do not.
Notice we changed the value there; it now has two versions, one for Python < 3.10 and one for Python >= 3.10
The main reason is that pip changed their resolving algorithm, and the new one can break its own dependencies (i.e. pip freeze > requirements.txt -> pip install might not actually work)
but there was a `pip_version: "<20.2"` line in my `clearml.conf`, which was possibly a default in the config file from about 2 years ago
i tried lots of things, but values in the conf file (specifically the pip and cuda versions) overriding things in my code/env confused me for a long time
I installed as instructed on pytorch.org: pip3 install --pre torch torchvision torchaudio --index-url
I mean that locally I was able to install the correct version without a problem.
Hi @<1523701868901961728:profile|ReassuredTiger98> , what do you mean by "working fine locally"?
@<1523701868901961728:profile|ReassuredTiger98> how did you install the nightly locally ?
Can you also provide the full log?
the issue also may have been fixed somewhere between 20.1 and 22.2, i didn’t test versions in between those two
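A small sketch of how the versions reported in this thread could be checked in code. It only encodes what was observed here (20.1.1 fails, 22.2.2 and later work, nothing in between was tested). The helper names `parse_release` and `classify_pip` are made up for illustration and are not part of pip or ClearML.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric parts of a version, e.g. '22.2.2' -> (22, 2, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def classify_pip(version: str) -> str:
    """Classify a pip version using only the observations from this thread.

    Hypothetical helper: 'affected' and 'ok' reflect what was reported here,
    everything else is 'untested'.
    """
    release = parse_release(version)
    # Pad short versions so that '22.2' compares like (22, 2, 0).
    release = release + (0,) * (3 - len(release))
    if release[:3] == (20, 1, 1):
        return "affected"
    if release >= (22, 2, 2):
        return "ok"
    return "untested"


if __name__ == "__main__":
    for candidate in ("20.1.1", "21.0", "22.2.2", "23.1.2"):
        print(candidate, "->", classify_pip(candidate))
```

To check the pip that actually ended up in an environment, the version string from `pip --version` (or `importlib.metadata.version("pip")`) can be passed to `classify_pip`.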
i noticed that the agent was downgrading to pip=20.1.1 at every attempt, so i added
Task.add_requirements("pip", "23.1.2")
and even then, it downgrades to 20.1.1?
Do you want to open an issue in pip?
Funnily enough, this works:
pip3 install "torch >=2.1.0.*, <2.1.1.*" --extra-index-url
ah, my mistake, that’s an issue in my conf file.
So this is very odd, it looks like a pip bug:
The agent is trying to install torch==2.1.0.* because by default it ignores the 4th and later version parts (they are unstable and torch has a tendency to remove them), and for some reason pip will not match 2.1.0.* with, for example, "2.1.0.dev20230306+cu118"
but based on the docs it should work:
see here: None
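To make the behaviour described above concrete, here is a rough sketch of the truncation and of the prefix match one would expect. This is an illustration only: `to_prefix_spec` and `naive_prefix_match` are hypothetical names, the code is not the agent's implementation, and the match is a simplified reading of prefix matching, not pip's resolver logic.

```python
def to_prefix_spec(version: str, keep: int = 3) -> str:
    """Drop the 4th and later parts of a version and replace them with '.*'.

    Example: '2.1.0.dev20230429+cu118' -> '2.1.0.*'
    Versions with `keep` parts or fewer are returned unchanged.
    """
    public = version.split("+", 1)[0]  # ignore a local tag such as '+cu118'
    parts = public.split(".")
    if len(parts) <= keep:
        return version
    return ".".join(parts[:keep]) + ".*"


def naive_prefix_match(spec: str, candidate: str) -> bool:
    """Simplified check: does `candidate` start with the parts before '.*'?"""
    if not spec.endswith(".*"):
        return spec == candidate
    prefix = spec[:-2].split(".")
    public = candidate.split("+", 1)[0]
    return public.split(".")[: len(prefix)] == prefix


if __name__ == "__main__":
    spec = to_prefix_spec("2.1.0.dev20230429+cu118")
    print(spec)
    print(naive_prefix_match(spec, "2.1.0.dev20230306+cu118"))
```

Under this simplified reading the dev build does fall under the `2.1.0.*` prefix, which is why the failure to match looks surprising.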
As a workaround you can always edit the requirement and change it to the final URL. For example, instead of:
torch == 2.1.0.dev20230429+cu118
you should have just the link (notice there is no torch package name, but make sure the Python version / architecture is correct)
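If the requirements are kept as a list of lines, the swap described in this workaround could be scripted roughly like this. It is a sketch under assumptions: `pin_to_url` is a hypothetical helper, and the wheel URL below is a placeholder, not the real link (which has to match your Python version and architecture).

```python
import re


def pin_to_url(requirements, package, url):
    """Replace every requirement line for `package` with a bare direct URL.

    Lines for other packages (including ones that merely start with the same
    letters, e.g. 'torchvision' vs 'torch') are left untouched.
    """
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(?:[=<>!~\[;@ ]|$)", re.IGNORECASE
    )
    return [url if pattern.match(line) else line for line in requirements]


if __name__ == "__main__":
    # Placeholder URL for illustration only; use the actual wheel link.
    wheel_url = "https://example.invalid/path/to/torch-nightly.whl"
    lines = ["numpy", "torch == 2.1.0.dev20230429+cu118", "torchvision"]
    for line in pin_to_url(lines, "torch", wheel_url):
        print(line)
```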
Thanks for researching this issue. If you have time, you can create the issue since you are way more knowledgeable, but I can also open it if you do not have time 🙂