Hi, I Shifted My Clearml Setup To An On-Premise Disconnected Env, Which Has A Pip Repo Setup. I Noted This Warning,

Hi, I shifted my ClearML setup to an on-premise disconnected environment, which has a pip repo set up. I noted this warning:
Trying pip install: /root/.clearml/venvs-builds/3.6/task_repository/mnist-pytorch/requirements.txt
/usr/local/lib/python3.6/dist-packages/clearml_agent/external/requirements_parser/parser.py:44: UserWarning: Private repos not supported. Skipping.
warnings.warn('Private repos not supported. Skipping.')
Looking in indexes:
Looking in links:
and subsequently this error:
Package(s) not found: torch
clearml_agent: Warning: could not resolve python wheel replacement for torch==1.6.0
clearml_agent: ERROR: Exception when trying to resolve python wheel: Could not find pytorch wheel URL for: torch==1.6.0 with cuda 102 support
Here's the full log. What could be the issue?

Ohh SubstantialElk6, please use agent RC3 (the latest RC is somewhat broken, sorry; we will pull it).

SubstantialElk6 could you try with the latest (just released)?
pip install clearml-agent==0.17.2
Then, if possible, could you attach the full log of the agent's execution (Task->results->Console)?

SubstantialElk6 could you post the "Installed packages" section under Execution of this specific Task?

AlertBlackbird30 , Actually the log says 10.2.
docker_cmd = nvidia/cuda:10.2-devel-ubuntu18.04 -e GIT_SSL_NO_VERIFY=true

I agree with Martin.B, it appears to be a CUDA mismatch. The version of torch is trying to use CUDA 10.2, but you have
agent.default_docker.image = nvidia/cuda:10.1-runtime-ubuntu18.04
which should probably be
agent.default_docker.image = nvidia/cuda:10.2-runtime-ubuntu18.04
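A quick way to spot this kind of mismatch is to compare the CUDA version embedded in the two image names. The snippet below is only a minimal sketch: `cuda_tag` is a hypothetical helper (not part of clearml-agent), and it assumes the image follows the `nvidia/cuda:<major>.<minor>-...` naming seen in this thread.

```python
import re


def cuda_tag(image: str) -> str:
    """Return the CUDA version of an nvidia/cuda image name as digits, e.g. '102'.

    Hypothetical helper for illustration; it only understands names of the
    form 'nvidia/cuda:<major>.<minor>-...'.
    """
    match = re.search(r"nvidia/cuda:(\d+)\.(\d+)", image)
    if match is None:
        raise ValueError(f"no CUDA version found in image name: {image!r}")
    return match.group(1) + match.group(2)


# Values quoted in this thread.
task_image = "nvidia/cuda:10.2-devel-ubuntu18.04 -e GIT_SSL_NO_VERIFY=true"  # docker_cmd
default_image = "nvidia/cuda:10.1-runtime-ubuntu18.04"  # agent.default_docker.image

if cuda_tag(task_image) != cuda_tag(default_image):
    print(f"CUDA mismatch: task uses {cuda_tag(task_image)}, "
          f"agent default is {cuda_tag(default_image)}")
```

The digits returned here use the same form as the "cuda 102" in the agent's error message, which makes it easier to line the two up.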

Hi AgitatedDove14, what version should I change it to? I'm currently on v0.17.2rc3.

Hi AgitatedDove14, I changed everything to CUDA 10.1 and tried again, with the same error. The section is as follows. I made sure torch==1.6.0+cu101 and torchvision==0.8.2+cu101 are in the PyPI repo, but the same error still came up.
` # Python 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0]
boto3 == 1.14.56
clearml == 0.17.4
numpy == 1.19.1
torch == 1.6.0
torchvision == 0.7.0

Detailed import analysis

clearml.storage: 0

pytorch_mnist.py: 14

pytorch_mnist.py: 13

pytorch_mnist.py: 8,9,10,11

IMPORT PACKAGE torchvision
pytorch_mnist.py: 12 `

I can't seem to find the fix to this. Ended up using an image that comes with torch installed.

AgitatedDove14 , would you elaborate on this resolution process?

Hi SubstantialElk6
1. clearml-agent was just updated, it should solve the issue.
2. Notice that the "torch" / "torchvision" packages are resolved by the agent based on the PyTorch compatibility table. Is there a way to reproduce the issue where it fails to resolve the torch version? Could you send a full log?
3. If you want a specific torch version, you can put a direct link to the torch wheel, for example: https://download.pytorch.org/whl/cu102/torch-1.6.0-cp37-cp37m-linux_x86_64.whl
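Note that the example link above is a cp37 (Python 3.7) wheel, while the task in this thread runs on Python 3.6. As a rough sketch of how such a link is put together, the hypothetical helper below (not part of clearml-agent) simply mirrors the file name pattern of that one example URL. Other builds may be named differently, for instance with a local version suffix such as `+cu101`, and in a disconnected environment the host would presumably have to be an internal mirror, so check the result against the index you actually use.

```python
def torch_wheel_url(torch_version: str, cuda_tag: str, python_version: tuple) -> str:
    """Build a direct wheel link following the pattern of the example URL above.

    Hypothetical helper for illustration; it covers Linux x86_64 wheels only.
    """
    major, minor = python_version
    py_tag = f"cp{major}{minor}"
    # CPython wheels up to 3.7 carry an "m" ABI suffix (cp36m, cp37m); 3.8+ do not.
    abi_tag = py_tag + ("m" if (major, minor) <= (3, 7) else "")
    return (
        f"https://download.pytorch.org/whl/{cuda_tag}/"
        f"torch-{torch_version}-{py_tag}-{abi_tag}-linux_x86_64.whl"
    )


# Reproduces the link quoted above (Python 3.7).
print(torch_wheel_url("1.6.0", "cu102", (3, 7)))

# The same pattern for the Python 3.6 environment used in this thread.
print(torch_wheel_url("1.6.0", "cu102", (3, 6)))
```

The resulting line can be pasted into the task's "Installed packages" section or a requirements file in place of the `torch == 1.6.0` entry.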

SubstantialElk6 it seems the auto resolve of pytorch cuda failed,
What do you have in the "installed packages" section?

