Collecting pip<20.2
Using cached pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 20.0.2
Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Can't uninstall 'pip'. No files were found to uninstall.
I am running the agent with `clearml-agent daemon --queue training`.
I am trying `Task.create` like so:
from clearml import Task

task = Task.create(
    script="test_gpu.py",
    packages=["torch"],
)
@<1523701070390366208:profile|CostlyOstrich36> do you have any ideas?
This one seems to be compatible: nvcr.io/nvidia/pytorch:22.04-py3
@<1523701070390366208:profile|CostlyOstrich36> same error now 😞
Environment setup completed successfully
Starting Task Execution:
/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/torch/cuda/__init__.py:128: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 11020). Please update your GPU driver by downloading and installing a new version from the URL:
Alternatively, go to:
to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
False
Traceback (most recent call last):
File "facility_classifier/test_gpu.py", line 8, in <module>
assert torch.cuda.is_available()
AssertionError
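For anyone reading the warning above: the `11020` appears to be the CUDA version the installed driver supports, written in CUDA's usual integer encoding (1000 * major + 10 * minor), so it would correspond to CUDA 11.2. A minimal sketch of that decoding follows; the helper name `decode_cuda_version` is made up for illustration and is not part of PyTorch or ClearML.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split a CUDA-style integer (1000 * major + 10 * minor) into (major, minor)."""
    major, remainder = divmod(encoded, 1000)
    return major, remainder // 10


# The value reported in the warning above.
major, minor = decode_cuda_version(11020)
print(f"Driver supports CUDA up to {major}.{minor}")
```

If the installed PyTorch wheel was built for a newer CUDA release than the one the driver supports, `torch.cuda.is_available()` returns False, which matches the failing assert in `test_gpu.py`.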
pip install --pre torchvision --force-reinstall --index-url
None
This has been resolved now! Thank you for your help @<1523701070390366208:profile|CostlyOstrich36>
Just to make sure, run the code on the machine itself to verify that Python can actually detect the driver.
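A rough sketch of such a check, to be run directly on the agent machine (or inside the container the agent uses). It assumes `nvidia-smi` is on the PATH and that `torch` is installed in the same interpreter; anything missing is reported as None instead of raising. The function name `driver_report` is hypothetical.

```python
import shutil
import subprocess


def driver_report():
    """Collect what the NVIDIA driver and PyTorch each report; missing pieces stay None."""
    report = {
        "nvidia_smi": None,
        "torch_version": None,
        "torch_built_for_cuda": None,
        "cuda_available": None,
    }

    # Ask the driver directly, independent of any Python package.
    smi = shutil.which("nvidia-smi")
    if smi:
        result = subprocess.run(
            [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            report["nvidia_smi"] = result.stdout.strip()

    # Then ask PyTorch what it was built for and whether it can see a GPU.
    try:
        import torch
    except ImportError:
        return report

    report["torch_version"] = torch.__version__
    report["torch_built_for_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in driver_report().items():
        print(f"{key}: {value}")
```

If `nvidia_smi` shows a GPU but `cuda_available` is False, the likely cause is the mismatch from the warning above: the PyTorch build expects a newer CUDA version than the driver provides.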