Answered

Hey guys! I'm having some issues with PyTorch and ClearML. I am starting a new task using Task.create and setting PyTorch as a requirement under packages. For some reason PyTorch with CUDA 12 is being installed, but I need CUDA 11. Do you know how to make it install the CUDA 11 build?

  
  
Posted 8 months ago

Answers 41


Just try it as is first with this Docker image, and verify that the code can access the CUDA driver independently of the agent.
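
One way to run that check outside the agent is a small standalone script, executed directly inside the container. This is only a sketch: the function name check_cuda_access and the report keys are made up for illustration, and it assumes nvidia-smi is on the PATH when the driver is exposed to the container. If torch is not installed, the torch fields simply stay unset.

```python
import importlib.util
import shutil
import subprocess


def check_cuda_access():
    """Collect basic GPU visibility facts without involving the ClearML agent.

    The function name and report keys are hypothetical, chosen for this sketch.
    """
    report = {
        "nvidia_smi_found": False,
        "driver_query_ok": False,
        "torch_installed": False,
        "torch_cuda_available": None,
        "torch_cuda_build": None,
    }

    # Step 1: can the driver be queried at all from this environment?
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        try:
            result = subprocess.run(
                [smi], capture_output=True, text=True, timeout=30
            )
            report["driver_query_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            report["driver_query_ok"] = False

    # Step 2: if torch is installed, can it see a GPU, and which CUDA
    # version was it built against (None for CPU-only builds)?
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
        except ImportError:
            return report
        report["torch_installed"] = True
        report["torch_cuda_available"] = torch.cuda.is_available()
        report["torch_cuda_build"] = torch.version.cuda

    return report


if __name__ == "__main__":
    for key, value in check_cuda_access().items():
        print(f"{key}: {value}")
```

If driver_query_ok is true but torch_cuda_available is false, the problem is more likely the installed torch build than the driver.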

  
  
Posted 8 months ago

Isn't the problem that CUDA 12 is being installed?

  
  
Posted 8 months ago

CostlyOstrich36 I'm now running the agent with --docker, and I'm using Task.create(docker="nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04")

  
  
Posted 8 months ago

If I run nvidia-smi it returns valid output and it says the CUDA version is 11.2

  
  
Posted 8 months ago

But the process is still hanging and does not proceed to actually run the ClearML task

  
  
Posted 8 months ago

Collecting pip<20.2
Using cached pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 20.0.2
Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Can't uninstall 'pip'. No files were found to uninstall.

  
  
Posted 8 months ago

I am trying Task.create like so:

from clearml import Task

# Create the task from a script, requesting torch as a package
task = Task.create(
    script="test_gpu.py",
    packages=["torch"],
)
  
  
Posted 8 months ago

What I don't understand is how to tell ClearML to install this version of PyTorch and torchvision, with cu118
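
For reference, PyTorch publishes CUDA-specific wheels with a local version suffix such as +cu118, served from a per-CUDA index under https://download.pytorch.org/whl. The sketch below only builds the pinned requirement strings and the matching index URL. The helper names are hypothetical, the version numbers in the usage comment are placeholders rather than the versions from this thread, and whether the agent resolves such pins as-is is an assumption to verify.

```python
PYTORCH_WHEEL_INDEX = "https://download.pytorch.org/whl"


def cuda_tag(cuda_version):
    """Turn a CUDA version such as '11.8' into the wheel tag 'cu118'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{int(major)}{int(minor)}"


def pinned_requirements(versions, cuda_version):
    """Build 'name==version+cuXYZ' strings for each package.

    `versions` maps package name to base version (hypothetical helper).
    """
    tag = cuda_tag(cuda_version)
    return [f"{name}=={version}+{tag}" for name, version in versions.items()]


def wheel_index_url(cuda_version):
    """Return the PyTorch wheel index URL for the given CUDA version."""
    return f"{PYTORCH_WHEEL_INDEX}/{cuda_tag(cuda_version)}"


if __name__ == "__main__":
    # Placeholder versions for illustration only; substitute the ones you need.
    packages = pinned_requirements(
        {"torch": "2.0.1", "torchvision": "0.15.2"}, "11.8"
    )
    print(packages)
    print(wheel_index_url("11.8"))
```

The resulting list could then be passed as packages= to Task.create, with the index URL added to the agent's extra index settings (agent.package_manager.extra_index_url in clearml.conf) so pip can find the wheels. Treat both steps as something to test rather than a confirmed recipe.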

  
  
Posted 8 months ago

It means that there is an issue with the drivers. I suggest trying this Docker image: nvcr.io/nvidia/pytorch:23.04-py3

  
  
Posted 8 months ago

I have set agent.package_manager.pip_version="", which resolved that message

  
  
Posted 8 months ago

CUDA is the driver itself. The agent doesn't install CUDA but installs a compatible torch assuming that CUDA is properly installed.

  
  
Posted 8 months ago