Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Answered
Hello! Since Today I Get

Hello!
Since today I get AssertionError: Torch not compiled with CUDA enabled for PyTorch 1.8.
Tasks that I submitted yesterday to the queue are also not working, even though they ran yesterday. PyTorch 1.7 based tasks work fine. Any idea what I could have done wrong?

  
  
Posted 4 years ago
Votes Newest

Answers 161


@<1523701868901961728:profile|ReassuredTiger98> what are you getting with:

nvidia-smi

And here:

ls -la /usr/local/
  
  
Posted 4 years ago

Complete conda log

  
  
Posted 4 years ago

I tried to run the task with detect_with_conda_freeze: false instead of true and got

Executing Conda: /home/tim/miniconda3/condabin/conda install -p /home/tim/.clearml/venvs-builds/3.8 -c defaults -c conda-forge -c pytorch 'pip<20.2' --quiet --json
Pass
Conda: Trying to install requirements:
['pytorch~=1.8.0']
Executing Conda: /home/tim/miniconda3/condabin/conda env update -p /home/tim/.clearml/venvs-builds/3.8 --file /tmp/conda_envh7rq4qmc.yml --quiet --json
Conda error: UnsatisfiableError: The following specifications were found to be incompatible with a past
explicit spec that is not an explicit spec in this operation (cudatoolkit):

  - pytorch~=1.8.0 -> cudatoolkit[version='>=10.1,<10.2|>=10.2,<10.3']

The following specifications were found to be incompatible with each other:



Package cudatoolkit conflicts for:
cudatoolkit=11.0
Conda: Installing requirements: step 2 - using pip:
['clearml==0.17.4', 'tensorboard==2.4.1', 'pytorch~=1.8.0']
Collecting tensorboard==2.4.1
  Using cached tensorboard-2.4.1-py3-none-any.whl (10.6 MB)
ERROR: Could not find a version that satisfies the requirement pytorch~=1.8.0 (from -r /tmp/cached-reqsubuv0zrf.txt (line 3)) (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch~=1.8.0 (from -r /tmp/cached-reqsubuv0zrf.txt (line 3))
Command 'source /home/tim/miniconda3/etc/profile.d/conda.sh && conda activate /home/tim/.clearml/venvs-builds/3.8 && pip install -r /tmp/cached-reqsubuv0zrf.txt' returned non-zero exit status 1.
  
  
Posted 4 years ago

Thank you! 🙂

  
  
Posted 4 years ago

Could you try to do:

CUDA_VERSION="11.1" clearml-agent ...
  
  
Posted 4 years ago

No worries, gnight :)

  
  
Posted 4 years ago

This is the file which installs the GPU version

  
  
Posted 4 years ago

Do you know how I can make sure I do not have CUDA or a broken installation installed?

I don't think this is the case, it is quite specifically installing the CPU version.
BTW: after the agent fails it will not remove the venv, so you can get into it and check, from the log it will be in: /home/tim/.clearml/venvs-builds/3.7

  
  
Posted 4 years ago

It is now looking for conflicts.

  
  
Posted 4 years ago

conda 4.9.2

  
  
Posted 4 years ago

Same error.

  
  
Posted 4 years ago
106K Views
161 Answers
4 years ago
one year ago
Tags