
Hello!
Since today I get AssertionError: Torch not compiled with CUDA enabled for PyTorch 1.8.
Tasks that I submitted yesterday to the queue are also not working, even though they ran yesterday. PyTorch 1.7 based tasks work fine. Any idea what I could have done wrong?

  
  
Posted 3 years ago

Answers 161


Hi @ReassuredTiger98, when you get to it...
please download the wheel, then install it with

pip3 install -U clearml_agent-0.17.3rc0-py3-none-any.whl

Then run the daemon with the additional --debug argument, basically:

clearml-agent --debug daemon --foreground ...

Once the agent is running please send the Task's log from your console 🙂

  
  
Posted 3 years ago

ok, thanks!

  
  
Posted 3 years ago

Where again does clearml place the venv?

Usually ~/.clearml/venvs-builds/<python version>/
With multiple agents on one machine, the additional folders are venvs-builds.1 and so on
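
To see which environments the agent has left behind, something like the following could be used. It is only a sketch: it assumes the default ~/.clearml location mentioned above, and the helper name find_agent_venvs is made up for illustration, not part of clearml-agent.

```python
from pathlib import Path


def find_agent_venvs(base=None):
    """List the per-Python-version folders under venvs-builds, venvs-builds.1, ...

    `base` defaults to ~/.clearml (an assumption taken from the path quoted above).
    """
    base = Path(base) if base is not None else Path.home() / ".clearml"
    if not base.is_dir():
        return []
    found = []
    # venvs-builds belongs to the first agent, venvs-builds.1 to the next, and so on
    for builds_dir in sorted(base.glob("venvs-builds*")):
        if builds_dir.is_dir():
            found.extend(sorted(p for p in builds_dir.iterdir() if p.is_dir()))
    return found


if __name__ == "__main__":
    for venv_dir in find_agent_venvs():
        print(venv_dir)
```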

  
  
Posted 3 years ago

Quick question: Where again does clearml place the venv? I wanna take a look into it after the task has failed

  
  
Posted 3 years ago

Perfect! I have to thank you for helping me, not the other way around!

  
  
Posted 3 years ago

Thanks! Tomorrow is great, I'll put the wheel here 🙂

  
  
Posted 3 years ago

Sure, but I will try it tomorrow then.

  
  
Posted 3 years ago

I can install pytorch just fine locally on the agent, when I do not use clearml(-agent)

My thinking is that the issue might be in the env file we are passing to conda; I can't find any other diff.
BTW:
@ReassuredTiger98 Can I send you a specific wheel with more debug prints to check (basically it will print the conda env YAML it is using)?

  
  
Posted 3 years ago

So only a short update for today: I have not yet started a run with conda 4.7.12.
But one question: conda cannot actually be at fault here, right? I can install pytorch just fine locally on the agent when I do not use clearml(-agent).

  
  
Posted 3 years ago

No worries, gnight :)

  
  
Posted 3 years ago

I will try again tomorrow. It's getting late! Thank you for helping so far!

  
  
Posted 3 years ago

Did not happen with conda 4.9.2

  
  
Posted 3 years ago

I guess that has nothing to do with the different version, right?

  
  
Posted 3 years ago

Mhhm, now conda env creation takes forever, probably because it is resolving conflicts. At least that is what happened when I tried to install my environment manually.

  
  
Posted 3 years ago

🤞

  
  
Posted 3 years ago

Installed miniconda finally, now trying to run the task

  
  
Posted 3 years ago

Do you know how I can make sure I do not have CUDA or a broken installation installed?

I don't think this is the case, it is quite specifically installing the CPU version.
BTW: after the agent fails it will not remove the venv, so you can get into it and check, from the log it will be in: /home/tim/.clearml/venvs-builds/3.7
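
One way to check which build ended up in that venv is to run a short script with the venv's own Python interpreter. This is a minimal sketch, not something taken from the agent: describe_torch_build is a hypothetical helper name, and it relies only on torch.__version__, torch.version.cuda and torch.cuda.is_available().

```python
import importlib


def describe_torch_build():
    """Describe the PyTorch build in the current environment, or return None if it is missing."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    return {
        "torch_version": str(torch.__version__),
        # CUDA version the wheel was compiled against; None means a CPU-only build
        "built_with_cuda": torch.version.cuda,
        # True only if a CUDA build also finds a usable driver and GPU
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = describe_torch_build()
    if info is None:
        print("PyTorch is not installed in this environment")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
```

If built_with_cuda is None, the environment holds a CPU-only build, which would match the "Torch not compiled with CUDA enabled" assertion.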

  
  
Posted 3 years ago

Do you know how I can make sure I do not have CUDA or a broken installation installed?

  
  
Posted 3 years ago

My driver says "CUDA Version: 11.2" (I am not even sure this is correct, since I do not remember installing code in this machine, but idk) and there is no pytorch for 11.2, so maybe it falls back to CPU?

For some reason it detects CUDA 11.1 (I assume this is what you have installed; the driver CUDA version is the highest version it will support, not necessarily what you have installed)
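
To read the driver-side number programmatically, the "CUDA Version" field can be pulled out of the nvidia-smi output. The sketch below assumes nvidia-smi prints a "CUDA Version: X.Y" field as quoted above; both function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version: X.Y' value from nvidia-smi text, or None if it is absent."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return match.group(1) if match else None


def driver_cuda_version():
    """Return the highest CUDA version the driver supports, or None without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)


if __name__ == "__main__":
    print("Driver supports CUDA up to:", driver_cuda_version())
```

This value is an upper bound reported by the driver; it says nothing about which CUDA toolkit or PyTorch build is actually installed.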

  
  
Posted 3 years ago

My driver says "CUDA Version: 11.2" (I am not even sure this is correct, since I do not remember installing code in this machine, but idk) and there is no pytorch for 11.2, so maybe it falls back to CPU?

  
  
Posted 3 years ago

Does clearml resolve the CUDA Version from driver or conda?

Actually it starts with the default CUDA based on the host driver, but when it installs the conda env it takes it from the "installed packages" (i.e. the one you used to execute the code in the first place)

Regarding the link, I could not find the exact version, but this is close enough I guess:
None

  
  
Posted 3 years ago

I mean the version which it bases the PyTorch installation on.

  
  
Posted 3 years ago

One question: Does clearml resolve the CUDA Version from driver or conda?

  
  
Posted 3 years ago

Let me check

  
  
Posted 3 years ago

Do you know how I can get this version?

  
  
Posted 3 years ago

Thanks!

  
  
Posted 3 years ago

sure.

  
  
Posted 3 years ago

Could you test with 4.7.5?

  
  
Posted 3 years ago

conda 4.9.2

  
  
Posted 3 years ago

'conda --version'

  
  
Posted 3 years ago