Answered
Hi, I Have Another Problem

Hi, I have another problem 😅
In one of my agents, an experiment started without torch using the GPU. In the experiment logs shared below, we can see that trains installed the CPU version of torch. This did not happen before (trains was correctly downloading the GPU version). In the requirements, I specified torch==1.3.1, and I have CUDA 11.0 installed on the agent.
Package(s) not found: torch
Collecting torch==1.3.1+cpu
Downloading (111.8 MB)

  
  
Posted 3 years ago

Answers 25


You're welcome 🙂

  
  
Posted 3 years ago

I don't know why it didn't detect it in the first place

  
  
Posted 3 years ago

thanks, I will do that

  
  
Posted 3 years ago

(Since you are using venv mode, if CUDA is not detected at startup time, the agent will not install the GPU version, because it sees no CUDA support.)
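
To illustrate the point above, here is a minimal sketch of how a detected CUDA version could map to the torch build that gets requested. It is not the agent's actual resolver: the function names are hypothetical, and it assumes the integer form shown in the agent log (0 for no CUDA, 100 for CUDA 10.0, 110 for CUDA 11.0) and the `+cpu` / `+cu100` tags seen in the log and wheel link from this thread. It also does not model the fallback to an older CUDA wheel when no matching wheel exists.

```python
def torch_local_tag(cuda_version: int) -> str:
    """Return the local version tag for a torch wheel.

    Assumption: cuda_version is the integer printed in the agent log,
    e.g. 100 for CUDA 10.0, 110 for CUDA 11.0, and 0 when no CUDA is detected.
    """
    if cuda_version <= 0:
        return "cpu"  # no CUDA detected, so only the CPU build applies
    return f"cu{cuda_version}"


def pinned_requirement(package: str, version: str, cuda_version: int) -> str:
    """Build a requirement string such as 'torch==1.3.1+cpu'."""
    return f"{package}=={version}+{torch_local_tag(cuda_version)}"


print(pinned_requirement("torch", "1.3.1", 0))    # torch==1.3.1+cpu
print(pinned_requirement("torch", "1.3.1", 100))  # torch==1.3.1+cu100
```

With `cuda_version = 0`, the result matches the `torch==1.3.1+cpu` line from the experiment log.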

  
  
Posted 3 years ago

It is configured as CPU (i.e. no CUDA)

  
  
Posted 3 years ago

AgitatedDove14 one last question: how can I enforce a specific wheel to be installed?

  
  
Posted 3 years ago

agent.cuda_version = 0
agent.cudnn_version = 0

  
  
Posted 3 years ago

Yes, I use the --gpus parameter

  
  
Posted 3 years ago

OK, but I didn't specify that anywhere; I just checked my trains.conf file

  
  
Posted 3 years ago

That depends on what you have installed 🙂

  
  
Posted 3 years ago

cuda 10.1, I guess this is because no wheel exists for torch==1.3.1 and cuda 11.0

Correct

how can I enforce a specific wheel to be installed?

You mean a specific CUDA wheel?
You can simply put the http link to the wheel in the "installed packages"; it should work

  
  
Posted 3 years ago

agent.cuda_version = 110
agent.cudnn_version = 0

  
  
Posted 3 years ago

This is why it thinks it's CPU 🙂

  
  
Posted 3 years ago

You can always force it with the environment variable CUDA_VERSION=10.1
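
A rough sketch of what that could look like when launching the agent from Python. The `agent_command` helper is hypothetical; the module name and the `daemon`, `--queue` and `--gpus` arguments are copied from the command line posted in this thread, and the only change is the `CUDA_VERSION` variable added to the environment.

```python
import os


def agent_command(cuda_version: str, gpus: str = "1", queue: str = "default"):
    """Return (argv, env) for starting the agent with CUDA_VERSION forced.

    Hypothetical helper: it only assembles the command and a copy of the
    current environment; it does not start anything itself.
    """
    env = dict(os.environ)             # copy, so the parent process is untouched
    env["CUDA_VERSION"] = cuda_version  # e.g. "10.1", as suggested above
    argv = ["python3", "-m", "trains_agent", "daemon",
            "--queue", queue, "--gpus", gpus]
    return argv, env


argv, env = agent_command("10.1")
print("CUDA_VERSION=" + env["CUDA_VERSION"], " ".join(argv))
# CUDA_VERSION=10.1 python3 -m trains_agent daemon --queue default --gpus 1
```

Passing the pair to `subprocess.run(argv, env=env)` would start the agent with the override; the printed line is the equivalent shell form.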

  
  
Posted 3 years ago

This is also set in the command line.
--cpu-only or maybe without any --gpus flag at all

  
  
Posted 3 years ago

okay, now it should work 🙂

  
  
Posted 3 years ago

python3 -m trains_agent --config-file "~/trains.conf" daemon --queue default --log-level DEBUG --detached --gpus 1 > ~/trains-agent.startup.log 2>&1

  
  
Posted 3 years ago

I just started one and it wrote:
...

  
  
Posted 3 years ago

Hi JitteryCoyote63
What do you have in agent.cuda_version?
(you can see it printed at the beginning of the log)

  
  
Posted 3 years ago

btw shouldn't it be CUDA_VERSION=11.0?

  
  
Posted 3 years ago

Oh, that might be it then, thanks!

  
  
Posted 3 years ago

What you actually specified is torch; the @ is a kind of pip remark, and pip will not actually parse it 🙂
use only the link https://download.pytorch.org/whl/cu100/torch-1.3.1%2Bcu100-cp36-cp36m-linux_x86_64.whl
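
As a small sketch of that advice, the snippet below reduces a `name @ url` entry to the bare link before it goes into the "installed packages" list. The `bare_wheel_link` helper is hypothetical and is not part of trains; entries without a link are passed through unchanged.

```python
def bare_wheel_link(requirement: str) -> str:
    """Return only the URL from a 'name @ url' entry; leave other entries as-is."""
    _name, sep, target = requirement.partition("@")
    target = target.strip()
    if sep and target.startswith(("http://", "https://")):
        return target
    return requirement.strip()


entry = ("torch @ https://download.pytorch.org/whl/cu100/"
         "torch-1.3.1%2Bcu100-cp36-cp36m-linux_x86_64.whl")
print(bare_wheel_link(entry))
# https://download.pytorch.org/whl/cu100/torch-1.3.1%2Bcu100-cp36-cp36m-linux_x86_64.whl
```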

  
  
Posted 3 years ago

I specified torch @ https://download.pytorch.org/whl/cu100/torch-1.3.1%2Bcu100-cp36-cp36m-linux_x86_64.whl and it didn't detect the link; it tried to install the latest version, 1.6.0

  
  
Posted 3 years ago

What do you see in the console when you start the trains-agent? It should detect the CUDA version

  
  
Posted 3 years ago

I have 11.0 installed, but on another machine that also has 11.0 installed, trains downloads torch for CUDA 10.1. I guess this is because no wheel exists for torch==1.3.1 and CUDA 11.0

  
  
Posted 3 years ago