Answered
After Trying To Execute A Task From The Queue The Agent Fails Installing The Environment:

After trying to execute a task from the queue the agent fails installing the environment:
Installing collected packages: numpy
Successfully installed numpy-1.21.2
Found PyTorch version torch==1.12.0 matching CUDA version 102
ERROR: torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.
Command 'source /home/IIT.LOCAL/arosasco/mambaforge/etc/profile.d/conda.sh && conda activate /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.9 && pip install -r /tmp/user/1021449697/cached-reqsaulxx7n6.txt' returned non-zero exit status 1.

Could the issue be related to /tmp/user, since it doesn't have access to it? Also, why is it trying to install with pip when I set conda in the clearml.config file?

  
  
Posted 2 years ago

Answers 10


Hi TartBear70,

Did you run the experiment locally first? What versions of clearml/clearml-agent are you using?

  
  
Posted 2 years ago

Hi CostlyOstrich36, I'm using clearml-agent==1.3.0.
I ran the training script on my local machine without using an agent (just logging with clearml) and everything worked fine.

I think it's trying to install the packages in /tmp/user/ because I'm inside tmux. Any way I can manually install the environment and then point the agent to it?

  
  
Posted 2 years ago

TartBear70, I think the relevant log line is:
ERROR: torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.
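
For anyone reading along, here is a minimal sketch of how that wheel filename can be read. It assumes the standard wheel naming scheme (name-version-python-abi-platform) and only compares the CPython tag; pip's real compatibility check also looks at ABI and platform tags, and wheels tagged py3 or abi3 are not handled here. The function names are made up for illustration. Note that the wheel is tagged cp38 while the failing command in the question activates venvs-builds/3.9, which may or may not be the cause.

```python
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into name, version, and compatibility tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # The last three dash-separated fields are the python, ABI, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def wheel_matches_python(filename, version_info=sys.version_info):
    """Rough check: does the wheel's CPython tag match the given interpreter?"""
    wanted = f"cp{version_info[0]}{version_info[1]}"
    # A python tag may list several interpreters separated by dots.
    return wanted in parse_wheel_tags(filename)["python"].split(".")


wheel = "torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl"
print(wheel_matches_python(wheel, (3, 8)))  # True
print(wheel_matches_python(wheel, (3, 9)))  # False
```

Running the same check with the interpreter inside the agent's environment (the default version_info argument) would show whether the tag and the interpreter agree there.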

  
  
Posted 2 years ago

SuccessfulKoala55 CostlyOstrich36, how can I solve it? If I run conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch in an environment, it installs without any problem. I guess the agent is trying to install it with pip instead of conda, even though I set conda in the clearml.config file.

Since automatically building an environment is often problematic, it would be a great feature to be able to point the agent to a pre-built environment with all the packages it needs.

  
  
Posted 2 years ago

ERROR: torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform
TartBear70, could it be that you are running on a new Mac M1/M2?

Also quick question, any chance you can test with the latest RC?
pip3 install clearml-agent==1.3.1rc6

  
  
Posted 2 years ago

Nope, it's a Linux machine with a Xeon Gold 5218. Anyway, I installed the RC and now it seems to work a bit better: it gets past the torch installation but fails installing open3d.

ERROR: Could not find a version that satisfies the requirement open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqse9v2emqx.txt (line 4)) (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)

Technically, 0.15.2 shouldn't cause any problems. I have a conda environment on the same machine with all the packages I need, and I managed to install them all without any conflicts.

Anyway, I'd like to try one of the listed versions (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0).

Can I manually specify the requirements file?

  
  
Posted 2 years ago

Could not find a version that satisfies the requirement open3d==0.15.2 .. (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)

This points to the agent installing with a different Python version than the one you used to run the original code; I would guess Python 3.6.

  
  
Posted 2 years ago

All right, that makes sense, so I checked, but I still think it's using the right version.

This is the Installed Package section of the experiment I'm trying to enqueue as reported on the ClearML dashboard:
# Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
PyYAML == 5.4.1
clearml == 1.6.2
matplotlib == 3.5.2
numpy == 1.23.1
open3d == 0.15.2
plotly == 5.9.0
scipy == 1.9.0
torch == 1.10.2
tqdm == 4.64.0

So it knows it has to use Python 3.8.

Also this is what it says when it fails:

` Executing Conda: /home/IIT.LOCAL/arosasco/mambaforge/condabin/conda install -p /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults 'pip<20.2' --quiet --json

Pass

Conda: Trying to install requirements:
['cudatoolkit=10.2', 'numpy~=1.23.1', 'pytorch~=1.10.2', 'graphviz', 'python-graphviz', 'kiwisolver']
Executing Conda: /home/IIT.LOCAL/arosasco/mambaforge/condabin/conda env update -p /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 --file /tmp/user/1021449697/conda_envq0p4wyej.yml --quiet --json
By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA):

Pass
Conda: Installing requirements: step 2 - using pip:
['PyYAML==5.4.1', 'clearml==1.6.2', 'matplotlib==3.5.2', 'open3d==0.15.2', 'plotly==5.9.0', 'scipy==1.9.0', 'tqdm==4.64.0']
Collecting PyYAML==5.4.1
Using cached PyYAML-5.4.1-cp38-cp38-manylinux1_x86_64.whl (662 kB)
Collecting matplotlib==3.5.2
Using cached matplotlib-3.5.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
ERROR: Could not find a version that satisfies the requirement open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt (line 4)) (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
ERROR: No matching distribution found for open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt (line 4))
Command 'source /home/IIT.LOCAL/arosasco/mambaforge/etc/profile.d/conda.sh && conda activate /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 && pip install -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt' returned non-zero exit status 1.

clearml_agent: ERROR: Could not install task requirements!
Command 'source /home/IIT.LOCAL/arosasco/mambaforge/etc/profile.d/conda.sh && conda activate /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 && pip install -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt' returned non-zero exit status 1.

Leaving process id 58377
DONE: Running task '20d7922b702247189beaad4e9faf6d24', exit status 1
Process failed, exit code 1
No tasks in queue b90c6545c9474abcbeb10cef1876e891 `

It's trying to build the environment in .clearml/venvs-builds/3.8.
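
One thing that may be worth checking, as an assumption not confirmed anywhere in this thread: the log shows the agent pinning 'pip<20.2' in the environment, and as far as I know pip only began recognizing the newer manylinux_X_Y platform tags (PEP 600) in release 20.3. If the open3d 0.15.2 wheels are published only under such a tag, an older pip would not list them, which would look like the "from versions" list stopping at 0.13.0. The sketch below illustrates that reasoning; the helper names and the example tag manylinux_2_27_x86_64 are hypothetical, and the thread does not show which tags open3d actually publishes.

```python
import re

# Assumption: first pip release that understands PEP 600 tags.
PEP600_MIN_PIP = (20, 3)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def is_pep600_tag(platform_tag):
    """True for tags of the form manylinux_<glibc major>_<glibc minor>_<arch>."""
    return re.fullmatch(r"manylinux_\d+_\d+_.+", platform_tag) is not None


def pip_can_see(platform_tag, pip_version):
    """Rough check of whether a given pip would consider this platform tag."""
    if is_pep600_tag(platform_tag):
        return parse_version(pip_version)[:2] >= PEP600_MIN_PIP
    # Older tags such as manylinux1 or manylinux2014 are assumed to be visible.
    return True


# Illustrative tag only; not taken from the log above.
print(pip_can_see("manylinux_2_27_x86_64", "20.1.1"))
print(pip_can_see("manylinux_2_27_x86_64", "20.3"))
```

Comparing the output of pip --version inside .clearml/venvs-builds/3.8 with the one in the working conda environment would show whether this applies here.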

  
  
Posted 2 years ago

That is odd, can you send the full Task log? (Maybe some oddity with conda/pip ?!)

  
  
Posted 2 years ago

Hard to tell 🤔. I'm using the version of conda that comes with mamba, but I don't think it changes anything. It has never given me any problems, and the clearml-agent seems able to use it normally to install Python and torch before failing on open3d. Here's the full output:

  
  
Posted 2 years ago