Hard to tell 🤔. I'm using the version of conda that comes with mamba, but I don't think that changes anything. It has never given me any problems, and clearml-agent seems able to use it normally to install Python and torch before failing on open3d. Here's the full output:
That is odd, can you send the full Task log? (Maybe some oddity with conda/pip ?!)
All right, that made sense, so I checked, but I still think it's using the right version.
This is the Installed Packages section of the experiment I'm trying to enqueue, as reported on the ClearML dashboard:
# Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
PyYAML == 5.4.1
clearml == 1.6.2
matplotlib == 3.5.2
numpy == 1.23.1
open3d == 0.15.2
plotly == 5.9.0
scipy == 1.9.0
torch == 1.10.2
tqdm == 4.64.0
So it knows it has to use Python 3.8.
Also this is what it says when it fails:
` Executing Conda: /home/IIT.LOCAL/arosasco/mambaforge/condabin/conda install -p /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 -c pytorch -c conda-forge -c defaults 'pip<20.2' --quiet --json
Pass
Conda: Trying to install requirements:
['cudatoolkit=10.2', 'numpy~=1.23.1', 'pytorch~=1.10.2', 'graphviz', 'python-graphviz', 'kiwisolver']
Executing Conda: /home/IIT.LOCAL/arosasco/mambaforge/condabin/conda env update -p /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 --file /tmp/user/1021449697/conda_envq0p4wyej.yml --quiet --json
By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA):
Pass
Conda: Installing requirements: step 2 - using pip:
['PyYAML==5.4.1', 'clearml==1.6.2', 'matplotlib==3.5.2', 'open3d==0.15.2', 'plotly==5.9.0', 'scipy==1.9.0', 'tqdm==4.64.0']
Collecting PyYAML==5.4.1
Using cached PyYAML-5.4.1-cp38-cp38-manylinux1_x86_64.whl (662 kB)
Collecting matplotlib==3.5.2
Using cached matplotlib-3.5.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
ERROR: Could not find a version that satisfies the requirement open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt (line 4)) (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
ERROR: No matching distribution found for open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt (line 4))
Command 'source /home/IIT.LOCAL/arosasco/mambaforge/etc/profile.d/conda.sh && conda activate /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 && pip install -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt' returned non-zero exit status 1.
clearml_agent: ERROR: Could not install task requirements!
Command 'source /home/IIT.LOCAL/arosasco/mambaforge/etc/profile.d/conda.sh && conda activate /home/IIT.LOCAL/arosasco/.clearml/venvs-builds/3.8 && pip install -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt' returned non-zero exit status 1.
Leaving process id 58377
DONE: Running task '20d7922b702247189beaad4e9faf6d24', exit status 1
Process failed, exit code 1
No tasks in queue b90c6545c9474abcbeb10cef1876e891 `
It's trying to build the environment in .clearml/venvs-builds/3.8.
Could not find a version that satisfies the requirement open3d==0.15.2 ... (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
This points to the agent installing with a different Python version than the one you ran the original code with. I would guess Python 3.6.
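One way to test that guess is to run a few lines of Python with the interpreter from the environment the agent built and compare the result with the local one. The sketch below uses only the standard library; `describe_environment` is a made-up helper name, not part of clearml or clearml-agent. It also reports the pip version, since the log above shows the agent installing 'pip<20.2', and an older pip may not recognize the platform tags of newer wheels.

```python
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect the interpreter and pip details that decide which wheels pip can install."""
    try:
        pip_version = metadata.version("pip")
    except metadata.PackageNotFoundError:
        pip_version = None  # pip is not installed for this interpreter
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "pip": pip_version,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it once locally and once with the agent environment's interpreter should show whether the two really differ in Python or pip version.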
Nope, it's a Linux machine with a Xeon Gold 5218. Anyway, I installed the RC and now it seems to work a bit better: it gets past the torch installation but fails installing open3d.
ERROR: Could not find a version that satisfies the requirement open3d==0.15.2 (from -r /tmp/user/1021449697/cached-reqse9v2emqx.txt (line 4)) (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)
Technically 0.15.2 shouldn't give any problem. I have a conda environment on the same machine with all the packages I need and I managed to install them all without any conflicts.
Anyway, I'd like to try with one of the versions listed (from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0).
Can I manually specify the requirements file?
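For picking one of the listed versions, the list can be pulled straight out of the pip error text. This is only a sketch: `available_versions` and `newest` are hypothetical helper names, and the version comparison assumes purely numeric versions like the ones in this log (no pre-release suffixes).

```python
import re


def available_versions(pip_error):
    """Return the versions listed after 'from versions:' in a pip error message."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if match is None:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


def newest(versions):
    """Pick the highest version, comparing dot-separated numeric parts only."""
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


# Error text copied from the agent log in this thread.
error = (
    "ERROR: Could not find a version that satisfies the requirement open3d==0.15.2 "
    "(from -r /tmp/user/1021449697/cached-reqsfbqzy2c4.txt (line 4)) "
    "(from versions: 0.10.0.0, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0)"
)
versions = available_versions(error)
print(newest(versions))  # 0.13.0
```

The result is the newest version that the pip in that environment can actually see, which would be the natural candidate to pin instead of 0.15.2.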
ERROR: torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform
TartBear70, could it be that you are running on a new Mac M1/M2?
Also, quick question: any chance you can test with the latest RC? pip3 install clearml-agent==1.3.1rc6
SuccessfulKoala55 CostlyOstrich36 How can I solve it? If I run conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch in an environment, it installs without any problem. I guess it's trying to install it with pip instead of conda, even though I set conda in the clearml.conf file.
Since automatically building an environment is often problematic, it would be a great feature to be able to point the agent to a pre-built environment with all the packages it needs.
TartBear70, I think the relevant log line is: ERROR: torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.
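To read that error, it helps to split the wheel filename into its tags: by the wheel naming convention, the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag, and pip rejects the wheel when it does not support that combination. A minimal sketch follows; `parse_wheel_filename` and `python_tag_matches` are illustrative names, and the second function is only a rough check of the Python tag, not a reimplementation of pip's compatibility logic.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into name, version, optional build tag, and its three tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def python_tag_matches(tags):
    """Rough check: does the wheel's CPython tag equal the running interpreter's?"""
    running = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return tags["python_tag"] == running


tags = parse_wheel_filename("torch-1.12.0+cu102-cp38-cp38-linux_x86_64.whl")
print(tags["python_tag"], tags["abi_tag"], tags["platform_tag"])  # cp38 cp38 linux_x86_64
print("matches this interpreter:", python_tag_matches(tags))
```

Here the wheel targets CPython 3.8 on x86_64 Linux, so if the interpreter that ran pip was not Python 3.8, or pip did not accept the platform tag, that would be consistent with the error.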
Hi CostlyOstrich36 , I'm using clearml-agent==1.3.0.
I ran the training script on my local machine without using an agent (just logging with clearml) and everything worked fine.
I think it's trying to install the packages in /tmp/user/ because I'm inside tmux. Any way I can manually install the environment and then point the agent to it?
Hi TartBear70 ,
Did you run the experiment locally first? What versions of clearml/clearml-agent are you using?