PunySquid88 RC1 is out with a fix: pip install trains-agent==0.14.2rc1
With pleasure, I'll make sure we officially release RC1 soon :)
if you have a fix to unblock me i'll try anything haha
I'm still struggling to get this other instance working even after switching the config to use pip and restarting the agent
I just checked my aws instance with the gpu and yeah the virtual envs are using the cuda version
Check the log to see exactly where it downloaded torch from. Just making sure it used the right repository and did not default to pip, where it might have gotten a CPU version...
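Besides reading the log, one way to confirm which torch build ended up in an environment is to ask torch itself. This is a minimal sketch: the helper name `torch_build_info` is made up for illustration, and it assumes it is run with the Python interpreter of the environment the agent created.

```python
def torch_build_info():
    """Return version details for the installed torch, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        # CPU-only wheels usually carry a "+cpu" suffix, e.g. "1.4.0+cpu"
        "version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds
        "cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_build_info())
```

If `cuda` comes back as None, the environment most likely received a CPU-only build.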
strangely enough i have another aws instance with a GPU where i've been using trains-agent to run experiments with the same dependencies, git included
I'll make sure we have conda ignore git:// packages, and pass them to the second pip stage.
yeah it seems that removing the git dependency worked
BTW: seems like conda doesn't support git+git:// packages
How about switching to pip? You can still run the entire thing from a conda env; it will just use pip & venv to install everything. Other than that, it should work as expected.
This is why we recommend using pip and not conda ...
PunySquid88 after removing the "//github" package, is it working?
So a bit of explanation on how conda is supported. First, conda is not recommended. The reason is that it is very easy to create a setup on conda that is un-reproducible by conda (yes, exactly that). So what trains-agent does is this: it first tries to install all the packages it can with conda (not one by one, because that would break conda dependencies), then it installs the packages that conda failed to install using pip.
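The two-stage idea described above, combined with the plan to keep git packages away from conda, can be sketched roughly as follows. This is not trains-agent's actual code: `split_requirements` is a hypothetical helper, the rule for spotting URL or VCS references is an assumption, and the `example.invalid` repository is a placeholder.

```python
from typing import List, Tuple


def split_requirements(requirements: List[str]) -> Tuple[List[str], List[str]]:
    """Partition requirement lines into (conda_candidates, pip_only)."""
    conda_candidates: List[str] = []
    pip_only: List[str] = []
    for line in requirements:
        req = line.strip()
        if not req or req.startswith("#"):
            continue  # skip blank lines and comments
        # Assumption: anything that looks like a URL or VCS reference
        # is never handed to conda and goes straight to the pip stage.
        if "://" in req or req.startswith("//") or req.startswith("git+"):
            pip_only.append(req)
        else:
            conda_candidates.append(req)
    return conda_candidates, pip_only


reqs = [
    "wget~=3.2",
    "trains~=0.14.1",
    "git+git://example.invalid/some/repo.git",  # placeholder repository
]
conda_candidates, pip_only = split_requirements(reqs)
print("conda stage:", conda_candidates)
print("pip stage:  ", pip_only)
```

In this sketch the conda stage would receive all of `conda_candidates` in one call, and anything conda rejects would be appended to the pip stage afterwards.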
I removed torch from the requirements of the git repo dependency though hmm
See the last package in the package list:
- wget~=3.2
- trains~=0.14.1
- pybullet~=2.6.5
- gym-cartpole-swingup~=0.0.4
- //github.com/ajliu/pytorch_baselines
I see the problem now: conda fails to install the package from git, then it falls back to pip install, and pip fails as well on "//github.com/ajliu/pytorch_baselines"
Please send the full log, I just tested it here, and it seems to be working
the console output from trains shows that it already has the cuda version set to 0, but i tried that anyways and conda is still unable to install torch
Try adding this environment variable: export TRAINS_CUDA_VERSION=0
yeah that causes the 'NoneType' object has no attribute 'lower' error
If you edit the requirements to have
https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
I tried changing the dependency to https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl ; sys_platform == "linux"
to get pip to install the correct whl file, but i'm getting a new error: 'NoneType' object has no attribute 'lower'