I'm having issues running trains-agent on my AWS instance, it seems unable to install PyTorch

Answered

I'm having issues running trains-agent on my AWS instance: it seems unable to install PyTorch.
I have miniconda 4.8.2, Python 3.7.6, and trains-agent 0.14.1 installed, and I'm trying to run on an AWS instance that does not have a GPU,
but I'm getting the following error message:

ERROR: Could not find a version that satisfies the requirement pytorch~=1.4.0 (from -r /tmp/cached-reqsc_vsu0t6.txt (line 8)) (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch~=1.4.0 (from -r /tmp/cached-reqsc_vsu0t6.txt (line 8))
Command 'source /home/ubuntu/miniconda3/etc/profile.d/conda.sh && conda activate /home/ubuntu/.trains/venvs-builds/3.7 && pip install -r /tmp/cached-reqsc_vsu0t6.txt' returned non-zero exit status 1.
trains_agent: ERROR: Could not install task requirements!
Command 'source /home/ubuntu/miniconda3/etc/profile.d/conda.sh && conda activate /home/ubuntu/.trains/venvs-builds/3.7 && pip install -r /tmp/cached-reqsc_vsu0t6.txt' returned non-zero exit status 1.

The dependencies I have on my trains dashboard are:

gym == 0.17.0
gym_cartpole_swingup == 0.0.4
numpy == 1.18.1
pybullet == 2.6.5
tensorboard == 2.1.0
torch == 1.4.0
trains == 0.14.1
wget == 3.2
box2d-py == 2.3.8

Any idea what's wrong here?

  				
Posted 4 years ago by PunySquid88


Answers 30

I removed torch from the requirements of the git repo dependency, though. Hmm.

  				
Posted 4 years ago by PunySquid88

Strangely enough, I have another AWS instance with a GPU where I've been using trains-agent to run experiments with the same dependencies, git included.

  				
Posted 4 years ago by PunySquid88

I'm still struggling to get this other instance working even after switching the config to use pip and restarting the agent

  				
Posted 4 years ago by PunySquid88

yeah that causes the 'NoneType' object has no attribute 'lower' error

  				
Posted 4 years ago by PunySquid88

yeah it seems that removing the git dependency worked

  				
Posted 4 years ago by PunySquid88

I'm running in venv mode

  				
Posted 4 years ago by PunySquid88

If you edit the requirements to have
https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
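
For readers wondering why this particular wheel is the one to try: the file name itself says what it was built for. Below is a minimal sketch that splits the name into the standard wheel file name fields. The helper name `parse_wheel_url` and the `WheelInfo` type are made up for illustration and are not part of pip or trains-agent.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlparse


class WheelInfo(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_url(url: str) -> WheelInfo:
    """Split the file name of a wheel URL into its name/version/tag fields."""
    # Take the last path segment and decode percent-escapes (%2B is "+").
    filename = unquote(urlparse(url).path.rsplit("/", 1)[-1])
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelInfo(parts[0], parts[1], python_tag, abi_tag, platform_tag)


info = parse_wheel_url(
    "https://download.pytorch.org/whl/cpu/"
    "torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl"
)
print(info)
```

Here the version comes out as `1.4.0+cpu` (the `+cpu` suffix marks the CPU-only build), and `cp37` matches the Python 3.7 interpreter mentioned in the question.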

  				
Posted 4 years ago by AgitatedDove14

A bit of explanation on how conda is supported. First, conda is not recommended: it is very easy to create a setup with conda that conda itself cannot reproduce (yes, exactly that). What trains-agent does is first try to install all the packages it can with conda (all together, not one by one, because that would break conda dependencies), and then install with pip the packages that conda failed to install.
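
The two-stage flow described here can be pictured roughly as follows. This is only a sketch of the idea, not trains-agent's actual code: `install_two_stage`, `conda_install` and `pip_install` are hypothetical names, and the assumption that the conda step reports back what it could not install is mine.

```python
from typing import Callable, Iterable, List


def install_two_stage(
    requirements: Iterable[str],
    conda_install: Callable[[List[str]], List[str]],
    pip_install: Callable[[List[str]], None],
) -> List[str]:
    """Try the whole batch with conda, then give the leftovers to pip.

    conda_install (hypothetical) receives every requirement at once and
    returns the ones it could not install. The return value of this
    function is the list that was handed to pip.
    """
    batch = [r.strip() for r in requirements if r.strip()]
    # Stage 1: one conda call for the full batch, so dependencies are
    # resolved together instead of package by package.
    leftovers = conda_install(batch)
    # Stage 2: only what conda could not handle goes to pip.
    if leftovers:
        pip_install(leftovers)
    return leftovers


# Stand-in installers for demonstration; nothing is really installed.
def fake_conda(batch: List[str]) -> List[str]:
    return [r for r in batch if r.startswith("git+")]


pip_batches: List[List[str]] = []
leftovers = install_two_stage(
    ["numpy==1.18.1", "git+https://example.invalid/some/repo.git"],
    fake_conda,
    pip_batches.append,
)
print(leftovers)
```

With the stand-in installers above, only the git requirement reaches the pip stage.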

  				
Posted 4 years ago by AgitatedDove14

The console output from trains shows that it already has the CUDA version set to 0, but I tried that anyway and conda is still unable to install torch.

  				
Posted 4 years ago by PunySquid88

I tried changing the dependency to https://download.pytorch.org/whl/cpu/torch-1.4.0%2Bcpu-cp37-cp37m-linux_x86_64.whl ; sys_platform == "linux" to get pip to install the correct whl file, but now I'm getting a new error: 'NoneType' object has no attribute 'lower'

  				
Posted 4 years ago by PunySquid88

I'll make sure we have conda ignore git:// packages, and pass them to the second pip stage.
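
As an illustration of what "ignore git:// packages and pass them to the pip stage" could look like, here is a small sketch that separates plain named requirements from VCS or URL requirements before anything is given to conda. It is an assumption about the general approach, not the code that went into the fix, and `split_direct_references` is a made-up name.

```python
import re
from typing import Iterable, List, Tuple

# A line that starts with a VCS prefix or a URL scheme.
_URL_LINE = re.compile(
    r"^(?:(?:git|hg|svn|bzr)\+|git://|https?://|file://)", re.IGNORECASE
)
# The "name @ url" spelling of a direct reference.
_NAME_AT_URL = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*\s*@")


def split_direct_references(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (named, direct): named requirements and VCS/URL requirements."""
    named: List[str] = []
    direct: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if _URL_LINE.match(line) or _NAME_AT_URL.match(line):
            direct.append(line)  # pip stage only
        else:
            named.append(line)  # candidate for the conda stage
    return named, direct


named, direct = split_direct_references(
    [
        "gym==0.17.0",
        "# a comment",
        "git+https://example.invalid/some/repo.git",
        "pkg @ https://example.invalid/pkg-1.0-py3-none-any.whl",
    ]
)
print(named, direct)
```

Only the `named` list would be offered to conda; the `direct` list would go straight to pip.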

  				
Posted 4 years ago by AgitatedDove14

I see the problem now: conda fails to install the package from the git repository, then it falls back to pip install, and pip just fails... "//github.com/ajliu/pytorch_baselines"
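
One detail that makes the pip fallback fail visibly: the error in the question asks pip for pytorch~=1.4.0, while the dashboard lists torch == 1.4.0. The conda package is called pytorch, but the package on PyPI is called torch, so a requirement carried over from the conda stage under its conda name will not resolve with pip. A minimal sketch of translating the name back is below; the mapping table and the function name `to_pip_requirement` are illustrative assumptions, not trains-agent code.

```python
import re

# Cases where the conda package name differs from the PyPI name.
# Only pytorch -> torch comes from this thread; extend as needed.
CONDA_TO_PIP = {"pytorch": "torch"}

# Leading package name, followed by everything else (specifiers, markers).
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)(.*)$", re.DOTALL)


def to_pip_requirement(requirement: str) -> str:
    """Rewrite a requirement so it uses the PyPI package name."""
    match = _NAME.match(requirement)
    if not match:
        return requirement  # not a plain named requirement, leave as-is
    name, rest = match.groups()
    return CONDA_TO_PIP.get(name.lower(), name) + rest


print(to_pip_requirement("pytorch~=1.4.0"))
```

Requirements whose names are not in the table, and lines that do not start with a package name, are returned unchanged.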

  				
Posted 4 years ago by AgitatedDove14

sick thanks!

  				
Posted 4 years ago by PunySquid88

with conda ?!

  				
Posted 4 years ago by AgitatedDove14

BTW: it seems conda doesn't support git+git:// packages.
How about switching to pip? You can still run the entire thing from a conda env; it will just use pip and venv to install everything. Other than that, it should work as expected.

  				
Posted 4 years ago by AgitatedDove14

if you have a fix to unblock me i'll try anything haha

  				
Posted 4 years ago by PunySquid88

This is why we recommend using pip and not conda...
PunySquid88, after removing the "//github" package, is it working?

  				
Posted 4 years ago by AgitatedDove14

With pleasure, I'll make sure we officially release RC1 soon :)

  				
Posted 4 years ago by AgitatedDove14

yep

  				
Posted 4 years ago by PunySquid88

PunySquid88 RC1 is out with a fix:
pip install trains-agent==0.14.2rc1

  				
Posted 4 years ago by AgitatedDove14

Try adding this environment variable:
export TRAINS_CUDA_VERSION=0

  				
Posted 4 years ago by AgitatedDove14

here's the log

  				
Posted 4 years ago by PunySquid88

strange ...

  				
Posted 4 years ago by AgitatedDove14

I just checked my AWS instance with the GPU, and yeah, the virtual envs are using the CUDA version.

  				
Posted 4 years ago by PunySquid88

Check the log to see exactly where it downloaded torch from. I just want to make sure it used the right repository and did not fall back to plain pip, where it might have gotten a CPU version...

  				
Posted 4 years ago by AgitatedDove14

See if this helps

  				
Posted 4 years ago by AgitatedDove14

See the last package in the package list:

wget~=3.2
trains~=0.14.1
pybullet~=2.6.5
gym-cartpole-swingup~=0.0.4
//github.com/ajliu/pytorch_baselines

  				
Posted 4 years ago by AgitatedDove14

PunySquid88 do you want to test a fix?

  				
Posted 4 years ago by AgitatedDove14

Please send the full log, I just tested it here, and it seems to be working

  				
Posted 4 years ago by AgitatedDove14

that did it! thank you so much 🙂

  				
Posted 4 years ago by PunySquid88
