Hello, Does Anyone Know Any Solutions To This?

Answered

Hello, does anyone know a solution to this?
ImportError: cannot import name '_get_cpp_backtrace' from 'torch._C' (/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/torch/_C.cpython-38-x86_64-linux-gnu.so)
It happened when cloning and running a task on an agent on a different machine. I cleared the cache and made sure the CUDA versions match. Any suggestions?

  				
Posted 2 years ago by DeliciousKoala34


Answers 7

Hi, I changed it to 1.13.0, but it still threw the same error.

This is odd. Just so we can make the agent better, any chance you can send the Task log?

  				
Posted 2 years ago by AgitatedDove14

Hi DeliciousKoala34

"Happened when cloning and running a task on an agent on a different machine."

Sounds like a torch internal issue. Can you send the full log of the remote Task?

  				
Posted 2 years ago by AgitatedDove14

Thanks @DeliciousKoala34, I think I know what the issue is!
The container has 1.13.0a and you need 1.13.0, which is why it is re-downloading. (I'll make sure the agent can sort it out: because this is Nvidia's version, in reality it should be a perfect match.)

  				
Posted 2 years ago by AgitatedDove14

The full log

  				
Posted 2 years ago by DeliciousKoala34

Hi, I changed it to 1.13.0, but it still threw the same error. In the end I just switched to a bullseye container instead (since the Nvidia container is not a must-have), and it works now. For some reason it doesn't auto-detect all of my packages, so I had to add them explicitly. But yeah, thanks for the help, I should have dug a bit deeper into my issue.
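For anyone hitting the same missing-package problem: one way to add packages explicitly from code is `Task.add_requirements`, called before `Task.init`. The snippet below is only a minimal sketch, not what was actually used in this thread. The package list, the helper name `init_task_with_requirements`, and the project/task names are all hypothetical placeholders.

```python
# Hypothetical package list: replace with what the task really imports.
# A value of None asks ClearML to record whichever version is installed.
EXPLICIT_REQUIREMENTS = {
    "torch": "1.13.0",
    "numpy": None,
}


def init_task_with_requirements(project_name, task_name,
                                requirements=EXPLICIT_REQUIREMENTS):
    """Register explicit requirements, then create the ClearML task.

    Hypothetical helper. clearml is imported lazily so this module can be
    loaded on machines where clearml is not installed.
    """
    from clearml import Task

    # add_requirements has to run before Task.init to take effect.
    for package, version in requirements.items():
        Task.add_requirements(package, version)
    return Task.init(project_name=project_name, task_name=task_name)
```

Calling something like `init_task_with_requirements("my project", "my task")` at the top of the training script should make the listed packages show up in the task's installed packages, so a cloned task gets them on the agent machine.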

  				
Posted 2 years ago by DeliciousKoala34

Jup, here it is.

  				
Posted 2 years ago by DeliciousKoala34

Check the log: the container has torch 1.13.0 but the task requires torch==1.13.1.
The torch packages inside those Nvidia prepackaged containers are compiled a bit differently. What I suspect happens is that the torch wheel from pytorch is not compatible with this container. Easiest fix: change the task requirements to 1.13.
Wdyt?
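A quick way to see this kind of mismatch without reading the whole log is to compare the task's pin with what is installed in the container. This is a rough stdlib-only sketch. The helper names `release_part` and `describe_pin` are made up for illustration, and the version string with the `+abc1234` suffix is a hypothetical example of a vendor build tag.

```python
import re
from importlib import metadata
from typing import Optional


def release_part(version: str) -> str:
    """Return the leading numeric release, e.g. '1.13.0a0+abc1234' -> '1.13.0'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return match.group(0) if match else version


def describe_pin(package: str, pinned: str, installed: Optional[str] = None) -> str:
    """Compare an exact pin with the installed version of a package.

    If `installed` is not given, it is looked up in the current environment.
    """
    if installed is None:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            return f"{package} is not installed"
    if installed == pinned:
        return f"{package}=={pinned} already satisfied"
    if release_part(installed) == release_part(pinned):
        # Same release number but a pre-release or vendor build:
        # an exact '==' pin is still not satisfied by it.
        return f"{package} {installed} is a different build of {pinned}"
    return f"{package} {installed} differs from pinned {pinned}"


if __name__ == "__main__":
    # Run inside the container to check the pin from the task requirements.
    print(describe_pin("torch", "1.13.1"))
```

In both of the "not satisfied" cases the pinned wheel would be downloaded on top of the container's own build, which matches the re-download seen in the log.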

  				
Posted 2 years ago by AgitatedDove14
