
Answered

Hi everyone!

So, I'm having a problem with automatic dependency detection when running a task remotely. The problem is that when I import a function from a file in another folder, the task doesn't pick up that file's dependencies.
Given a folder structure:
src:
--__init__.py
--service:
----__init__.py
----code.py
main.ipynb

where code.py:
` import numpy as np
import pandas as pd

def func():
    print(pd.__version__)
    arr = np.array([0])

    print("used numpy array") `

and main.py:

` from src.service.code import func
from clearml import Task

task: Task = Task.init(project_name="Teste", task_name="test")

if __name__ == "__main__":
    func() `

The task only sees the clearml dependency, so when I run this task on a remote agent, it results in an ImportError, because it didn't install pandas (numpy is installed because of clearml).
Is this a bug?
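
For illustration only: if import detection looked at just the entry script, it would see `src` and `clearml` but nothing that `code.py` pulls in. The sketch below runs the standard-library `ast` module over the two snippets above. It is not ClearML's actual detection logic, and the helper name `top_level_imports` is made up for this example.

```python
import ast
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import numpy as np" -> "numpy"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from src.service.code import func" -> "src"
            names.add(node.module.split(".")[0])
    return names


# The two files from the question, reduced to their import lines.
MAIN_SOURCE = """
from src.service.code import func
from clearml import Task
"""

CODE_SOURCE = """
import numpy as np
import pandas as pd
"""

print(sorted(top_level_imports(MAIN_SOURCE)))  # ['clearml', 'src']
print(sorted(top_level_imports(CODE_SOURCE)))  # ['numpy', 'pandas']
```

Looking at the entry script alone, pandas never shows up, which matches the symptom described here.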

  				
Posted 2 years ago by GrotesqueOctopus42

Answers 22

That doesn't seem normal, let me ask around and get back to you

  				
Posted 2 years ago by TimelyMouse69

The workaround of importing pandas and numpy is very limited, because once your code.py imports from other files (a utils.py, for example), you can lose track of the libs pretty quickly.
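
One way to avoid chasing imports by hand is to scan every `.py` file in the repo and list the imported packages that are not local. The following is a rough sketch, not a ClearML feature: `third_party_imports` is a made-up helper, the `utils.py` that imports `requests` is a hypothetical file added for the demo, and anything whose name matches a local file or folder is assumed to be first-party. Import names do not always match pip package names, so the output is a starting point for a requirements.txt, not a finished one.

```python
import ast
import sys
import tempfile
from pathlib import Path
from typing import Set


def third_party_imports(root: Path) -> Set[str]:
    """Collect top-level imported names under root that are neither local nor stdlib."""
    py_files = list(root.rglob("*.py"))
    # Treat every local module or package name as first-party.
    local = {p.stem for p in py_files} | {p.name for p in root.rglob("*") if p.is_dir()}
    # sys.stdlib_module_names exists on Python 3.10+; fall back to an empty set.
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    found = set()
    for path in py_files:
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                found.add(node.module.split(".")[0])
    return found - local - stdlib


# Demo on a throwaway copy of the layout from the question.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "service").mkdir(parents=True)
    (root / "src" / "service" / "code.py").write_text(
        "import numpy as np\nimport pandas as pd\nfrom src.service import utils\n"
    )
    # Hypothetical helper module, only here to show a transitive import.
    (root / "src" / "service" / "utils.py").write_text("import requests\n")
    (root / "main.py").write_text(
        "from src.service.code import func\nfrom clearml import Task\n"
    )
    detected = third_party_imports(root)

print(sorted(detected))  # ['clearml', 'numpy', 'pandas', 'requests']
```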

  				
Posted 2 years ago by GrotesqueOctopus42

I added a requirements.txt file at the same level as main.ipynb, but it still didn't detect the dependency and resulted in an ImportError for pandas.

  				
Posted 2 years ago by GrotesqueOctopus42

When running the task locally:
` # Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]

clearml == 1.8.3 `

And when running on a remote clearml-agent:
` attrs==22.1.0
certifi==2022.12.7
charset-normalizer==2.1.1
clearml==1.8.3
Cython==0.29.32
distlib==0.3.6
filelock==3.8.2
furl==2.1.3
idna==3.4
jsonschema==4.17.3
numpy==1.24.0
orderedmultidict==1.0.1
pathlib2==2.3.7.post1
Pillow==9.3.0
platformdirs==2.6.0
psutil==5.9.4
PyJWT==2.4.0
pyparsing==3.0.9
pyrsistent==0.19.2
python-dateutil==2.8.2
PyYAML==6.0
requests==2.28.1
six==1.16.0
urllib3==1.26.13
virtualenv==20.17.1 `

  				
Posted 2 years ago by GrotesqueOctopus42

If you run it, what does it say in experiment list -> experiments -> execution -> installed packages?

  				
Posted 2 years ago by TimelyMouse69

I tried using Task.force_requirements_env_freeze(requirements_file="requirements.txt") before calling Task.init, but it didn't work: the requirements didn't show up under installed packages. So I added Task.add_requirements("requirements.txt") and it worked fine. I think this is a proper workaround.
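
The order that worked here can be sketched as follows: register the requirements file first, then call `Task.init`. `Task.add_requirements` and `Task.init` are the calls named in this thread; the wrapper `init_task_with_requirements` and the helper `resolve_requirements` are hypothetical names, and the project and task names are copied from the question. The `clearml` import sits inside the function so the file check can be tried without a configured ClearML server.

```python
from pathlib import Path


def resolve_requirements(path: str = "requirements.txt") -> str:
    """Return the absolute path of the requirements file, failing early if it is missing."""
    req = Path(path).resolve()
    if not req.is_file():
        raise FileNotFoundError(f"requirements file not found: {req}")
    return str(req)


def init_task_with_requirements(path: str = "requirements.txt"):
    """Register the requirements file, then create the task (needs clearml configured)."""
    from clearml import Task  # imported lazily; requires `pip install clearml`

    # Called before Task.init so the packages are recorded under "installed packages".
    Task.add_requirements(resolve_requirements(path))
    return Task.init(project_name="Teste", task_name="test")
```

In the notebook, calling `init_task_with_requirements()` would replace the bare `Task.init(...)` line. Failing early on a missing file is a design choice of this sketch: a wrong relative path is easy to hit when the notebook's working directory differs from the repo root.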

  				
Posted 2 years ago by GrotesqueOctopus42

One workaround is adding:
` import pandas as pd
import numpy as np `
to the main file. This way the dependency is properly detected. But idk, it seems like this shouldn't be a problem. Did you guys manage to reproduce the error?

  				
Posted 2 years ago by GrotesqueOctopus42

And this is also remotely?

  				
Posted 2 years ago by TimelyMouse69

yes

  				
Posted 2 years ago by GrotesqueOctopus42

So for notebooks, requirements in other files are indeed not checked.
You can, however, include them by using this line before Task.init:

Task.force_requirements_env_freeze(requirements_file="requirements.txt")

  				
Posted 2 years ago by TimelyMouse69

Well, it seems like you have a solution for now?

If you still want to run it as a notebook, the following should make pip install the required packages:

import sys
!{sys.executable} -m pip install -r requirements.txt

I'll check if this is something we need to update in our documentation or if it's a bug.
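
Outside a notebook cell, where the `!` shell escape is not available, roughly the same install can be done with `subprocess`. A minimal sketch; `pip_install_command` and `install_requirements` are names made up for this example.

```python
import subprocess
import sys
from typing import List


def pip_install_command(requirements_path: str = "requirements.txt") -> List[str]:
    """Build the pip command for the interpreter that is running this code."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_path]


def install_requirements(requirements_path: str = "requirements.txt") -> None:
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(requirements_path))
```

Using `sys.executable` rather than a bare `pip` keeps the install in the same environment as the running kernel or script, which is the point of the `!{sys.executable}` form above.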

  				
Posted 2 years ago by TimelyMouse69

thanks Bart!

  				
Posted 2 years ago by GrotesqueOctopus42

You can fix this by using a requirements.txt or the --packages parameter
https://clear.ml/docs/latest/docs/apps/clearml_task/#package-dependencies

  				
Posted 2 years ago by TimelyMouse69

And pandas is in your requirements.txt?

  				
Posted 2 years ago by TimelyMouse69

yup

  				
Posted 2 years ago by GrotesqueOctopus42

No, I use an SSH key to give the agent access. I am sure the key is working and the agent is cloning the repo properly.

  				
Posted 2 years ago by GrotesqueOctopus42

You can find more info here: https://clear.ml/docs/latest/docs/references/sdk/task#taskforce_requirements_env_freeze

  				
Posted 2 years ago by TimelyMouse69

Did you use --git-credentials?
https://clear.ml/docs/latest/docs/apps/clearml_session#accessing-a-git-repository

  				
Posted 2 years ago by TimelyMouse69

Before asking the agent to run it, I also pushed the code and requirements to git, so the agent should see them.

  				
Posted 2 years ago by GrotesqueOctopus42

Yes, all the files are in the same git repo and on the same branch, including the requirements.txt.

  				
Posted 2 years ago by GrotesqueOctopus42

I found that the original main code can detect the dependencies when run as a .py file. However, when run in a Jupyter notebook (.ipynb), it cannot.

  				
Posted 2 years ago by GrotesqueOctopus42

GrotesqueOctopus42

The problem is that when I import some function from a file in another folder, that task doesn't catch the file's dependencies.

Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo.
Does that make sense?

  				
Posted 2 years ago by AgitatedDove14
