Answered

Hi everyone!

So, I'm having a problem with the automatic dependency detection when running a task remotely. When I import a function from a file in another folder, the task doesn't pick up that file's dependencies.
Given a folder structure:
```
src:
--__init__.py
--service:
----__init__.py
----code.py
main.ipynb
```
where code.py:
```
import numpy as np
import pandas as pd


def func():
    print(pd.__version__)
    arr = np.array([0])

    print("used numpy array")
```
and main.py:
```
from src.service.code import func
from clearml import Task

task: Task = Task.init(project_name="Teste", task_name="test")

if __name__ == "__main__":
    func()
```
The task only sees the clearml dependency, so when I run this task on a remote agent, it results in an ImportError, because it didn't install pandas (numpy is installed because of clearml).
Is this a bug?

  
  
Posted one year ago

Answers 22


The workaround of importing pandas and numpy in the main file is very limited: once your code.py imports from other files (a utils.py, for example), you can lose track of the required libraries pretty quickly.
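To make that concern concrete, here is a minimal sketch of one way to list the third-party imports used anywhere in a repo, so they do not have to be tracked by hand. It is not part of ClearML; the helper name `collect_third_party_imports` is hypothetical, and it relies only on the standard library's `ast` module.

```python
import ast
import sys
from pathlib import Path


def collect_third_party_imports(repo_root):
    """Return sorted top-level module names imported by .py files under
    repo_root that are neither local to the repo nor standard library."""
    root = Path(repo_root)
    py_files = list(root.rglob("*.py"))
    # Names that resolve to code inside the repo: module stems and directories.
    local_names = {p.stem for p in py_files} | {
        p.name for p in root.rglob("*") if p.is_dir()
    }
    # sys.stdlib_module_names exists on Python 3.10+; on older versions
    # standard-library imports are not filtered out.
    stdlib = set(getattr(sys, "stdlib_module_names", ()))

    found = set()
    for path in py_files:
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names = [node.module]  # absolute "from x import y" only
            else:
                continue
            for name in names:
                top = name.split(".")[0]
                if top not in local_names and top not in stdlib:
                    found.add(top)
    return sorted(found)
```

Note that this yields import names, which do not always match the package names pip expects, and it does not read .ipynb files, so the result is a starting point for a requirements.txt rather than a replacement for one.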

  
  
Posted one year ago

You can fix this by using a requirements.txt or the --packages parameter
https://clear.ml/docs/latest/docs/apps/clearml_task/#package-dependencies

  
  
Posted one year ago

When running the task locally:
```
# Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]

clearml == 1.8.3
```
And when running on a remote clearml-agent:
```
attrs==22.1.0
certifi==2022.12.7
charset-normalizer==2.1.1
clearml==1.8.3
Cython==0.29.32
distlib==0.3.6
filelock==3.8.2
furl==2.1.3
idna==3.4
jsonschema==4.17.3
numpy==1.24.0
orderedmultidict==1.0.1
pathlib2==2.3.7.post1
Pillow==9.3.0
platformdirs==2.6.0
psutil==5.9.4
PyJWT==2.4.0
pyparsing==3.0.9
pyrsistent==0.19.2
python-dateutil==2.8.2
PyYAML==6.0
requests==2.28.1
six==1.16.0
urllib3==1.26.13
virtualenv==20.17.1
```

  
  
Posted one year ago

Before asking the agent to run it, I also pushed the code and the requirements to git, so the agent should see them.

  
  
Posted one year ago

If you run it, what does it say in experiment list -> experiments -> execution -> installed packages?

  
  
Posted one year ago

That doesn't seem normal, let me ask around and get back to you

  
  
Posted one year ago

I tried calling `Task.force_requirements_env_freeze(requirements_file="requirements.txt")` before `Task.init`, but it didn't work: the requirements didn't show up under installed packages. So I added `Task.add_requirements("requirements.txt")` instead and it worked fine. I think this is a proper workaround.
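For anyone who prefers to register packages one by one instead of passing the file path, here is a rough sketch of a variant. It assumes `Task.add_requirements(package_name, package_version)` is called before `Task.init`; the helper names `parse_pinned_requirements` and `init_task_with_requirements` are made up for illustration, and only plain `name` or `name==version` lines are handled.

```python
import re
from pathlib import Path

# Matches "name" or "name==version"; anything else is skipped (assumption).
_PIN = re.compile(r"([A-Za-z0-9][A-Za-z0-9_.\-]*)\s*(?:==\s*(\S+))?")


def parse_pinned_requirements(text):
    """Return (name, version) pairs; version is None for unpinned names."""
    pins = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = _PIN.fullmatch(line)
        if match:  # blank lines, pip options and other specifiers are ignored
            pins.append((match.group(1), match.group(2)))
    return pins


def init_task_with_requirements(requirements_path, project_name, task_name):
    # Imported lazily so the parser above works without clearml installed.
    from clearml import Task

    text = Path(requirements_path).read_text(encoding="utf-8")
    for name, version in parse_pinned_requirements(text):
        # Registered before Task.init so the package is recorded on the task.
        Task.add_requirements(name, version)
    return Task.init(project_name=project_name, task_name=task_name)
```

In the notebook this would be called once at the top, e.g. `task = init_task_with_requirements("requirements.txt", "Teste", "test")`.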

  
  
Posted one year ago

thanks Bart!

  
  
Posted one year ago

Yes, all the files are in the same git repo and on the same branch, including the requirements.txt.

  
  
Posted one year ago

And pandas is in your requirements.txt?

  
  
Posted one year ago

And this is also remotely?

  
  
Posted one year ago

yes

  
  
Posted one year ago

GrotesqueOctopus42

> The problem is that when I import some function from a file in another folder, that task doesn't catch the file's dependencies.

Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo.
Make sense?

  
  
Posted one year ago

One workaround is adding `import pandas as pd` and `import numpy as np` to the main file. This way the dependency is properly detected. But I don't know, it seems like this shouldn't be a problem. Did you guys manage to reproduce the error?

  
  
Posted one year ago

No, I use an SSH key to give the agent access. I am sure the key is working and the agent is cloning the repo properly.

  
  
Posted one year ago

yup

  
  
Posted one year ago

So for notebooks, requirements are indeed not checked outside the notebook itself.
You can, however, include them by adding this line before `Task.init`:

`Task.force_requirements_env_freeze(requirements_file="requirements.txt")`

  
  
Posted one year ago

I found that the original main code can detect the dependencies when running in a .py file. However, when running in a jupyter notebook .ipynb, it cannot.

  
  
Posted one year ago

I added a requirements.txt file at the same level as main.ipynb. But it still didn't detect the dependency and resulted in an ImportError for pandas.

  
  
Posted one year ago

Well, it seems like you have a solution for now?

If you still want to run it as a notebook, the following should make pip install the required packages:

`import sys`
`!{sys.executable} -m pip install -r requirements.txt`
I'll check if this is something we need to update in our documentation or if it's a bug.
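The `!` prefix only works inside a notebook cell. If the same install step has to run from a plain .py script, a rough equivalent using `subprocess` could look like this (the function names are hypothetical, and the default file name is taken from the thread):

```python
import subprocess
import sys


def pip_install_command(requirements_file="requirements.txt"):
    """Build the pip command for the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_file]


def install_requirements(requirements_file="requirements.txt"):
    # Raises subprocess.CalledProcessError if pip exits with a non-zero status.
    subprocess.check_call(pip_install_command(requirements_file))
```

Using `sys.executable` rather than a bare `pip` keeps the install in the same environment that the script or kernel is running in.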

  
  
Posted one year ago