The workaround of importing pandas and numpy is very limited, because once your code.py imports from another file (a utils.py, for example), you can quickly lose track of which libraries are needed.
You can fix this by using a requirements.txt or the --packages parameter:
https://clear.ml/docs/latest/docs/apps/clearml_task/#package-dependencies
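If it helps to work out what a requirements.txt needs to cover when code.py pulls in other files, here is a rough stdlib-only sketch that lists the top-level modules imported across a set of files. This is only an illustration, not how ClearML itself detects packages, and the function names are made up for the example. Import names do not always match pip package names (for example, `PIL` is installed as `Pillow`), so treat the output as a starting point.

```python
import ast
import sys
from pathlib import Path


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x); skip those.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def third_party_imports(paths):
    """Collect imports across several files, dropping local and stdlib modules."""
    paths = [Path(p) for p in paths]
    local = {p.stem for p in paths}  # e.g. "utils" for utils.py
    # sys.stdlib_module_names exists on Python 3.10+; on older versions
    # stdlib modules are simply left in the result.
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    found = set()
    for p in paths:
        found |= imported_modules(p.read_text())
    return sorted(found - local - stdlib)
```

For example, `third_party_imports(["code.py", "utils.py"])` would list the libraries imported by either file, which you can then pin in requirements.txt.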
When running the task locally:
`# Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
clearml == 1.8.3`
And when running on a remote clearml-agent:
`attrs==22.1.0
certifi==2022.12.7
charset-normalizer==2.1.1
clearml==1.8.3
Cython==0.29.32
distlib==0.3.6
filelock==3.8.2
furl==2.1.3
idna==3.4
jsonschema==4.17.3
numpy==1.24.0
orderedmultidict==1.0.1
pathlib2==2.3.7.post1
Pillow==9.3.0
platformdirs==2.6.0
psutil==5.9.4
PyJWT==2.4.0
pyparsing==3.0.9
pyrsistent==0.19.2
python-dateutil==2.8.2
PyYAML==6.0
requests==2.28.1
six==1.16.0
urllib3==1.26.13
virtualenv==20.17.1 `
Before asking the agent to run it, I also pushed the code and requirements to git, so the agent should see them.
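To check a pasted "installed packages" list like the ones quoted above for a missing library, something along these lines may help. It is a minimal sketch with hypothetical helper names; the `remote` string is just an excerpt of the agent's list from this thread.

```python
import re


def package_names(installed_text):
    """Extract lower-cased package names from pip-style 'name==version' lines."""
    names = set()
    for line in installed_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name is everything before the first version operator or space.
        name = re.split(r"[\s=<>!~;\[]", line, maxsplit=1)[0]
        names.add(name.lower())
    return names


def missing_packages(installed_text, expected):
    """Return the expected packages that do not appear in the installed list."""
    installed = package_names(installed_text)
    return sorted(p for p in expected if p.lower() not in installed)


# Excerpt of the remote agent's list quoted in the thread.
remote = """
clearml==1.8.3
Cython==0.29.32
numpy==1.24.0
Pillow==9.3.0
"""

print(missing_packages(remote, ["pandas", "numpy"]))  # prints ['pandas']
```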
If you run it, what does it say in experiment list -> experiments -> execution -> installed packages?
Did you use --git-credentials?
https://clear.ml/docs/latest/docs/apps/clearml_session#accessing-a-git-repository
That doesn't seem normal, let me ask around and get back to you
I tried calling Task.force_requirements_env_freeze(requirements_file="requirements.txt") before Task.init, but it didn't work: the requirements didn't show up under installed packages. So I added Task.add_requirements("requirements.txt") instead, and it worked fine. I think this is a proper workaround.
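For anyone landing here later, the workaround described above could be wrapped roughly like this. It is a sketch, assuming `clearml` is installed and configured; the project and task names are placeholders, and the helper name is hypothetical. The only ClearML calls used are `Task.add_requirements` and `Task.init`, in that order.

```python
from pathlib import Path


def init_task_with_requirements(requirements_path="requirements.txt",
                                project_name="examples",    # placeholder
                                task_name="notebook run"):  # placeholder
    """Register requirements.txt with ClearML, then create the task."""
    path = Path(requirements_path)
    if not path.is_file():
        raise FileNotFoundError(f"requirements file not found: {path}")

    # Imported lazily so the file check above runs even without clearml.
    from clearml import Task

    # Must be called before Task.init, as described in the thread.
    Task.add_requirements(str(path))
    return Task.init(project_name=project_name, task_name=task_name)
```

Checking for the file first gives a clear error locally instead of a missing-package failure later on the agent.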
Yes, all the files are in the same git repo and on the same branch, including the requirements.txt.
GrotesqueOctopus42
The problem is that when I import a function from a file in another folder, the task doesn't pick up that file's dependencies.
Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo.
Does that make sense?
One workaround is adding `import pandas as pd` and `import numpy as np` to the main file. This way the dependency is properly detected. But I don't know, it seems like this shouldn't be a problem. Did you guys manage to reproduce the error?
No, I use an SSH key to give the agent access. I am sure the key is working and the agent is cloning properly.
So for notebooks, requirements are indeed not checked elsewhere.
You can, however, include them by adding this line before Task.init:
Task.force_requirements_env_freeze(requirements_file="requirements.txt")
I found that the original main code can detect the dependencies when running as a .py file. However, when running in a Jupyter notebook (.ipynb), it cannot.
I added a requirements.txt file at the same level as main.ipynb, but it still didn't detect the dependency and resulted in an ImportError for pandas.
You can find more info here: https://clear.ml/docs/latest/docs/references/sdk/task#taskforce_requirements_env_freeze
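A small sketch of how that call might be used, assuming `clearml` is installed and a server is configured. The wrapper name is hypothetical. Note that one user in this thread reported that this did not populate the installed packages in their notebook setup, so it may be worth verifying the result in the UI.

```python
import os


def freeze_requirements(requirements_file=None):
    """Ask ClearML to record a full environment instead of analysing imports.

    With requirements_file=None the active environment is frozen;
    otherwise the given requirements file is used.
    """
    if requirements_file is not None and not os.path.isfile(requirements_file):
        raise FileNotFoundError(requirements_file)

    # Imported lazily so the file check above runs even without clearml.
    from clearml import Task

    Task.force_requirements_env_freeze(force=True, requirements_file=requirements_file)


# Usage, before Task.init (placeholder project and task names):
# from clearml import Task
# freeze_requirements("requirements.txt")
# task = Task.init(project_name="examples", task_name="notebook run")
```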
Well, it seems like you have a solution for now?
If you still want to run it as a notebook, the following should make pip install the required packages:
`import sys
!{sys.executable} -m pip install -r requirements.txt`
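The `!` syntax only works inside a notebook cell. If the same step is needed from a plain .py script, a rough equivalent using the standard library could look like this (the function names are just for illustration):

```python
import subprocess
import sys


def pip_install_command(requirements_file="requirements.txt"):
    """Build the pip command for the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_file]


def install_requirements(requirements_file="requirements.txt"):
    """Run pip; raises subprocess.CalledProcessError if installation fails."""
    subprocess.check_call(pip_install_command(requirements_file))
```

Using `sys.executable` keeps the install tied to the same interpreter (and virtualenv) that is running the code, which is the same reason the notebook line uses it.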
I'll check whether this is something we need to update in our documentation or whether it's a bug.