Answered
Has Anyone Got Any Experience With C++ Extensions In Python When Using Clearml? In Our Setup.Py We Have:

Has anyone got any experience with C++ extensions in Python when using ClearML? In our setup.py we have:
ext_modules=[
    Extension(
        "file_io.extio",
        sources=["/file_io/extio.cpp"],
        depends=["/file_io/samples.h"],
        define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_9_API_VERSION")],
        extra_compile_args=["-std=c++11"],
        libraries=["rt"] if platform.system() == "Linux" else [],
        include_dirs=[GetNumpyIncludeDirectoryLazy()],
        optional=True,
    ),
    Extension(
        "merge_data.merge_data",
        sources=["/merge_data/merge_data.cpp"],
        depends=["/merge_data/mnist_parser.h"],
        extra_compile_args=["-std=c++11"],
        libraries=["rt"] if platform.system() == "Linux" else [],
        optional=True,
    ),
],

When installing the package within a docker container and running the training script to register a task, this all works fine. However, when running the task using a ClearML agent in docker mode, the C++ module fails to import. Does anyone have any insight? The required C++ compilers seem to be installed on the docker container.

  
  
Posted 2 years ago

Answers 19


this is the installation for a locally used package in the task fyi, so it's imported from the training script

  
  
Posted 2 years ago

Hi NaughtyFish36

c++ module fails to import, anyone have any insight? required c++ compilers seem to be installed on the docker container.

Can you provide log for the failed Task?
BTW: if you need build-essential you can add it as the Task startup script
apt-get install build-essential

  
  
Posted 2 years ago

AgitatedDove14 Yeah, I added it into the initial bash script to test whether that would fix the issue. The task is created using the SDK in the model training script, i.e. Task.init(). I was under the impression the local package would be installed by replicating the environment I initialised the task under. However, I've tried the add_requirements("leap") function and just seem to be getting an "isadirectory" error? I also tried manually adding leap==0.4.1 in the task UI, which didn't work. The environment in the logs does show that leap is being installed potentially from a cache? - leap @ file:///opt/keras-hannd

  
  
Posted 2 years ago

AgitatedDove14 fyi I do install build-essential manually in the logs I just sent you, and it still fails

  
  
Posted 2 years ago

Unfortunately, installing build-essential at startup didn't work

  
  
Posted 2 years ago

AgitatedDove14 Unfortunately that didn't work either, I agree that should run the setup.py correctly but something still seems to be breaking, I've sent you the most recent logs

  
  
Posted 2 years ago

So could it be that pip install --no-deps . is the missing issue ?
what happens if you add to the installed packages "/opt/keras-hannd" ?

  
  
Posted 2 years ago

I think it is to do with the build-essential issue. Let me talk you through the process:
1. Run a docker image locally called keras-hannd-cml (i.e. the one that is then being used by the agent as the base image later on).
2. Run the training script to register the task, which works fine. All dependencies work, i.e. the C++ packages are working correctly on that container.
3. Execute the task on an agent running in docker mode with the same image that the task was registered with, i.e. keras-hannd-cml.
4. The task fails since it's missing the C++ module somehow.
I've sent you the most recent logs. Can you see anything incorrect with the above work process?

  
  
Posted 2 years ago

We have
ext_modules=[
    Extension(
        'leap.learn.data_tools.file_io.extio',
        sources=['leap/learn/data_tools/file_io/extio.cpp'],
        depends=['leap/learn/data_tools/file_io/samples.h'],
        define_macros=[('NPY_NO_DEPRECATED_API', 'NPY_1_9_API_VERSION')],
        extra_compile_args=['-std=c++11'],
        libraries=['rt'] if platform.system() == 'Linux' else [],
        include_dirs=[GetNumpyIncludeDirectoryLazy()],
        optional=True
    ),
in our setup.py, which I believe isn't being built correctly when the task is running on the agent.

Manually, I was installing the leap package through python -m pip install . when building the docker container. My thinking was that when the task's environment was then replicated on the agent, the leap package would be installed correctly through its setup.py with the Extension which I've listed above
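One way to test that belief is to ask Python where it would load each extension from, both in the container and inside the agent's environment. Note that with optional=True a failed compile is skipped rather than aborting the build, so a successful install does not prove the extensions were built. The following is only a minimal sketch: the module names are taken from the setup.py excerpt and the error message in this thread, and the helper name locate_modules is made up for illustration.

```python
import importlib.util

# Dotted names taken from the setup.py excerpt and the import error in this thread.
EXTENSION_MODULES = [
    "leap.learn.data_tools.file_io.extio",
    "leap.learn.data_tools.merge_data.merge_data",
]


def locate_modules(names):
    """Map each dotted module name to the file it would load from, or None."""
    found = {}
    for name in names:
        try:
            # find_spec imports the parent packages, so a missing parent
            # raises ModuleNotFoundError (a subclass of ImportError).
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        found[name] = spec.origin if spec is not None else None
    return found


if __name__ == "__main__":
    for name, origin in locate_modules(EXTENSION_MODULES).items():
        print(f"{name}: {origin or 'NOT FOUND'}")
```

Running it once with the container's system Python and once with the interpreter the agent creates should show whether the compiled files are missing only in the agent's environment.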

  
  
Posted 2 years ago

AgitatedDove14 I've DM'd you the log file for the failed task. I have tried using a task startup script to install g++, gcc etc. but it didn't seem to work; I'll try build-essential too. I'm also interested in the way that the environments are set up in ClearML. I read in the docs that the task looks for a requirements.txt file to construct the env, but does this prevent a local package being built correctly, i.e. through setup.py, when running a remote task?

  
  
Posted 2 years ago

So I see this in the build, which means it works and compiles. What is missing?
` Building wheels for collected packages: leap
Building wheel for leap (setup.py) ... done
Created wheel for leap: filename=leap-0.4.1-cp38-cp38-linux_x86_64.whl size=1052746 sha256=1dcffa8da97522b2611f7b3e18ef4847f8938610180132a75fd9369f7cbcf0b6
Stored in directory: /root/.cache/pip/wheels/b4/0c/2c/37102da47f10c22620075914c8bb4a9a2b1f858263021ca437
Successfully built leap
Installing collected packages: leap
Attempting uninstall: leap
Found existing installation: leap 0.4.1
Not uninstalling leap at /usr/local/lib/python3.8/dist-packages, outside environment /root/.clearml/venvs-builds/3.8
Can't uninstall 'leap'. No files were found to uninstall.
Successfully installed leap-0.4.1 `
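
The log above mentions two locations: an existing installation under /usr/local/lib/python3.8/dist-packages and the agent's environment under /root/.clearml/venvs-builds/3.8. A rough way to compare them is to list every copy of the package visible on sys.path together with the compiled extension files each copy contains. This is only a sketch using the standard library; the helper name find_package_copies is hypothetical, and it assumes the package is installed as a plain directory rather than a zip or egg.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def find_package_copies(package, search_paths=None):
    """Return {package_dir: [compiled extension files]} for each copy found."""
    copies = {}
    for entry in (sys.path if search_paths is None else search_paths):
        # An empty sys.path entry means the current working directory.
        pkg_dir = Path(entry or ".") / package
        if not pkg_dir.is_dir():
            continue
        compiled = sorted(
            str(p.relative_to(pkg_dir))
            for p in pkg_dir.rglob("*")
            if p.is_file() and any(p.name.endswith(s) for s in EXTENSION_SUFFIXES)
        )
        copies[str(pkg_dir)] = compiled
    return copies


if __name__ == "__main__":
    for location, compiled in find_package_copies("leap").items():
        print(location, "->", compiled or "no compiled extensions")
```

If one copy lists the compiled files and the other does not, that would point at the build inside the agent's environment rather than at the image itself.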

  
  
Posted 2 years ago

and it's clearml version 1.7.2

  
  
Posted 2 years ago

Manually I was installing the leap package through python -m pip install . when building the docker container.

NaughtyFish36 what happens if you add /opt/keras-hannd to your "installed packages"? This should translate to "pip install /opt/keras-hannd", which seems like exactly what you want, no?

  
  
Posted 2 years ago

AgitatedDove14 Sorry, what .so are you referring to here? I can't see that in the logs. The docker image installs the package by first installing requirements, i.e. RUN pip install --no-cache-dir -r /tmp/requirements.txt, then the repo is copied locally, and then leap is installed through RUN cd /opt/keras-hannd && pip install --no-deps .

  
  
Posted 2 years ago

containing the Extension module

Not sure I follow, what is the Extension module ? what were you running manually that is not just pip install /opt/keras-hannd ?

  
  
Posted 2 years ago

AgitatedDove14 The issue seems to be that the setup.py containing the Extension module we need isn't being run in the clearml virtual environment within the docker container. What is the correct process for installing local packages so they're replicated correctly when running remotely on an agent?

  
  
Posted 2 years ago

NaughtyFish36

No module named 'leap.learn.data_tools.merge_data.merge_data'

This seems to be the error, but I cannot see leap in the installed packages. Notice that if the Task has an "Installed Packages" section, the agent will use that and not the "requirements.txt". Only if this section is empty will it revert to the "requirements.txt" in the repo.
How did you create the Task in the first place?
I see that you added "leap" to the initial bash script; you should instead add it to the requirements with
Task.add_requirements("leap")
task = Task.init(...)
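
For reference, a minimal sketch of that ordering, assuming the clearml package is installed where the training script runs. Task.add_requirements has to be called before Task.init for the entry to end up in "Installed Packages". The wrapper function and the project/task names are hypothetical and only there for illustration.

```python
def init_task_with_requirements(project_name, task_name, packages):
    """Record extra packages for the task, then create the task itself."""
    # Imported lazily so this sketch can be loaded where clearml is absent.
    from clearml import Task

    # Requirements must be registered before Task.init is called.
    for package in packages:
        Task.add_requirements(package)
    return Task.init(project_name=project_name, task_name=task_name)


# Hypothetical usage at the top of the training script:
# task = init_task_with_requirements("my-project", "train-model", ["leap"])
```

Whether a bare package name is enough for a locally built package like leap is exactly what is in question in this thread, so treat this as the documented call order rather than a confirmed fix.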

  
  
Posted 2 years ago

The point is, "leap" is properly installed; this is the main issue. And although it is installed, it is missing the ".so"? What am I missing? What are you doing manually that does not show in the log?
In other words, how did you install it "manually" inside the docker when you mentioned it worked for you when running without the agent?

  
  
Posted 2 years ago

function and just seem to be getting an "isadirectory" error?

Can you post here what you are getting? Which clearml version are you using?

also tried manually adding leap==0.4.1 in the task UI which didn't work.

That has to work, if it did not, can you send the log for the failed Task (or the Task that did not install it)?

The environment in the logs does show that leap is being installed potentially from a cache?

This is true. I have double-checked your logs and you are correct, it seems to be installed.
So I do not get how come you get ModuleNotFoundError: No module named 'leap.learn.data_tools.merge_data.merge_data'

Could it be that you are installing the wrong version, or maybe the wrong package?
Is this the leap you need? Where do you install it from?

Lastly, does this still relate to the "build-essential" issue? It seems that we are talking about a whole different issue.

  
  
Posted 2 years ago