Hi Again, I Am Trying To Execute A Pipeline Remotely, However I Am Running Into A Problem With The Steps That Require A Local Package. Basically I Have A Repo, That I Created Specifically For This Pipeline And I Have Packaged It So That I Can Split It I

Hi again,

I am trying to execute a pipeline remotely, however I am running into a problem with the steps that require a local package.

Basically I have a repo, that I created specifically for this pipeline and I have packaged it so that I can split it into different modules. Below is my directory structure:

Name
 .
├──  models
├──  src
│ ├──  ap_pipeline
│ │ ├──  scripts
│ │ ├──  __init__.py
│ │ ├──  base_transformer.py
│ │ ├──  constants.py
│ │ ├──  features.py
│ │ └──  filters.py
├──  tests
│ ├──  integration
│ │ └──  test_ap_pipeline_integration.py
│ └──  unit
│   └──  test_ap_pipeline.py
├──  CHANGELOG.md
├──  clearml_pipeline.py
├──  docker-compose.dev.yaml
├──  Dockerfile
├──  Dockerfile.dev
├──  Makefile
├──  pyproject.toml
├──  pytest.ini
├──  README.md
├──  requirements.txt
└──  setup.cfg

The script for the pipeline, clearml_pipeline.py, is in my root directory, but it references functions found in the ap_pipeline directory.

Usually I use an editable install using pip install -e . in my virtual environment and that allows it to work locally. The problem arises when trying to run remotely since clearml does not install the current repo. One way to deal with this is to have all the functions called in the pipeline and the pipeline itself in one file, but that seems like an antipattern.

How would you usually tackle this?

  
  
Posted 10 months ago

Answers 13


The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.

Add your private repo to the extra index section in the clearml.conf:
None

  
  
Posted 10 months ago

Hi @<1523701205467926528:profile|AgitatedDove14> ,
Thank you for your prompt response.

I am using the functional pipeline API to create the steps, where each step calls a function. My functions are stored in the files under the ap_pipeline directory (filters.py, features.py, etc.).

These are packaged as part of this repo.

The modules are imported inside of the clearml_pipeline.py so it would look something like:

from ap_pipeline.features import func1, func2 ....
This works locally because ap_pipeline is installed with pip install -e . which installs the repo in editable mode.

The pipeline installs all the dependencies except the ap_pipeline (this repo) which then causes the pipeline to fail as it will say that it can't find the module ap_pipeline
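
To make the failure mode concrete, here is a minimal sketch using only the standard library. It assumes a throwaway package named demo_ap_pipeline (a hypothetical stand-in for ap_pipeline) in a src/ layout like the one above, and it only approximates what an editable install does by putting src/ on sys.path. It does not involve ClearML at all.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def module_is_importable(name: str) -> bool:
    """Return True if `name` can be resolved on the current sys.path."""
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


# Build a throwaway copy of the src/ layout from the question.
# "demo_ap_pipeline" is a hypothetical stand-in for ap_pipeline.
root = Path(tempfile.mkdtemp())
pkg = root / "src" / "demo_ap_pipeline"
pkg.mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "features.py").write_text("def func1():\n    return 'ok'\n")

# Without an install, the interpreter cannot find the package.
before = module_is_importable("demo_ap_pipeline")

# Roughly what an editable install arranges: src/ becomes visible.
sys.path.insert(0, str(root / "src"))
after = module_is_importable("demo_ap_pipeline")

func1 = importlib.import_module("demo_ap_pipeline.features").func1
print(before, after, func1())
```

The remote worker is in the "before" state: the repo's dependencies are installed, but nothing makes the repo's own package visible to the interpreter.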

  
  
Posted 10 months ago

The extra_index_url is not even showing up.

  
  
Posted 10 months ago

The agent does not auto-refresh its configuration. After changing the conf file you need to restart the agent; it should then print the new configuration when it loads.

  
  
Posted 10 months ago

Thanks @<1523701205467926528:profile|AgitatedDove14> restarting the agents did the trick!

  
  
Posted 10 months ago

I set up my local laptop as an agent for testing purposes. I run the code on my laptop, it gets sent to the server, and the server sends it back to my laptop. So the conf file is technically on the worker, right?

  
  
Posted 10 months ago

Ohh, thanks! Will give it a shot now!

  
  
Posted 10 months ago

@<1523701205467926528:profile|AgitatedDove14> So I was able to get it to pull the package by defining packages= None

The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.

I tried getting around it by defining the environment variable PIP_INDEX_URL and passing it using log_os_environments in the clearml.conf, and I am now getting this message:

md-ap-feature-engineering/.venv/lib/python3.11/site-packages/clearml_agent/external/requirements_parser/parser.py:49: UserWarning: Private repos not supported. Skipping.
  warnings.warn('Private repos not supported. Skipping.')
  
  
Posted 10 months ago

I added the following to the clearml.conf file

The conf file that is on the worker machine?

  
  
Posted 10 months ago

How do you handle private repos in clearml for packages?

  
  
Posted 10 months ago

I would just add git+ None to your requirements (either in the requirements.txt or even better as part of the pipeline/component where you also specify the repo to be used)
The agent will automatically push the credentials when it installs the repo as a wheel.
wdyt?
btw: you might also get away with adding -e . into the requirements.txt (but you will need to test that one)
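
A minimal sketch of what such a requirement line could look like, assuming a PEP 508 direct reference to a git repository. The package name and URL below are hypothetical placeholders (the actual link is not shown above), and whether the agent injects credentials for this exact form is something to verify on your setup.

```python
def git_requirement(package: str, repo_url: str, ref: str = "main") -> str:
    """Build a PEP 508 direct-reference requirement pointing at a git repo."""
    if not repo_url.startswith(("https://", "ssh://")):
        raise ValueError("expected an https:// or ssh:// repository URL")
    return f"{package} @ git+{repo_url}@{ref}"


# Hypothetical package name and host, for illustration only.
requirement = git_requirement(
    "my-private-dep",
    "https://git.example.com/org/my-private-dep.git",
    ref="v1.0.0",
)
print(requirement)
```

The resulting string can go on its own line in requirements.txt, or into the list of packages given to the pipeline component.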

  
  
Posted 10 months ago

Hi @<1523701168822292480:profile|ExuberantBat52>

I am trying to execute a pipeline remotely,

How are you creating your pipeline? And are you referring to an issue with the pipeline logic, or is it a component that needs that repo installed?

  
  
Posted 10 months ago

I added the following to the clearml.conf file

agent {
    package_manager: {
        # supported options: pip, conda, poetry
        type: pip,
        extra_index_url: ["my_url"],
    },
}

For some reason the changes were not reflected. Here are the logs from the agent:

agent.package_manager.type = pip
agent.package_manager.pip_version.0 = <20.2 ; python_version < '3.10'
agent.package_manager.pip_version.1 = <22.3 ; python_version >= '3.10'
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
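
One way to check whether a setting was picked up is to scan this startup dump for the key. The sketch below is a plain-text parser, not a ClearML API. It assumes the "key = value" format shown in the logs above, and that list values appear with numeric suffixes the way conda_channels.0 does.

```python
def parse_agent_config_dump(text: str) -> dict[str, str]:
    """Parse 'agent.some.key = value' lines into a flat dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep and key.strip().startswith("agent."):
            settings[key.strip()] = value.strip()
    return settings


def has_setting(settings: dict[str, str], prefix: str) -> bool:
    """True if the key itself, or any indexed child such as prefix.0, is present."""
    return any(k == prefix or k.startswith(prefix + ".") for k in settings)


# A few lines copied from the dump above.
dump = """\
agent.package_manager.type = pip
agent.package_manager.system_site_packages = false
agent.package_manager.conda_channels.0 = pytorch
"""

settings = parse_agent_config_dump(dump)
found = has_setting(settings, "agent.package_manager.extra_index_url")
print(found)
```

Here `found` is False, matching the dump above, where no extra_index_url entry appears.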
  
  
Posted 10 months ago