Answered
Hi, I have an issue when running a pipeline controller remotely in Docker. Basically I have a module that reads a config file into a dict and calls the pipeline controller

Hi, I have an issue when running a pipeline controller remotely in Docker. Basically I have a module that reads a config file into a dict and calls the pipeline controller, like python -m my_pipeline --config ./config.yml. The pipeline controller then passes the dict config to the other pipeline components. If I set start_controller_locally=True, everything works fine: the steps run in the Docker container on the remote machine with the correct config. However, if I set start_controller_locally=False, the pipeline fails because it runs python -m my_pipeline --config ./config.yml instead of just the controller function, and tries to read ./config.yml, which is not available in the Docker container. Is that the correct behavior? I would expect it to run only the controller function with the dict config, as happens with the components.
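For concreteness, assume a minimal config file along these lines (its contents are invented here for illustration; the original post doesn't show them):

# ./config.yml -- contents assumed
dataset: my-dataset
epochs: 10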

  
  
Posted 10 months ago

Answers 11


For instance, I have this in my_pipeline/__main__.py:

import yaml
import argparse
from my_pipeline.pipeline import run_pipeline

parser = argparse.ArgumentParser()
parser.add_argument('--config', type=str, required=True)

if __name__ == '__main__':
    args = parser.parse_args()
    # Read the YAML file into a dict and hand it to the pipeline controller
    with open(args.config) as f:
        config = yaml.load(f, yaml.FullLoader)
    run_pipeline(config)

and in my_pipeline/pipeline.py:

from typing import Dict

from clearml import PipelineDecorator


@PipelineDecorator.pipeline(
    name='Main',
    project=None,
    default_queue='default',
    pipeline_execution_queue='default',
    start_controller_locally=False,
    repo='',  # the repo URL was stripped from the original post
    add_run_number=False)
def run_pipeline(config: Dict):
    print(config)

I'm running this on an agent in Docker mode.
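For reference, the agent was started with something along these lines (exact flags assumed, not from the original post):

clearml-agent daemon --queue default --docker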

  
  
Posted 10 months ago

Hi @<1570220858075516928:profile|SlipperySheep79> , I think it depends on your code. Can you provide a self contained code snippet that reproduces this?

  
  
Posted 10 months ago

Basically, I think the pipeline run starts from __main__ and not from the pipeline function, which causes the file to be read.

  
  
Posted 10 months ago

Hi @<1570220858075516928:profile|SlipperySheep79> ! What happens if you do this:

import yaml
import argparse
from my_pipeline.pipeline import run_pipeline
from clearml import Task

parser = argparse.ArgumentParser()
parser.add_argument('--config', type=str, required=True)

if __name__ == '__main__':
    if not Task.current_task():
        args = parser.parse_args()
        with open(args.config) as f:
            config = yaml.load(f, yaml.FullLoader)
    run_pipeline(config)
  
  
Posted 10 months ago

Hi @<1523701087100473344:profile|SuccessfulKoala55>, I think the issue is where to put the connect_configuration call. I can't put it inside run_pipeline because that only runs remotely and doesn't have access to the file, and I can't put it in the script before the call to run_pipeline since the task has not been initialized yet at that point.

  
  
Posted 10 months ago

Also: what's the purpose of storing the pipeline arguments as artifacts, then? When it runs remotely it still uses the main script as the entrypoint rather than calling the pipeline function directly, so all the arguments will be replaced by whatever is passed to the function during the remote execution, right?

  
  
Posted 10 months ago

Hi @<1523701435869433856:profile|SmugDolphin23>, I just tried it, but Task.current_task() returns None even when running remotely.

  
  
Posted 10 months ago

@<1523701435869433856:profile|SmugDolphin23> then the issue is that config is not set. I also tried this:

import yaml
import argparse
from my_pipeline.pipeline import run_pipeline
from clearml import Task

parser = argparse.ArgumentParser()
parser.add_argument('--config', type=str, required=True)

if __name__ == '__main__':
    if Task.running_locally():
        args = parser.parse_args()
        with open(args.config) as f:
            config = yaml.load(f, yaml.FullLoader)
    else:
        config = None
    run_pipeline(config)

But then it prints None, so the pipeline parameters are completely ignored.

  
  
Posted 10 months ago

@<1570220858075516928:profile|SlipperySheep79> depending on a local file is always an issue - I would try to connect a configuration based on this file, so that it will be loaded when running locally and then retrieved from the backend when running remotely.
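For reference, this is how connect_configuration round-trips a file in a plain (non-pipeline) script. A minimal sketch with made-up project/task names; the open question in this thread is where to fit the call when the decorator creates the task itself:

import yaml
from clearml import Task

# Plain script: the task exists up front, so the file can be connected
# before anything runs remotely.
task = Task.init(project_name='examples', task_name='config-roundtrip')

# Running locally: uploads the contents of ./config.yml to the backend
# and returns the original path.
# Running remotely: fetches the stored copy and returns a local path to it.
config_path = task.connect_configuration('./config.yml', name='config')

with open(config_path) as f:
    config = yaml.load(f, yaml.FullLoader)
print(config)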

  
  
Posted 10 months ago

I've uploaded an example here for simplicity: None

  
  
Posted 10 months ago

How about if Task.running_locally():?

  
  
Posted 10 months ago