Answered
Hi All, I Currently Have A Pipeline With Multiple Steps Using The Functional API

Hi all,

I currently have a pipeline with multiple steps built with the functional API (add_function_step). I would like one of the steps to run inside a Docker container, because the container has some non-Python applications that I need to run (using Python subprocess).

    pipe.add_function_step(
        name="align_sequences",
        function=pipeline_get_alignments,
        function_kwargs={
            "X_train": "${map_sequences.X_train}",
            "X_test": "${map_sequences.X_test}",
        },
        function_return=["X_train", "X_test"],
        cache_executed_step=True,
        docker="aws_id.dkr.ecr.eu-central-1.amazonaws.com/my_docker_image:latest",
    )

What I want is to run the code for this step inside the container. I can see in the web UI that ClearML does record that I want to use Docker, as shown in the attached image.

The code breaks when trying to execute the step that requires the tool to be installed on the machine, which means that it's not executing the code from within the container.

I am trying to run the pipeline locally. Could that be the problem? Am I missing something?
image

  
  
Posted 7 months ago

Answers 4


Thank you so much for your reply, will give that a shot!

  
  
Posted 7 months ago

Hi again @SmugDolphin23,

I was able to run the pipeline remotely on an agent, but I am still facing the same problem: the code breaks on the exact same step that requires the Docker container. Is there a way to debug what is happening? Currently nothing in the logs indicates that the code is running in the Docker container. Here are the Docker-related logs:

agent.docker_pip_cache = /home/amerii/.clearml/pip-cache
agent.docker_apt_cache = /home/amerii/.clearml/apt-cache.1
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
docker_cmd = 084736541379.dkr.ecr.eu-central-1.amazonaws.com/ap_pipeline:latest
entry_point = pipeline_get_alignments.py
working_dir = .

Here is my pipeline function step:

    pipe.add_function_step(
        name="align_sequences",
        function=pipeline_get_alignments,
        function_kwargs={
            "X_train": "${map_sequences.X_train}",
            "X_test": "${map_sequences.X_test}",
        },
        function_return=["X_train", "X_test"],
        cache_executed_step=True,
        tags=["intaRNA"],
        docker="aws_account_id.dkr.ecr.eu-central-1.amazonaws.com/ap_pipeline:latest",
        docker_bash_setup_script="./docker_setup_script.sh",
        packages=packages,
        execution_queue=QUEUE,
    )

What are some steps I can take to debug what is happening?

  
  
Posted 7 months ago

Never mind, I figured out the problem. I needed to specify the --docker flag when running the clearml-agent.
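
For anyone debugging the same thing, here is a minimal sketch of a check that could be called at the top of the step function to see where the code is actually running. It only uses the Python standard library. The helper name `environment_report` and the executable name `IntaRNA` are assumptions for illustration, and the `/.dockerenv` check is a common heuristic rather than a guarantee.

```python
import os
import shutil


def environment_report(tool_name):
    """Return a small dict describing where the current code is running.

    `tool_name` is the executable the step needs (hypothetical example: "IntaRNA").
    """
    return {
        # Docker normally creates /.dockerenv inside containers; heuristic only.
        "in_docker": os.path.exists("/.dockerenv"),
        # Full path to the executable if it is on PATH, otherwise None.
        "tool_path": shutil.which(tool_name),
    }


if __name__ == "__main__":
    # Print the report so it shows up in the task console log.
    print(environment_report("IntaRNA"))
```

If `tool_path` comes back as `None`, the step is most likely running outside the image that has the tool installed.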

  
  
Posted 7 months ago

Hi @ExuberantBat52! During local runs, tasks are not run inside the specified Docker container. You need to run your steps remotely. To do this, first create a queue, then run a clearml-agent instance bound to that queue. You also need to specify the queue in add_function_step. Note that the controller can still be run locally if you wish.
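
A rough sketch of that setup, assuming a queue named `docker_queue` already exists and an agent is serving it in Docker mode (for example `clearml-agent daemon --queue docker_queue --docker`). The queue name, project name, pipeline name, and the placeholder step body are all made up for illustration, and running it requires a configured ClearML server.

```python
def pipeline_get_alignments(X_train, X_test):
    # Placeholder step body (assumption); the real step would call the
    # non-Python tool via subprocess inside the container.
    return X_train, X_test


def build_pipeline(queue_name="docker_queue"):
    # Imported here so the sketch can be loaded without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(
        name="alignment_pipeline",  # hypothetical name
        project="examples",         # hypothetical project
        version="1.0.0",
    )
    pipe.add_function_step(
        name="align_sequences",
        function=pipeline_get_alignments,
        function_kwargs={"X_train": [1, 2], "X_test": [3]},  # dummy inputs
        function_return=["X_train", "X_test"],
        cache_executed_step=True,
        docker="aws_id.dkr.ecr.eu-central-1.amazonaws.com/my_docker_image:latest",
        execution_queue=queue_name,  # the step is enqueued for the agent
    )
    return pipe


# Usage (not executed here): the controller runs locally while the steps
# are sent to the queue and picked up by the agent.
#   pipe = build_pipeline()
#   pipe.start_locally(run_pipeline_steps_locally=False)
```

With `run_pipeline_steps_locally=False` only the controller logic runs on your machine; the step itself runs wherever the agent runs it.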

  
  
Posted 7 months ago
509 Views