
What is the set_base_docker() equivalent for PipelineController?

  
  
Posted 3 years ago

Answers 19


I launch everything in docker mode, and since it builds an image on every run, it builds the default nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 image, which incurs heavy overhead. What if I want to give it my custom lightweight image instead, the same way I do for all individual tasks?

  
  
Posted 3 years ago

PipelineController._task.set_base_docker ??

  
  
Posted 3 years ago

MelancholyElk85 Hi!
You can use a custom Docker image if it's on Docker Hub to reduce overhead. Regarding the set_base_docker() equivalent for PipelineController, let me check

  
  
Posted 3 years ago

MelancholyElk85, did PipelineController._task.set_base_docker work?

  
  
Posted 3 years ago

Of course, I use custom images all the time, the question was how to do it for a pipeline 😆 Setting private attributes directly doesn't look like good practice

  
  
Posted 3 years ago

MelancholyElk85 , fair point 🙂

How do you initialize your tasks?

  
  
Posted 3 years ago

MelancholyElk85 if you're using add_function_step() it has a 'docker' parameter. You can read more here:
https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller#add_function_step
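
As a rough illustration of the 'docker' parameter mentioned above, here is a minimal sketch, assuming a clearml version that provides add_function_step() (1.1 or later). The image name `python:3.9-slim`, the `preprocess` function and the `function_step_kwargs` helper are made up for this example; `build_pipeline()` needs the clearml package and a configured server, so it is defined but not called.

```python
from typing import Any, Dict

# Hypothetical image name, used only for illustration.
CUSTOM_IMAGE = "python:3.9-slim"


def preprocess(rows: int = 10) -> int:
    """Toy step body; a real step would do actual work."""
    return rows * 2


def function_step_kwargs(name: str, docker_image: str) -> Dict[str, Any]:
    """Collect keyword arguments for PipelineController.add_function_step()."""
    return {
        "name": name,
        "function": preprocess,
        "function_kwargs": {"rows": 10},
        "function_return": ["doubled"],
        "docker": docker_image,  # image this step should run in
    }


def build_pipeline():
    """Requires clearml and a reachable server; not called in this sketch."""
    from clearml.automation import PipelineController

    pipe = PipelineController(name="pipeline test", project="pipelines", version="0.0.1")
    pipe.add_function_step(**function_step_kwargs("preprocess", CUSTOM_IMAGE))
    return pipe
```

Note that this only sets the image for the function step, not for the pipeline controller task itself.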

  
  
Posted 3 years ago

I initialize tasks not as functions, but as scripts from different repositories, with different images

  
  
Posted 3 years ago

cloning base tasks and modifying their parameters
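
For readers unfamiliar with this workflow, cloning a base task and giving the clone its own image might look roughly like the sketch below. It is an assumption about the setup described here, not the poster's actual code. The helper name `clone_with_image` and the clone name are hypothetical; `task_cls` is expected to be `clearml.Task`, passed in so the helper has no import-time dependency on clearml.

```python
from typing import Any


def clone_with_image(task_cls: Any, base_task_id: str, docker_image: str, queue: str) -> Any:
    """Clone a base task, point the clone at a custom image, and enqueue it.

    task_cls is expected to be clearml.Task (an assumption of this sketch).
    """
    base = task_cls.get_task(task_id=base_task_id)
    cloned = task_cls.clone(source_task=base, name="clone with custom image")
    cloned.set_base_docker(docker_image)  # image string, passed positionally
    task_cls.enqueue(cloned, queue_name=queue)
    return cloned
```

With clearml installed and configured, usage would be along the lines of `from clearml import Task` followed by `clone_with_image(Task, "<base task id>", "<your image>", "default")`, where the task id, image and queue name are placeholders.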

  
  
Posted 3 years ago

but the question was about the pipeline controller, not individual tasks.

  
  
Posted 3 years ago

After I set base docker for pipeline controller task, I cannot clone the repo...

  
  
Posted 3 years ago

MelancholyElk85

After I set base docker for pipeline controller task, I cannot clone the repo...

What do you mean by that?
Also, how do you set the PipelineController base_docker_image? (I'm assuming this is needed to run the pipeline logic, is that correct?)

  
  
Posted 3 years ago

BTW: if you need, you can do the following:
```python
from clearml import Task
from clearml.automation import PipelineController

task = Task.init(project_name='pipelines', task_name='pipeline test')
task.set_base_docker(...)  # pass your custom image here

# The pipeline object uses the current Task, hence the docker image is set
pipe = PipelineController(...)

pipe.start()
```

  
  
Posted 3 years ago

AgitatedDove14
```
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
error: Could not fetch origin
Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.
clearml_agent: ERROR: Failed cloning repository.

  1. Make sure you pushed the requested commit:
    (repository='git@...', branch='main', commit_id='...', tag='', docker_cmd='registry.gitlab.com/...:...', entry_point='pipe.py', working_dir='.')
  2. Check if remote-worker has valid credentials [see worker configuration file]
```
  
  
Posted 3 years ago

I found out this happens with any image other than the default one, regardless of how I set it (e.g. with pipe._task.set_base_docker)

The image is not needed to run the pipeline logic, I do it just to reduce overhead. Otherwise it would take too long to just build the default image on every launch

  
  
Posted 3 years ago

So, to summarize:
- PipelineController works with the default image, but it incurs 4-5 min of overhead
- It doesn't work with any other image

I can open an issue on GitHub

  
  
Posted 3 years ago

PipelineController works with default image, but it incurs overhead 4-5 min

You can try to spin the "services" queue without docker support, if there is no need for containers it will accelerate the process.

Repository cloning failed: Command '['git', 'fetch', '--all', '--recurse-submodules']' returned non-zero exit status 1.

This error is about failing to clone the pipeline code repo, how is that connected to changing the container ?!
Can you provide the full log?

  
  
Posted 3 years ago

You can try to spin the "services" queue without docker support, if there is no need for containers it will accelerate the process.

With pipe.start(queue='services') , it still tries to run some docker for some reason
```
1633799714110 kirillfish-ROG-Strix-G512LW-G512LW info ClearML Task: created new task id=a4b0fbc6a1454947a06be4e48eda6740 ClearML results page:
1633799714974 kirillfish-ROG-Strix-G512LW-G512LW info ClearML new version available: upgrade to v1.1.2 is recommended!

1633799726152 kirillfish-ROG-Strix-G512LW-G512LW info 2021-10-09 20:15:26,151 - clearml.Task - INFO - Waiting to finish uploads
1633799727482 kirillfish-ROG-Strix-G512LW-G512LW info 2021-10-09 20:15:27,482 - clearml.Task - INFO - Finished uploading
1633799731889 clearml-services INFO task a4b0fbc6a1454947a06be4e48eda6740 pulled from 10cb3fafea4940e8923adad408c23ab4 by worker clearml-services

1633799731967 clearml-services INFO Running Task a4b0fbc6a1454947a06be4e48eda6740 inside default docker: arguments: []

1633799732452 clearml-services INFO Executing: ['docker', 'run', '-t', '-l', 'clearml-worker-id=clearml-services:service:a4b0fbc6a1454947a06be4e48eda6740', '-l', 'clearml-parent-worker-id=clearml-services', '-e', 'NVIDIA_VISIBLE_DEVICES=none', '-e', 'CLEARML_WORKER_ID=clearml-services:service:a4b0fbc6a1454947a06be4e48eda6740', '-e', 'CLEARML_DOCKER_IMAGE=', '-v', '/tmp/.clearml_agent.pgsygoh2.cfg:/root/clearml.conf', '-v', '/root/.clearml/apt-cache:/var/cache/apt/archives', '-v', '/root/.clearml/pip-cache:/root/.cache/pip', '-v', '/root/.clearml/pip-download-cache:/root/.clearml/pip-download-cache', '-v', '/root/.clearml/cache:/clearml_agent_cache', '-v', '/root/.clearml/vcs-cache:/root/.clearml/vcs-cache', '--rm', '', 'bash', '-c', 'echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; for i in {10..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && break ; done ; [ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL python3-pip" ; [ -z "$CLEARML_APT_INSTALL" ] || (apt-get update && apt-get install -y $CLEARML_APT_INSTALL) ; [ ! -z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3 ; $LOCAL_PYTHON -m pip install -U "pip<20.2" ; $LOCAL_PYTHON -m pip install -U clearml-agent ; cp /root/clearml.conf /root/default_clearml.conf ; NVIDIA_VISIBLE_DEVICES=none $LOCAL_PYTHON -u -m clearml_agent execute --full-monitoring --id a4b0fbc6a1454947a06be4e48eda6740']
```

This error is about failing to clone the pipeline code repo, how is that connected to changing the container ?!
Can you provide the full log?

I reset this task, will reproduce later

  
  
Posted 3 years ago

With pipe.start(queue='services'), it still tries to run some docker for some reason

The services agent is always running with --docker:
https://github.com/allegroai/clearml-agent/blob/e416ab526ba9fe05daa977b34c9e46b50fb214a0/docker/services/entrypoint.sh#L16
Actually I think we should have it as an argument, so it is easier to control from docker-compose

I'll be waiting for the full log to check the "git clone" issue

  
  
Posted 3 years ago