Answered

Hi all,

Looking for some help when executing pipelines with custom Docker images.

I have a component defined and I expect its Python runtime environment to be managed by a custom Docker image (foobar):
@PipelineDecorator.component(docker='foobar', ...)
As a result, I don’t want the Agent to parse what imports are being used / install dependencies whatsoever. The assumption should be that the Python runtime environment is already handled by the Docker image.

How can I achieve this? Thanks!

  
  
Posted 2 years ago

Answers 13


Thank you! I adjusted my pipeline logic so that the component used packages=[]

Funny enough, I’m running into a new issue now. Does this mean I need to configure the Agent’s runtime environment so that it has the dependencies needed to execute the pipeline script?
` # Agent Logs
Starting Task Execution:

Traceback (most recent call last):
  File "/Users/developer/.clearml/venvs-builds/3/code/train_and_evaluate.py", line 1, in <module>
    from clearml import Task, TaskTypes
ModuleNotFoundError: No module named 'clearml'

$ head -10 ~/.clearml/venvs-builds/3/code/train_and_evaluate.py
from clearml import Task, TaskTypes
from clearml.automation.controller import PipelineDecorator

def train_and_evaluate():
    _train_and_evaluate()

if __name__ == '__main__':
    task = Task.init() `

  
  
Posted 2 years ago

If this is the case, there is nothing you need to change; just provide the Docker image (no need to pass packages).

  
  
Posted 2 years ago

Could it be that these packages (numpy etc.) are not installed as system packages in the Docker image (i.e. they live inside a venv, inside the container)?
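
One way to check this is to run a small script with the same interpreter the container uses by default. The sketch below relies only on the standard library; the module names passed in are just examples, and it assumes nothing about how the agent itself selects an interpreter.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv rather than the system install."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def importable(module_names):
    """Map each top-level module name to whether this interpreter can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
    print(importable(["numpy", "clearml"]))
```

If the packages show up only when a venv inside the image is activated, the system interpreter will report them as not importable.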

  
  
Posted 2 years ago

Funny enough I’m running into a new issue now.

Sorry, my bad, I should have known 😉 Yes, it probably should be packages=["clearml==1.1.6"]
BTW: do you have any imports inside the pipeline function itself? If you do not, then there is no need to pass "packages" at all; it will just add clearml.

  
  
Posted 2 years ago

Just threw a new file into the gist above

It doesn’t look like it even gets to the point where it installs from the numpy wheel (because it errors out installing Pillow elsewhere).

What’s interesting to me (as a ClearML newbie) is it’s clearly compiling that wheel using my host machine (MacOS).

I would have expected there to be separation between the “pipeline runner”, if you will, and the task. I would expect the pipeline runner to need only a dependency on ClearML, and the task to be spawned as a container with numpy installed (Linux in this case).

  
  
Posted 2 years ago

Hmm, maybe a different numpy version? (numpy==1.22.1; maybe the Task needs a different version?) Can you post the Task log?
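
To compare what a Task pins against what is actually installed in the environment, something like the following could be run inside the container. This is a minimal sketch that assumes the requirements are simple `name==version` strings; the helper names are made up for illustration and are not part of ClearML.

```python
from importlib import metadata


def parse_pin(requirement: str):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, sep, version = requirement.partition("==")
    return name.strip(), (version.strip() if sep else None)


def installed_version(name: str):
    """Return the installed distribution version, or None when it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_pins(requirements, lookup=installed_version):
    """Report, per requirement, the pinned version, the installed one, and whether they agree."""
    report = {}
    for requirement in requirements:
        name, pinned = parse_pin(requirement)
        found = lookup(name)
        report[name] = {
            "pinned": pinned,
            "installed": found,
            "satisfied": found is not None and (pinned is None or found == pinned),
        }
    return report


if __name__ == "__main__":
    print(compare_pins(["numpy==1.22.1", "clearml"]))
```

A mismatch between "pinned" and "installed" for numpy would be consistent with the agent trying to fetch or build a different wheel.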

  
  
Posted 2 years ago

For anyone following along, the lesson for me was to configure the clearml-agent daemon with the --docker flag, which instructs it to spawn tasks in containers (together with the docker arg passed through to my pipeline component).

  
  
Posted 2 years ago

Hi WickedStarfish97

As a result, I don’t want the Agent to parse what imports are being used / install dependencies whatsoever

Nothing to worry about here: even if the agent detects the Python packages, they are installed on top of the preexisting packages inside the Docker image. That said, if you want to override it, you can also pass packages=[]
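
A rough sketch of what that override might look like, using only the `docker` and `packages` arguments mentioned in this thread. The helper names `component_options` and `as_component` are hypothetical, and clearml is imported lazily so the options can be inspected without it installed.

```python
def component_options(image, packages=()):
    """Build keyword arguments for PipelineDecorator.component.

    An empty `packages` list is meant to skip auto-detected requirements;
    pass explicit pins (e.g. ["clearml==1.1.6"]) to list them manually.
    """
    return {"docker": image, "packages": list(packages)}


def as_component(func, image, packages=()):
    """Wrap `func` as a pipeline component that runs in `image` (requires clearml)."""
    # Imported here so the module loads even where clearml is not installed.
    from clearml.automation.controller import PipelineDecorator

    return PipelineDecorator.component(**component_options(image, packages))(func)


if __name__ == "__main__":
    # 'foobar' is the placeholder image name used in the question.
    print(component_options("foobar"))
    print(component_options("foobar", ["clearml==1.1.6"]))
```

Calling `as_component(train_and_evaluate, "foobar")` should be equivalent to decorating the function with `@PipelineDecorator.component(docker='foobar', packages=[])`.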

  
  
Posted 2 years ago

Right, my only complaint is that it appears to be using cached wheels and building them (for packages like numpy, scipy, etc.) even though numpy is already available in the Python runtime environment.

  
  
Posted 2 years ago

What’s interesting to me (as a ClearML newbie) is it’s clearly compiling that wheel using my host machine (MacOS).

Hmm kind of, and kind of not.
If you take a look at the Tasks created (regardless of how they are created: pipeline, manually, etc.), you have a list of Python packages required by the code, as they are detected at runtime (i.e. when the code was first executed, on the development machine). When creating a Pipeline controller (runner), the pipeline Tasks are just lists, and package versions are listed based on the machine running the initial pipeline (in your case, the Mac). The reason is so that we at least have a version of the packages (if they exist) that will be working for you.
Yes, you are correct, there should not be a connection between the runner machine and the remote machine. That said, we do want to be able to specify the required packages, and usually Python packages are available on most OS distros. If we were not auto-detecting them, you would have had to specify them manually, which you can also do, and it will override the packages it detected. Does that make sense?

Just threw a new file into the gist above

Not sure what I'm seeing there, but it definitely does not include the error.
If it helps, you can DM me the full log (BTW: all passwords/secrets are automatically masked in the log, but I would double check just in case 😉)

  
  
Posted 2 years ago

Even if you had any packages, I'm pretty sure there is nothing for you to worry about. It will just list them, and if they are preinstalled, the preinstalled versions will be used.

  
  
Posted 2 years ago

Thanks, my pipeline script only takes a dependency on clearml as well as an internal library (local Python module installed into the Docker image) that provides the _train_and_evaluate function as seen above

  
  
Posted 2 years ago