
Hey, about the dependency propagation of pipeline components, if I call a vanilla python function from a component does the dependencies specified in the internal imports propagated to this function call too ? And additionally if that function is in another python module does ClearML dependency linker automatically upload that code too ? And is this compatible with the Task.force_store_standalone_script() option ?

  
  
Posted 2 years ago

Answers 8


At least for procedures

  
  
Posted 2 years ago

Okay, looks like the call dependency resolver does not support cross-file calls and relies instead on the local repo cloning feature to handle multiple files, so Task.force_store_standalone_script() does not allow for a pipeline defined across multiple files (now that you think of it, it was kind of implied by the name). But what is interesting is that calling an auxiliary function in the SAME file from a component also raises a NameError: <function_name> is not defined , which is kind of sad.
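For reference, a workaround I'm looking at, assuming the helper_functions argument of PipelineDecorator.component does what the docs suggest (explicitly packaging locally defined auxiliary functions with the step). Untested sketch, and the CSV-loading part is just a placeholder:

` from clearml.automation.controller import PipelineDecorator

def shuffle_df(df):
    # plain helper defined at module level, next to the pipeline code
    return df.sample(frac=1)

@PipelineDecorator.component(helper_functions=[shuffle_df])
def my_component(dataset_id: str):
    import pandas as pd
    from clearml import Dataset

    dataset_path = Dataset.get(dataset_id=dataset_id).get_local_copy()
    df = pd.read_csv(dataset_path)  # hypothetical: assumes the dataset is a single CSV
    return shuffle_df(df) `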

  
  
Posted 2 years ago

I can test it empirically, but I want to be sure what the expected behavior is so my pipeline doesn't get auto-magically broken after a patch

  
  
Posted 2 years ago

Well given a file architecture looking like this:
|_ __init__.py
|_ my_pipeline.py
|_ my_utils.py
With the content of my_pipeline.py being:
` from clearml.automation.controller import PipelineDecorator
from clearml import Task, TaskTypes

from my_utils import do_thing

Task.force_store_standalone_script()

@PipelineDecorator.component(...)
def my_component(dataset_id: str):
    import pandas as pd
    from clearml import Dataset

    dataset = Dataset.get(dataset_id=dataset_id)
    dataset_path = dataset.get_local_copy()

    dataset = do_thing(dataset)
    ... `

And the content of my_utils.py being:
` def do_thing(df: pd.DataFrame) -> pd.DataFrame:
    """Just a simple shuffle, this is an example not a CS course"""
    df = df.sample(frac=1)
    return df `
Should I do an import pandas as pd in my_utils.py, given that the call to do_thing() is done within my component and thus in the scope of the component's pandas import? Will ClearML resolve that function, upload it, and propagate to it the dependencies of the component from which it is called?
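For reference, this is the self-contained variant of my_utils.py I would fall back to, with its own pandas import instead of relying on the component's import scope:

` # my_utils.py -- self-contained version
import pandas as pd

def do_thing(df: pd.DataFrame) -> pd.DataFrame:
    """Just a simple shuffle, this is an example not a CS course"""
    return df.sample(frac=1) `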

  
  
Posted 2 years ago

It would have been great if the ClearML resolver would just inline the code of locally defined vanilla functions and execute that inlined code under the import scope of the component from which it is called
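In the meantime, the only pattern I'm fairly confident gets packaged is nesting the helper inside the component body itself, since it is then part of the component's own source. Sketch only, the CSV loading is just a placeholder:

` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component()
def my_component(dataset_id: str):
    import pandas as pd
    from clearml import Dataset

    def do_thing(df: pd.DataFrame) -> pd.DataFrame:
        # nested helper: part of the component's own source, so it travels
        # with the step and runs under the component's import scope
        return df.sample(frac=1)

    dataset_path = Dataset.get(dataset_id=dataset_id).get_local_copy()
    df = pd.read_csv(dataset_path)  # hypothetical: assumes a single CSV dataset
    return do_thing(df) `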

  
  
Posted 2 years ago

Well, it is also failing within the same file if you read until the end. But as for the cross-file issue, it's mostly because my repo architecture is organized in a v1/v2 scheme and I didn't want to pull a lot of unused files and inject GitHub PATs (which frankly lack granularity) into the worker

  
  
Posted 2 years ago

Hi FierceHamster54, can you please elaborate on the process with a more specific example?

  
  
Posted 2 years ago

Hi FierceHamster54,

I think

"And is this compatible with the Task.force_store_standalone_script() option?"

is causing the issue: you are storing the entire script as a standalone script without any git repository, so once you try to import other parts of the repo, the import fails. BTW, any specific reason for using it in your pipeline?
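If the concern is about pulling the whole repo, as far as I know you can also attach a repository per component instead of storing a standalone script, something along these lines (untested sketch, the repo URL is just a placeholder):

` from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    repo="https://github.com/your-org/your-repo.git",  # hypothetical repo URL
    repo_branch="main",
)
def my_component(dataset_id: str):
    # with a repo attached to the step, cross-file imports like this one are
    # resolved from the cloned repository on the worker
    from my_utils import do_thing
    ... `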

  
  
Posted 2 years ago