Answered

How can I ensure tasks in a pipeline have the same environment as the pipeline itself? It seems a bit counter-intuitive that the pipeline (executed remotely) captures the local environment, but the tasks (executed remotely) do not use that same environment?

  
  
Posted 2 years ago

Answers 42


… And it’s failing on type hints for functions passed in pipe.add_function_step(…, helper_functions=[…]) … I guess those aren’t being removed the way they are for the wrapped function step?

  
  
Posted 2 years ago

It is installed on the machine creating the pipeline.
I have no idea why it did not automatically detect it 😞

  
  
Posted 2 years ago

The only thing I could think of is that the output of pip freeze would be a URL?

  
  
Posted 2 years ago

I’ve tracked it down further; it seems the pigar utility does not apply any smart logic there.
The case we have is the following:

  • We have a monorepo, but all modules/libs share a common namespace foo; so when working on module mod, for example, we use from foo.mod import …
  • The detection then looks for a module called foo, even though it’s just a namespace
  • In the dist-info directory name, it seems any hyphen, dot, etc. is swapped for an underscore, so our site-packages represents this as foo_mod-x.y.z.dist-info
  • This ends up showing that the available package is foo_mod
  • Specifically, since foo is not generated, it is assumed to be local and dropped 🤔
  
  
Posted 2 years ago

We have an internal mono-repo and some of the packages are required - they’re all available correctly for the controller, only some are required for the individual tasks, but the “magic” doesn’t happen 😞
That is, the controller does not identify them as requirements, so they’re not installed in the tasks’ environment.

  
  
Posted 2 years ago

I think this is the main issue, is this reproducible ? How can we test that?

  
  
Posted 2 years ago

How or why is this the issue? I guess something is getting lost in translation :D
On the local machine, we have all the packages needed. The code gets sent for remote execution, and all the local packages are frozen correctly with pip.
The pipeline controller task is then generated and executed remotely, and it has all the relevant packages.
Each component it launches, however, is missing the internal packages available earlier :(

  
  
Posted 2 years ago

Still; anyone? 🥹 @<1523701070390366208:profile|CostlyOstrich36> @<1523701205467926528:profile|AgitatedDove14>

  
  
Posted 2 years ago

We also wanted this, we preferred to create a docker image with all we need, and let the pipeline steps use that docker image

That way you don’t rely on clearml capturing the local env, and you can control what exists in the env
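
A minimal sketch of that approach, assuming the docker and execution_queue arguments of PipelineController.add_function_step. The image name, queue name, project, and step function are all hypothetical placeholders, and the image is assumed to already contain the internal packages.

```python
# Hypothetical values: replace with your own image and queue.
STEP_IMAGE = "registry.example.com/team/pipeline-env:1.0"
STEP_QUEUE = "default"


def step_one(x: int) -> int:
    # When executed remotely, this runs inside the container given by STEP_IMAGE.
    return x + 1


def build_pipeline():
    # Imported lazily so this module can be loaded without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(
        name="example-pipeline",  # placeholder name
        project="examples",       # placeholder project
        version="1.0.0",
    )
    pipe.add_function_step(
        name="step_one",
        function=step_one,
        function_kwargs={"x": 1},
        function_return=["result"],
        docker=STEP_IMAGE,            # every step can point at the same image
        execution_queue=STEP_QUEUE,   # the agent on this queue must run in docker mode
    )
    return pipe
```

Calling build_pipeline() requires a configured ClearML server; the step function itself can be tested locally without one.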

  
  
Posted 2 years ago

So a missing bit of information that I see I forgot to mention is that we named our packages as foo-mod in pyproject.toml. That hyphen then gets rewritten, giving foo_mod-x.y.z.dist-info.

foo-mod @ git+
  
  
Posted 2 years ago

What format should I specify it in?

requirements.txt format, e.g. ["package >= 1.2.3"]

Would this enforce that package on various components?

This is a per-component control, so you can have different packages / containers for each component.

Would it then no longer capture import statements?

This replaces the auto-detected packages, but obviously the auto-detection fails to detect your internal repo package, which is the main issue here.
How is the "internal package" installed? In other words, can you send the pip freeze of the machine creating the pipeline? That is where the packages are detected (if a package is not installed, you cannot infer the actual package name or the version from the import statement alone).
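
One way to build such a per-component list is to pin whatever version is installed on the machine creating the pipeline. This is only a sketch: pinned_requirements is a hypothetical helper, foo-mod is the placeholder name from this thread, and a package installed from a git URL would still need its full "name @ git+…" requirement written out by hand.

```python
from importlib import metadata


def pinned_requirements(names):
    """Build requirements.txt-style entries pinned to the locally installed versions.

    Names that are not installed locally are passed through unpinned,
    since their version cannot be inferred.
    """
    requirements = []
    for name in names:
        try:
            requirements.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            requirements.append(name)
    return requirements


# Possible usage (not executed here), with a hypothetical internal package name:
#   pipe.add_function_step(..., packages=pinned_requirements(["foo-mod"]))
```

Because the list replaces auto-detection for that component, every package the step imports has to be included in it.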

  
  
Posted 2 years ago

I have no idea what the difference is, but it does not log the internal repository 😞 If I knew why, I would be able to solve it myself… hehe

  
  
Posted 2 years ago