I am writing quite a bit of documentation on the topic of pipelines. I am happy to share the article here, once my questions are answered and we can make a pull request for the official documentation out of it.
Amazing, please share once done. I will make sure we merge it into the docs!
Does this mean that within component or add_function_step I cannot use any code from my current directory's code base, only code from external packages that are imported, unless I add my code with helper_functions?
Yes, I'll try to improve the docstring there.
- It is important to realize that each decorated function will end up packaged in a separate script file, and that script file will be running on the remote machine.
- To the above script you can add a repo, so that the script file is running inside the repo.
- But let's assume that in the first script we want more than just the decorated function. Aha! We add the additional functions in the helper_functions argument, and these functions will also be part of the standalone script file with our component. Does that make sense @<1523704157695905792:profile|VivaciousBadger56> ?
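To make the "standalone script" idea concrete, here is a minimal sketch in plain Python. It does not use ClearML and is not how ClearML builds the script internally; the names build_standalone_script, normalize and step_one are hypothetical and exist only for this illustration. It shows why a step that calls a local function fails unless that function's source is shipped in the same script.

```python
import textwrap

# Hypothetical helper that lives in the local code base.
HELPER_SRC = textwrap.dedent('''
    def normalize(values):
        total = sum(values)
        return [v / total for v in values]
''').strip()

# Hypothetical pipeline step that depends on the helper.
STEP_SRC = textwrap.dedent('''
    def step_one(values):
        return normalize(values)
''').strip()


def build_standalone_script(step_src, helper_srcs):
    """Concatenate the helper sources and the step source into one script."""
    return "\n\n".join(list(helper_srcs) + [step_src]) + "\n"


# Without the helper, the standalone script cannot resolve normalize().
bare_namespace = {}
exec(build_standalone_script(STEP_SRC, []), bare_namespace)
try:
    bare_namespace["step_one"]([1, 1, 2])
    missing_helper = False
except NameError:
    missing_helper = True

# With the helper included, the step runs on its own.
script = build_standalone_script(STEP_SRC, [HELPER_SRC])
namespace = {}
exec(script, namespace)
result = namespace["step_one"]([1, 1, 2])
print(missing_helper, result)
```

In this sketch, passing the helper source plays the role that the helper_functions argument plays for a decorated component: the extra functions travel with the step instead of being looked up in a code base that the remote machine does not have.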
If I do not build a package out of my local repository/project, I cannot reference anything
No need to build a package from the repo, just pass it as the repo argument.
So for example:
@PipelineDecorator.component(return_values=['accuracy'], cache=True, task_type=TaskTypes.qc, repo=" ")
def step_four(model, X_data, Y_data):
    print("yey")
What will happen is the agent will pull the repo into a target folder (say ~/code), then it will create a new file called "step_four.py" and add it to the same ~/code folder.
Then it will run something like cd ~/code && PYTHONPATH=~/code python step_four.py
Make sense?
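The sequence described above can be imitated locally with a rough sketch like the one below. It uses only the standard library; a temporary folder stands in for ~/code, and the module name my_project_utils and its greet function are hypothetical stand-ins for code that would come from the cloned repo. It is an approximation of the described behaviour, not the agent's actual implementation.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as code_dir:
    code = Path(code_dir)

    # Stand-in for the cloned repo contents (hypothetical module).
    (code / "my_project_utils.py").write_text(
        "def greet():\n    return 'yey'\n"
    )

    # Stand-in for the generated step script placed in the same folder.
    (code / "step_four.py").write_text(
        "from my_project_utils import greet\nprint(greet())\n"
    )

    # Roughly: cd ~/code && PYTHONPATH=~/code python step_four.py
    env = dict(os.environ, PYTHONPATH=str(code))
    completed = subprocess.run(
        [sys.executable, "step_four.py"],
        cwd=code,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    output = completed.stdout.strip()

print(output)
```

Because the step script runs from inside the repo folder with that folder on PYTHONPATH, it can import the repo's modules directly, which is why no separate package build is needed.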