- "Components anyway need to be available when you define the pipeline controller/decorator, i.e. same codebase"
  No, you can specify a different code base for a component.
- "The component code still needs to be self-composed (or, function component can also be quite complex)"
  Well, it can address the additional repo (it will be automatically added to the PYTHONPATH), and you can add auxiliary functions (as long as they are part of the initial pipeline script) by passing them to helper_functions.
- "Decorators do not allow any dynamic build, because you must know how the components are connected at decoration time"
  Well, this is like any other Python code: you define the functions before you use them, but you do not have to use them (the pipeline logic itself drives this). Like any other Python code, if you do not call a (decorated) function, it will not be executed.
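To make the last point concrete, here is a minimal pure-Python sketch. The `component` decorator, the `executed` list, and the `pipeline` function are hypothetical stand-ins written for this illustration, not the ClearML API. It only shows that decorating a function does not run it, and that the pipeline logic decides at call time which step is used:

```python
import functools

executed = []  # names of the components that were actually called


def component(func):
    """Toy stand-in for a pipeline component decorator (hypothetical)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Record the call, then run the wrapped function.
        executed.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@component
def step_three(x):
    return x + 1


@component
def step_four(x):
    return x * 2


def pipeline(value):
    # The branch is chosen at run time, not at decoration time.
    if value > 1337:
        return step_four(value)
    return step_three(value)


result = pipeline(10)
print(result, executed)
```

Both steps are decorated up front, but only the one the logic actually calls ends up in `executed`.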
"With that said, it could be that the provided examples are overly simplistic."
For sure!
"check results before deciding to continue, ... have adjustable loops and parallelization depending on arguments"
Rephrased as:
X_train, X_test, y_train, y_test, some_value = step_two(data_frame)
if int(some_value) > 1337:
    print("this is something special here, let's train another model")
    model = step_four(X_train*2, y_train*2)
else:
    print('launch step three')
    model = step_three(X_train, y_train)
This code is executed just like a regular Python function, only the return values are deferred. When the code casts some_value to int (done explicitly here so it is easier to see), execution waits for the function (component) to finish running on another machine, fetches the return value, tests it against the condition, and decides what to do.
Does that make sense? Basically, it is Python execution across multiple nodes in a transparent way (at scale).
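If it helps, the deferred return value idea can be sketched in plain Python. This is only an analogy under stated assumptions: the `Deferred` class and the `remote` decorator are hypothetical names made up for this example, and a local thread pool stands in for "another machine". It is not how ClearML implements it internally:

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=2)  # stands in for remote workers


class Deferred:
    """Placeholder for a result that may not be computed yet (hypothetical)."""

    def __init__(self, future):
        self._future = future

    def __int__(self):
        # Casting blocks until the underlying work has finished.
        return int(self._future.result())


def remote(func):
    """Run func in the pool and hand back a Deferred right away."""
    def submit(*args, **kwargs):
        return Deferred(_pool.submit(func, *args, **kwargs))
    return submit


@remote
def step_two(x):
    time.sleep(0.1)  # simulate a slow remote step
    return x * 100


some_value = step_two(20)  # returns immediately with a placeholder
# The int() cast is where the caller actually waits for the result.
branch = "step_four" if int(some_value) > 1337 else "step_three"
print(branch)
_pool.shutdown()
```

The call to `step_two` does not block; the wait only happens at the `int()` cast, which is the same point where the pipeline code above has to know the real value to pick a branch.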