Scenarios 1 & 2 are essentially the same from a caching perspective (the fact that B != B` means they have different caching hashes, but in both cases they are cached).
Scenario 3 is basically removing the cache flag from those components.
Not sure if I'm missing something.
Back to the @<1523701083040387072:profile|UnevenDolphin73>
From decorators - when the pipeline logic is very straightforward ...
Actually I would disagree: decorators should be used when the pipeline logic is not a DAG. The component itself can be extremely complex; the decorated function is just a way to start the "main" of the component, which can rely on a totally different codebase. The main difference is that with both Tasks & functions the pipeline logic is actually a DAG, whereas with decorators the logic is free Python code! This is really a game changer when you think about the capabilities: you can check results before deciding to continue, you can have adjustable loops and parallelization depending on arguments, etc.
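To make the "free Python code" point concrete, here is a rough sketch using plain Python stand-ins. `train` and `pipeline` are hypothetical names and the scoring is a toy; in a real setup these would presumably be wrapped with `PipelineDecorator.component` and `PipelineDecorator.pipeline`, which I left out so the snippet runs on its own.

```python
from typing import List, Tuple


def train(lr: float) -> float:
    """Hypothetical component: returns a toy score that peaks at lr == 0.1."""
    return 1.0 - abs(lr - 0.1)


def pipeline(learning_rates: List[float], target: float) -> List[Tuple[float, float]]:
    """Pipeline logic as ordinary Python: loop, inspect the result, stop early."""
    history = []
    for lr in learning_rates:
        score = train(lr)
        history.append((lr, score))
        # Decide at runtime whether to continue; a static DAG cannot express this.
        if score >= target:
            break
    return history


history = pipeline([0.5, 0.1, 0.3], target=0.9)
```

The number of `train` calls depends on the results themselves, which is exactly what a fixed DAG cannot describe up front.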
Last point, on component caching: what I suggest is actually giving users the ability to control the cache "function". Right now (a bit simplified but probably accurate), the cache key is equivalent to hashing the following dict:
{"code": "code here", "container": "docker image", "container args": "docker args", "hyper-parameters": "key/value"}
We could allow users to add a function that gets this dict and returns a new dict that will be used for hashing. This would let them remove or change fields (e.g. ignore the code, or some of the arguments) and add new custom fields.
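Roughly what I have in mind, as a minimal sketch and not the current implementation: `cache_hash`, the `transform` hook and `ignore_code` are all hypothetical names, and the JSON + SHA-256 hashing is just an assumption for illustration.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional

CacheDict = Dict[str, Any]


def cache_hash(fields: CacheDict,
               transform: Optional[Callable[[CacheDict], CacheDict]] = None) -> str:
    """Hash the cache dict, optionally passing it through a user hook first."""
    if transform is not None:
        # Hand the hook a copy so it cannot mutate the caller's dict.
        fields = transform(dict(fields))
    # sort_keys makes the hash independent of key insertion order.
    payload = json.dumps(fields, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def ignore_code(fields: CacheDict) -> CacheDict:
    """Example user hook: drop the code so code edits do not invalidate the cache."""
    fields = dict(fields)
    fields.pop("code", None)
    return fields


step_v1 = {"code": "print(1)", "container": "python:3.10",
           "container args": "", "hyper-parameters": {"lr": 0.1}}
step_v2 = dict(step_v1, code="print(2)")  # same step, only the code changed

default_differs = cache_hash(step_v1) != cache_hash(step_v2)
custom_matches = cache_hash(step_v1, ignore_code) == cache_hash(step_v2, ignore_code)
```

With the default hashing the two versions get different cache keys; with the `ignore_code` hook they share one, so the second run would be served from the cache.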
wdyt?