Last point on component caching: what I suggest is giving users the ability to control the cache "function". Right now (a bit simplified, but probably accurate), the cache key is equivalent to a hash of the following dict:
{"code": "code here", "container": "docker image", "container args": "docker args", "hyper-parameters": "key/value"}
We could allow users to add a function that gets this dict and returns a new dict, which is then used for hashing. This would let users remove or change fields (for example, ignoring the code or some of the arguments) and add new custom fields.
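To make the idea concrete, here is a minimal sketch, not the actual implementation. The names `cache_key`, `transform`, and `ignore_code` are hypothetical, and hashing sorted JSON with SHA-256 is an assumption about how the dict could be hashed:

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional

CacheDict = Dict[str, Any]


def cache_key(
    component: CacheDict,
    transform: Optional[Callable[[CacheDict], CacheDict]] = None,
) -> str:
    """Hash the component dict, optionally after a user-supplied transform."""
    # Pass a shallow copy so the user function cannot alter the original's keys.
    payload = transform(dict(component)) if transform else component
    # sort_keys makes the hash independent of key insertion order.
    encoded = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def ignore_code(d: CacheDict) -> CacheDict:
    """Example user function: drop the code field and add a custom field."""
    d = dict(d)
    d.pop("code", None)
    d["dataset version"] = "v3"  # hypothetical custom field
    return d


component = {
    "code": "code here",
    "container": "docker image",
    "container args": "docker args",
    "hyper-parameters": {"lr": 0.1},
}
edited = dict(component, code="code here, edited")

# Without the user function a code change alters the key; with it, it does not.
default_keys_differ = cache_key(component) != cache_key(edited)
custom_keys_match = cache_key(component, ignore_code) == cache_key(edited, ignore_code)
```

With a user function like `ignore_code`, editing the code no longer invalidates the cache, while changing the container or the hyper-parameters still does.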
@<1523701205467926528:profile|AgitatedDove14> : Is the idea here the following? You want to use inversion of control, such that I provide a function f to a component, and f takes the above dict as input. I can then do whatever I like inside f and return a different dict as output. If the output dict of f changes, the component is rerun; otherwise, the old output of the component is used?
I would like to add one thing, though maybe this is what you meant all along:
It would be great if you could search, among previously executed tasks, for a task whose f-output is the same as my component's f-output, and then use that old task's result; in that case, no new task is created from the component definition. Only if no such task can be found is the component rerun as a new task. In other words, f is like a query for a task.
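A rough sketch of that lookup, assuming an in-memory dict as a stand-in for the store of previously executed tasks. `TaskStore`, `run_or_reuse`, and `key_of` are hypothetical names, not an existing API:

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple

CacheDict = Dict[str, Any]


def key_of(d: CacheDict) -> str:
    """Stable hash of a dict (sorted JSON + SHA-256, an assumed scheme)."""
    return hashlib.sha256(
        json.dumps(d, sort_keys=True, default=str).encode("utf-8")
    ).hexdigest()


class TaskStore:
    """Hypothetical stand-in for the set of previously executed tasks."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def run_or_reuse(
        self,
        component: CacheDict,
        f: Callable[[CacheDict], CacheDict],
        run: Callable[[CacheDict], Any],
    ) -> Tuple[Any, bool]:
        """Return (result, reused). f's output acts as the query for an old task."""
        key = key_of(f(dict(component)))
        if key in self._results:
            # A previous task produced the same f-output: reuse its result.
            return self._results[key], True
        # No match: run the component as a new task and remember its result.
        result = run(component)
        self._results[key] = result
        return result, False


def only_hyper_parameters(d: CacheDict) -> CacheDict:
    """Example f: two components match whenever their hyper-parameters match."""
    return {"hyper-parameters": d["hyper-parameters"]}
```

Here the first call with a given f-output runs the component, and later calls with the same f-output return the stored result without creating a new task.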
This would be an awesome and pretty streamlined feature. I like it not only because of its flexibility, but also because you could get rid of the other caching rules. I really like it when an idea is general enough that it makes other concepts unnecessary.