1. Backstory
There have been many attempts at deferred evaluation so far and, as far as I have seen, the following 3 cases are among the top 5 that such attempts aim to address:
- lazy imports
- evaluation graphs
- lazy defaults
Although lazy imports (1) can be addressed via the deferred evaluation concept, it is most likely not the best path, as they are subject to import machinery and there are likely nuances which could not be addressed with deferred evaluation.
Although evaluation graphs (2) could be done via deferred evaluation, they have many possible features that would be difficult to incorporate into a deferred evaluation approach. Such features include (but are not limited to) manipulating the graph before evaluation for optimization purposes (e.g. dask) and parallel execution. Thus, again, this is in my opinion a separate case, which would most likely require its own solution if it was to be addressed in the standard library.
Lazy defaults (3) would be perfectly handled via deferred evaluation. However:
- The last (and pretty much the only proper) attempt has nailed down pretty much everything that was manageable within reasonable effort and is now stuck on key issues, which are not straightforward to address. See: Backquotes for deferred expression
- It is a major overkill to implement deferred evaluation for this sole purpose.
2. Proposal: builtins.lazy for lazy arguments
So regardless of my opinions above about (1) and (2), this proposal is to address lazy defaults (3).
The aim of this is to have a more convenient way to do:

    import math

    FAIL = object()  # unique sentinel marking a missing key
    result = {}.get('a', FAIL)  # dict.get takes no keyword arguments
    if result is FAIL:
        result = math.factorial(100_000)

This, as far as I have seen, is currently the most natural and robust approach, and it can be used in any place.
I have used this for a fair while and will continue using it as it does deal with the problem well with minimal complexity.
The suggestion is to implement builtins.lazy as:

    class lazy:
        def __init__(self, func, *args, **kwds):
            self.func = func
            self.args = args
            self.kwds = kwds

        def __call__(self):
            return self.func(*self.args, **self.kwds)
So it can be recognised in various places across the standard library and open source packages as:

    # Sketch of the proposed behaviour, not the actual dict implementation.
    class dict:
        def get(self, key, default=None):
            if key in self:
                return self[key]
            elif isinstance(default, lazy):
                default = default()  # evaluated only on a miss
            return default

    result = {'key': 1}.get('key', default=lazy(math.factorial, 500_000))
    print(result)  # 1
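The behaviour above can be tried out today with a small runnable approximation. This is only a sketch: LazyDict is a hypothetical subclass standing in for the built-in dict, and expensive/calls are illustration-only names used to show that the default is not computed on a hit.

```python
import math


class lazy:
    """Marker wrapping a deferred call (same shape as the proposed builtins.lazy)."""

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __call__(self):
        return self.func(*self.args, **self.kwds)


class LazyDict(dict):
    """Hypothetical stand-in for a dict whose get() recognises lazy defaults."""

    def get(self, key, default=None):
        if key in self:
            return self[key]
        if isinstance(default, lazy):
            # Only evaluated when the key is missing.
            return default()
        return default


calls = []


def expensive(n):
    calls.append(n)  # record that the computation actually ran
    return math.factorial(n)


d = LazyDict(key=1)
hit = d.get('key', default=lazy(expensive, 20))    # key present: default not evaluated
miss = d.get('other', default=lazy(expensive, 5))  # key missing: default evaluated
plain = d.get('other', default=expensive)          # unwrapped callables pass through
```

Only the miss triggers the computation, and a bare callable is returned untouched because it is not wrapped in lazy.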
Also, it need not necessarily be in builtins; it could as well be in functools. But if this was to be implemented for, say, dict.get, then builtins seems a more natural place for it.
3. Alternatives
3.1. Just use lambda
Although this works in many cases, it is not a robust approach for libraries that implement generic tools. E.g. it would not be suitable for the default of dict.get. The reason is that it prohibits the default from being a lambda, which:
a) breaks backwards compatibility if it was to be implemented in existing methods
b) is just not a good idea, because why should a lambda be an incorrect default value? E.g.:
    class dict:
        def get(self, key, default=None):
            if key in self:
                return self[key]
            elif callable(default) and default.__name__ == '<lambda>':
                default = default()
            return default

    callback_dict = {'a': lambda: 1, 'b': lambda: 2}
    callback = callback_dict.get('c', default=lambda: 3)  # 'c' is a missing key
    print(callback)  # 3, while I would like to get back (lambda: 3) as it is.
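To make the difference concrete, here is a minimal side-by-side sketch. The two free functions get_calling_lambdas and get_with_marker are hypothetical helpers written only for this comparison; they are not proposed API.

```python
class lazy:
    """Marker wrapping a deferred call (same shape as the proposed builtins.lazy)."""

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __call__(self):
        return self.func(*self.args, **self.kwds)


def get_calling_lambdas(mapping, key, default=None):
    """Alternative 3.1: treat any lambda default as a deferred value."""
    if key in mapping:
        return mapping[key]
    if callable(default) and getattr(default, '__name__', None) == '<lambda>':
        return default()
    return default


def get_with_marker(mapping, key, default=None):
    """Proposed style: only defaults explicitly wrapped in lazy are evaluated."""
    if key in mapping:
        return mapping[key]
    if isinstance(default, lazy):
        return default()
    return default


callback_dict = {'a': lambda: 1, 'b': lambda: 2}
fallback = lambda: 3

called = get_calling_lambdas(callback_dict, 'c', fallback)  # fallback gets called
kept = get_with_marker(callback_dict, 'c', fallback)        # fallback returned as-is
# A callable default can still be produced lazily when that is what is wanted.
deferred = get_with_marker(callback_dict, 'c', lazy(lambda: fallback))
```

With the explicit marker there is no ambiguity: a lambda is an ordinary value unless the caller opts in by wrapping it.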
3.2. Let users define one for themselves. Why implement it in the standard library?
A few reasons:
- Consistency - easy to learn and remember
- If it is not implemented into standard library, standard library objects will not have this (which is an important part of this)
- If it is not implemented in standard library and its objects, it is unlikely to become a standard practice.
- For some cases a pure Python class might be a bit too slow (IPython timings):

    d = {}
    %timeit d.get('k')                          # 32 ns
    %timeit lazy(math.factorial, 100_000)       # 321 ns
    %timeit partial(math.factorial, 100_000)    # 181 ns

So the object construction should ideally be as efficient as possible, and being 10x or even 5x slower than dict.get might not be attractive for cases of frequent/iterative usage with a high hit ratio.
The fastest one that is available is partial, which is the one that I am currently using as:

    from functools import partial

    class lazy(partial):
        pass

However, a dedicated implementation would be significantly faster, as partial does many things that this does not require. I suspect it can be made to be not that much slower than:

    %timeit object()    # 70 ns

I would say <= 100 ns is a fairly likely outcome. The __call__ would be faster than that of partial as well.
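For anyone who wants to reproduce the comparison outside IPython, a rough sketch with the standard timeit module could look like this. The use of __slots__ and the helper per_call_ns are my own assumptions for the sketch, not part of the proposal, and the numbers will differ per machine and Python version.

```python
import math
import timeit
from functools import partial


class lazy:
    """Pure-Python marker; __slots__ avoids a per-instance __dict__."""

    __slots__ = ('func', 'args', 'kwds')

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __call__(self):
        return self.func(*self.args, **self.kwds)


def per_call_ns(stmt, number=50_000):
    """Best-of-5 time per execution of stmt, in nanoseconds."""
    env = {'lazy': lazy, 'partial': partial, 'math': math, 'd': {}}
    best = min(timeit.repeat(stmt, globals=env, number=number, repeat=5))
    return best / number * 1e9


# Only object construction is measured; factorial is never called here.
timings = {
    stmt: per_call_ns(stmt)
    for stmt in (
        "d.get('k')",
        "lazy(math.factorial, 100_000)",
        "partial(math.factorial, 100_000)",
        "object()",
    )
}

for stmt, ns in timings.items():
    print(f'{stmt:<36} {ns:8.1f} ns')
```

This only measures construction cost, which is the part that matters when the key is usually found and the default is rarely evaluated.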
3.3. Some generic DSL
While it can be made generic via some DSL (e.g. along the lines of some ideas in DSL Operator – A different approach to DSLs):

    result = lazy(t'{d.get}({key},default={lambda: math.factorial(100_000)})')

such an approach is more suitable for domain-specific DSLs as opposed to widely used features.
It is also unlikely that good performance could be achieved via this approach.
3.4. Utility function for 1 lazy default argument
This is a possibility. Instead of a “builtin flag object”, some functools.lazyargs could be made. E.g.:
    class lazydefault:
        def __init__(self, arg):
            self.arg = arg  # positional index or keyword name of the default

        def __call__(self, func, *args, **kwds):
            FAIL = object()
            # Swap the lazy default for a sentinel before calling func.
            if isinstance(self.arg, int):
                args = list(args)
                dflt = args[self.arg]
                args[self.arg] = FAIL
            else:
                dflt = kwds[self.arg]
                kwds[self.arg] = FAIL
            result = func(*args, **kwds)
            if result is FAIL:
                result = dflt()  # evaluate the default only when needed
            return result

    lazydefault(1)({}.get, 'a', lambda: 1)
The benefit of this is that it could be used in any place without changes to methods. However, it isn’t as convenient as the proposed approach and the implementation is fairly hefty in comparison.
Also, this does not offer intrinsic capability to bind arguments.
Thus, even if this existed, builtins.lazy would still be complementary. E.g.:

    lazydefault(1)({}.get, 'a', lazy(math.factorial, 100_000))

Although performance of such an implementation is unlikely to be very good, a similar utility can be useful, as it could be used for methods with defaults where builtins.lazy has not yet been implemented.
Finally, this approach only covers one case - “one argument of default return value”, while builtins.lazy is a more general concept which, although it needs to be implemented for each specific case, is suitable for an arbitrary number of lazy arguments and is not bound to a specific use case.
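As an illustration of the "arbitrary number of lazy arguments" point, here is a minimal sketch. The names resolve, pick and tracked are hypothetical and exist only for this example; the idea is that a function opts in by resolving just the arguments it ends up using.

```python
class lazy:
    """Marker wrapping a deferred call (same shape as the proposed builtins.lazy)."""

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __call__(self):
        return self.func(*self.args, **self.kwds)


def resolve(value):
    """Evaluate value if it is a lazy marker, otherwise return it unchanged."""
    return value() if isinstance(value, lazy) else value


def pick(flag, if_true, if_false):
    """Hypothetical function accepting two possibly-lazy arguments."""
    # Only the selected argument is resolved; the other is never evaluated.
    return resolve(if_true) if flag else resolve(if_false)


evaluated = []


def tracked(name, value):
    evaluated.append(name)  # record which argument was actually computed
    return value


result = pick(True, lazy(tracked, 'yes', 1), lazy(tracked, 'no', 2))
eager = pick(False, 10, 20)  # plain values keep working unchanged
```

Here only the 'yes' branch is computed, and callers that pass plain values are unaffected.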
4. To sum up
This proposal offers a simple method to generalise the case of lazy arguments.
It also hints at the possibility of incorporating this into existing methods of standard library objects, such as Mapping.get.
Would be interested to hear what others think about this.
Related:
- Dynamic evaluation of function argument list initializer
- This addresses cases that are different to the ones that PEP 671 – Syntax for late-bound function argument defaults aims to address