Builtins.lazy for lazy arguments

So what you’re proposing is that identical code in different versions of Python would behave differently with regards to this lazy type. That is nightmarish for anyone who is trying to write cross-version-compatible code, and frankly, even if I were in favour of the proposal, this would scare me off. Having the semantics of valid code change while still being valid is extremely hard to handle. Imagine trying to write code that works on a version where your lazy object exists but isn’t treated specially by dict.get, and also a version where it IS treated specially.

But that is inevitable with any extension. Isn’t it?

It doesn’t matter if dict.get implements LazyType special treatment straight away or not. There will be a version where dict.get calls LazyType to get the value and there will be one that doesn’t.

In any case, it is not breaking any existing code. If one doesn’t know about LazyType, nothing changes for that user - everything works as it always did.

But if one is using a package or standard library object which has adopted it, checks its docstring and sees:

def dict_get(..., default: object | LazyType | None = None):
    ...

then they can use it.


I haven’t given enough thought to the actual process - whether, say, dict.get / collections.UserDict.get, etc. should make use of it straight away or not.

Just pointing out that there is flexibility and the simplest path would be just to implement types.LazyType and wait to see what happens.

In other words, LazyType creation and the various protocol changes do not necessarily need to be done at the same time, and some buffer time can be left to reevaluate which parts (if any) of the standard library should adopt it.

But with your proposal, why should a lazy object be an incorrect default value?

For this there is a workaround. Luckily, one can (if needed) get a LazyType object as:

default = lazy(lambda: lazy(factorial, 10))
dict_get(d, 'key_name', default)

So the situation is less severe than say “None and the inevitable need for more sentinels”.

Not perfect, but given the simplicity of this approach compared to other solutions, I think it is an acceptable drawback (especially given the simple workaround).

Furthermore, LazyType is nothing but a container for (func, args, kwds). I don’t think there would be much need for it in practice outside the intended usage scope, which is:

# a) create lazy object
lobj = lazy(callable, args, kwds)
# b) use it
dict.get(..., default=lobj)
# c) discard it / forget about it

Users can get into the complexity of nested lazy objects at their own risk, of course.
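For concreteness, here is a rough pure-Python sketch of what such a type could look like. The name, the API and the lazy-aware consumer are all hypothetical here; a real implementation would presumably live in C:

```python
import math

# Rough sketch only: the proposed type is just an immutable container
# for (func, args, kwds) that evaluates when called.
class lazy:
    __slots__ = ('func', 'args', 'kwds')

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __call__(self):
        # Evaluate the deferred computation.
        return self.func(*self.args, **self.kwds)

# a) create lazy object - nothing is computed yet
lobj = lazy(math.factorial, 10)
# b) a lazy-aware consumer evaluates it only when actually needed
value = lobj() if type(lobj) is lazy else lobj
assert value == 3628800
# c) discard it / forget about it
```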

But with the status quo, I can just as well pass a nested lambda to dict_get:

default = lambda: lambda: factorial(10)
dict_get(d, 'key_name', default) # which calls default if default is callable

Indeed. But there are drawbacks to this.

  1. This would not be backwards compatible.
  2. Unnecessary inconvenience when passing callables as arguments, which is a very common occurrence. A lot of code would need to adapt to facilitate the lambda lazy-evaluation paradigm.
  3. Performance. E.g. detecting a lambda is expensive: callable(x) and x.__name__ == '<lambda>'. It also needs a try-except, as not all callables have __name__. Creating a lambda is also more expensive than a plain container, while a bare callable check on its own is too general - unsuited for the purpose.
  4. Binding values at evaluation time is not a good property for this application. Compare lambda: a + 2 with lazy(operator.add, a, 2): the latter is self-contained, while the former cannot be used in places where a changes value, e.g. in a loop. This is quite a major inconvenience which subtracts a lot of value from that approach.
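Point 4 can be demonstrated with a toy stand-in for the proposed type (the lazy class below is purely illustrative, not the proposed implementation):

```python
import operator

class lazy:
    # Toy stand-in: captures func and argument *values* at creation time.
    def __init__(self, func, *args):
        self.func, self.args = func, args
    def __call__(self):
        return self.func(*self.args)

late, early = [], []
for a in range(3):
    late.append(lambda: a + 2)              # binds the name a; evaluated later
    early.append(lazy(operator.add, a, 2))  # binds the current value of a

# After the loop, every lambda sees the final a == 2 ...
assert [f() for f in late] == [4, 4, 4]
# ... while each lazy object kept the value it was created with.
assert [g() for g in early] == [2, 3, 4]
```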

No, it’s a consequence of the way that your proposal creates valid semantics and then changes them.

Several versions of this proposal have floated in this thread, some of which are more able to be feature-probed and some are less. If the LazyType simply does not exist, you can safely assume that it isn’t supported, and respond accordingly. With Paul’s variant of a lazy_default additional argument, any function that doesn’t support it will simply error out when given this argument. Both of these can be probed safely. But how do you test for the situation where the type exists, yet a specific function may or may not recognize it?

You’re going through a lot of fiddliness to try to support two different notions in the same argument. I still don’t see why a simple try/except isn’t good enough, but if you absolutely have to be using dict.get, it is much less disruptive and much easier to probe if it’s done by adding a separate kwarg that takes a callable. There’s no question of changing semantics (since the prior semantics amount to raise TypeError), there’s no need to check what type of thing you’re working with (if you received a default argument, return it, and if you received a default_factory argument, call it and return the result), and it’s isolated to that function, instead of trying to create a single solution for all of Python (meaning that it can be rolled out progressively as needed, without additional disruption).
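To illustrate the probing point: dict.get currently accepts no keyword arguments at all, so code that wants to use a hypothetical default_factory kwarg can feature-probe it with a plain try/except and fall back gracefully:

```python
import math

def get_with_factory(d, key, factory):
    # Hypothetical probe: if this interpreter's dict.get grew a
    # default_factory kwarg, use it; otherwise emulate it ourselves.
    try:
        return d.get(key, default_factory=factory)
    except TypeError:  # kwarg not supported on this version
        return d[key] if key in d else factory()

assert get_with_factory({}, 'n', lambda: math.factorial(5)) == 120
assert get_with_factory({'n': 1}, 'n', lambda: math.factorial(5)) == 1
```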


Good points. I can see the value in the proposal now.

While a new separate keyword argument such as lazy_default, as suggested by others, also works to some extent, it doesn’t work well with positional arguments or with template strings such as a log record template:

# nowhere to add a separate lazy argument to
assert validate_sun_rises_from_east(), expensive_verbose_info()
# wasted when logging level > DEBUG
logger.debug("Verbose info: %s", expensive_verbose_info())
As for probing whether a specific function recognizes the type:

handles_lazy = {}.get('', lazy(lambda: True)) is True
And as for the status quo pattern:

if key in dict:
    value = dict[key]
else:
    value = <lazy stuff>

  1. Convenience. I think people want this not because there are no ways to achieve it, but because there is no convenient paradigm whose brevity matches the very simple problem at hand.
  2. Performance. The above needs to do 2 dict operations.

In short, the arguments for this are the same as the reasons why dict.get exists.

I have been using this approach for a long time, but it never sat well with me. It felt like too many unnecessary arguments. Eventually I settled on what I am proposing now.

Also, I have not seen a lot of this pattern. Given it is so basic, I am sure everyone who came across this need thought about it first. But I think there is a reason why this trend of wanting something else persists.

I think the main reason why I don’t like it is that it is simply too cumbersome. It makes a very simple and concise signature of a low level method look like some sort of mid-level interface code:

def dict_get(self, key, default=None, default_factory=None):
    assert default is None or default_factory is None
    if key in self:
        return self[key]
    if default_factory is not None:
        return default_factory()
    else:
        return default

versus:

def dict_get(self, key, default=None):
    if key in self:
        return key
    if type(default) is lazy:
        default = default()
    return default

The former, to me personally, is just not satisfactory. I like my code at the level of use cases that I have to look cleaner than that.

Well, that was a bit of a lie.

What I was using and wasn’t satisfied with was:

def dict_get(self, key, default=None, default_is_lazy=False):
    if key in self:
        return self[key]
    if default_is_lazy:
        default = default()
    return default

A bit better, but still did not sit well with me. Feels like unnecessary argument compared to what I am using now.

Well yes, that is the point really. Just to have this, one doesn’t need a CPython addition and can simply create their own lazy type.

But I think it would be very convenient not to have to think about extra imports depending on which library I am using. Furthermore, I want an object which is well optimized, as this sort of functionality is often used iteratively and many such objects get created throughout the runtime. With low-level methods such as dict.get, this would become a significant runtime cost if not implemented efficiently.

So say I use a library:

from some_library import SomeContainer, LazyObject
from other_library import OtherContainer, AnotherLazyObject

obj1 = SomeContainer()
obj2 = OtherContainer()
obj1.some_method(value_maybe_lazy=LazyObject(lambda: 1))
obj2.some_method(value_maybe_lazy=AnotherLazyObject(lambda: 1))

So now I have excessive imports, and if I care about performance I also need to check how efficiently each library’s lazy object is implemented.

As opposed to:

from types import LazyType
from some_library import SomeContainer
from other_library import OtherContainer

obj1 = SomeContainer()
obj2 = OtherContainer()
obj1.some_method(value_maybe_lazy=LazyType(lambda: 1))
obj2.some_method(value_maybe_lazy=LazyType(lambda: 1))

Now I am confident that I am using a quality object, and there is no unnecessary repetition.

In short, this problem is defined well enough to warrant one good object type in the standard library to serve these cases.

And of course having it in the standard library would allow it to be adopted by standard library methods…

Not sure which part of that looks like a try/except in your book. To avoid potential race conditions, you should instead write it as:

try: value = dict[key]
except KeyError: value = <lazy stuff>

This is potentially also faster, on account of doing only one dictionary operation, but the important thing is that there’s no TOCTOU issues.

This is the baseline against which your proposal needs to be considered. Is what you’re writing better than this? As you say, it’s not that “there are no ways to achieve this” [1]; what matters is the level of convenience, which should match how frequently this is needed.

So, how often IS this used? How often do you really need something that can’t be done with dict.get, and for which the try/except is too clunky?


  1. that’s true of nearly any proposal though - Python IS Turing-complete ↩︎


This does not work for defaultdict, ChainMap and similar.
In short, there are exceptions, which makes it unsuitable to be “… one-- and preferably only one --obvious way to do it.”

Give an example. I’m tired of trying to figure out generalities without actual examples. What is the actual code that you’re trying to fix here?

I am not trying to fix any particular code. I am proposing a standard utility to fit a group of problems: one which covers the full scope well, without inconveniences or hard edge cases. Something that could potentially “be one-- and preferably only one --obvious way to do it.”

a = collections.defaultdict(int)
if key in a:
    value = a[key]
else:
    value = math.factorial(1000)

Is this supposed to be a defaultdict of factorials?

a = collections.defaultdict(math.factorial)

This is still not a good example. You haven’t explained why this is a common thing that you need. This seems completely arbitrary.

Oops my bad, you need to use the slightly longer-hand form since defaultdict omits the argument. Still, you’re not using defaultdict’s features here at all, making it a poor example. If I wanted a defaultdict of factorials, this is how I’d write it:

class Factorials(dict):
    __missing__ = math.factorial
a = Factorials()
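For what it’s worth, this works because math.factorial is a C-level function, which is not bound as a method when looked up on the class, so __missing__ receives just the key. Note that, unlike defaultdict, it does not cache the result:

```python
import math

class Factorials(dict):
    # A C function is not converted into a bound method,
    # so __missing__ is called with just the missing key.
    __missing__ = math.factorial

a = Factorials()
assert a[5] == 120   # computed on demand via __missing__
assert 5 not in a    # ...but not stored back into the dict
```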

This is not an example of the issue that I am trying to solve.
It is an example that serves to show that try-except does not work in all cases, while “something that works well in all cases” is a focal point of this proposal.
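Spelled out: with a defaultdict, the except clause never fires, because a missing key is filled in by the factory instead of raising KeyError:

```python
from collections import defaultdict

d = defaultdict(int)
try:
    value = d['missing']
except KeyError:
    value = 'expensive fallback'  # never reached for a defaultdict

assert value == 0        # the factory ran; the except clause did not
assert 'missing' in d    # the failed lookup even inserted the key
```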