Revisit Mutable Default Arguments

apalala · October 30, 2023, 9:09pm

Chris Angelico:

def make_list(basis=[]):
    basis.append(42)
    basis.append(spam)
    print(basis)
Is that enough reason to guarantee that it evaluates only once? I think it would be EXTREMELY surprising if this printed out an empty list.

It should evaluate every time make_list() is called.

I think that’s what we’re after, even if it’s obvious that there hasn’t been a good-enough solution so far (less so when considering dataclass).

Rosuav · October 30, 2023, 9:23pm

And we’re back to the magic. Somehow it needs to know that the function’s being called, which is a feature independent of deferred evaluation, and which makes the actual deferred evaluation completely irrelevant. Once you have it being re-evaluated for every call, any form of deferral just delays the evaluation from “the function’s being called, construct a new list” to “first use within the function triggers evaluation”.

Additionally, for anything with side effects, this would cause additional unnecessary confusion, either by being called after some other code, or perhaps not even being called at all.

bryevdv · October 30, 2023, 9:37pm

And we’re back to the magic.

distilling that notion:

foo = []

def bar(arg=foo): pass

What does it mean to “evaluate foo every time bar is called”, exactly? foo is already evaluated.

apalala · October 30, 2023, 9:52pm

I agree that some kind of “magic” will be required.

I also agree that defer any_expression is a rabbit hole.

All we need is deferred literals for list, dict, and set, which are the only constructors supported by the current syntax,… along with str…

This becomes a problem with deferred values:

def f(x=f'say {hello}!'):

Also this:

def f(x=[a, b, c]):

Could we get away with the semantics of copy.copy() for the semantics of mutable default arguments? The copy() part could be skipped for basic builtin types.

Yet copy() doesn’t solve this case:

def f(x=SomeObject(a, b, c)):

That’s why I think we shouldn’t go into the rabbit hole, and just provide new syntax (“magic”) that handles the [] and {} cases, which are the only constructors that produce mutable values.
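For illustration, the copy.copy() semantics floated above can be approximated today with a decorator. This is only a sketch under stated assumptions: copy_defaults is a hypothetical name, not an existing library function, and it makes a shallow copy, which is not the same as re-running a constructor such as SomeObject(a, b, c).

```python
import copy
import functools
import inspect


def copy_defaults(func):
    """Hypothetical decorator: hand each call a shallow copy of any omitted default."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            # Only fill in parameters the caller did not supply.
            if name not in bound.arguments and param.default is not inspect.Parameter.empty:
                bound.arguments[name] = copy.copy(param.default)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@copy_defaults
def make_list(basis=[]):
    basis.append(42)
    return basis


first = make_list()
second = make_list()
```

Here first and second are separate lists, and the list stored on the function is never mutated. Nested mutable state inside a default would still be shared, because the copy is shallow.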

jamestwebber · October 30, 2023, 10:11pm

Or just… Don’t? New syntax is expensive. There has to be sufficient justification for it.

In my mind there are two reasons to try to change this: to remove boilerplate code and to make things less confusing for new users.

The first case can be solved without any new syntax. Magic definitely isn’t going to solve the latter case.

apalala · October 30, 2023, 10:44pm

How???

Magic definitely isn’t going to solve the latter case.

I must concede to that, because the semantics are already complex, and the fix would likely make them more so.

Still, we should come up with ideas to solve the ugliness in:

def f(x: list[Any] | None = None):

and:

@dataclass
class C:
    x: list[Any] | None = None
    y: list[Any] = field(default_factory=list)

It should be something like:

@dataclass
class C:
    x: list[Any] = ...[]
    y: list[Any] = ...[]

I don’t think that dataclass is an issue, because it already makes provisions for field(), so it could do them for anything else.
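As a reference point for the dataclass remarks above, here is a small sketch of what the standard library does today. The class names C and Bad are only illustrative. A bare list default is rejected at class creation, while field(default_factory=list) gives every instance its own list.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class C:
    # default_factory is called once per instance, so lists are never shared.
    y: list[Any] = field(default_factory=list)


a = C()
b = C()
a.y.append(1)

rejected = False
try:
    @dataclass
    class Bad:
        x: list[Any] = []  # dataclass refuses a mutable list default
except ValueError:
    rejected = True
```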

And now, this!

In [1]: x = []

In [2]: y = eval('x')

In [3]: y is x
Out[3]: True

jamestwebber · October 30, 2023, 11:25pm

With the decorator that I’ve already posted in this thread? That covers the mutable default case and allows for arbitrary default objects, not just a couple literals.

It doesn’t cover the dataclass case but that case already has other tooling. It doesn’t make type annotations nicer but I don’t think any of the proposed syntax changes do either.
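The decorator mentioned here is not reproduced in this part of the thread. Purely as an illustration of the general approach, and not as that decorator, one way to get per-call defaults without new syntax is sketched below. Late and late_defaults are hypothetical names invented for this example.

```python
import functools
import inspect


class Late:
    """Hypothetical marker: wraps a zero-argument factory to call on each invocation."""

    def __init__(self, factory):
        self.factory = factory


def late_defaults(func):
    """Hypothetical decorator: replace omitted Late defaults with factory() per call."""
    sig = inspect.signature(func)
    late = {
        name: param.default
        for name, param in sig.parameters.items()
        if isinstance(param.default, Late)
    }

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, marker in late.items():
            if name not in bound.arguments:
                bound.arguments[name] = marker.factory()
        return func(*bound.args, **bound.kwargs)

    return wrapper


@late_defaults
def make_list(basis=Late(list)):
    basis.append(42)
    return basis


first = make_list()
second = make_list()
```

Because the factory can be any callable, this covers arbitrary default objects rather than only [] and {}. As noted above, it does nothing to make the type annotation nicer.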

Rosuav · October 30, 2023, 11:59pm

This is evaluated when bar is defined, not when it is called. If you have this return its argument, you will see that its id never changes.
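A minimal sketch of that check, reusing the foo and bar names from the earlier snippet and having bar return its argument:

```python
foo = []


def bar(arg=foo):
    return arg


first = bar()
foo = [1, 2, 3]  # rebinding the name does not touch the default stored on bar
second = bar()

same_object = first is second
default_unchanged = bar.__defaults__[0] is first
```

Both calls return the list that foo referred to when the def statement ran. The later rebinding of foo is invisible to bar.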

bryevdv · October 31, 2023, 12:00am

That’s my point… I am asking how on earth this is supposed to work differently [edit: without magic, which seems like a bad idea].

apalala · October 31, 2023, 12:05am

This is a possible approach to the semantics using magic (syntax pending).

>>> x = ...[]
>>> x.append(2)
AttributeError: blah blah

That means the deferred values are not left-hand sides for the embedded value.

>>> x = ...[]
>>> y = x + [1]
>>> y
[1]
>>> x = x + [2]  # overridden
>>> x
[2]
>>> z = x + [3]
>>> z
[2, 3]

The magical deferred values are evaluated when used on a right-hand side, and not on a left-hand side.

EDIT:

Python already makes special cases for literals of type int, float, str, dict versus set, etc. A new special case would be OK if it is “easy to explain”, and less complex.

This can be implemented with x = dlambda: [] with the dlambda being evaluated only when it appears on a rhs.

Rosuav · October 31, 2023, 1:02am

This is where the common colloquial use of “literal” to refer to [] and its friends actually proves itself to be false. These are not actually literals, and they don’t evaluate to specific values the way that int/float/str literals do.
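One way to see the distinction, sketched with the standard ast module: 42 parses to a constant node, whereas [] parses to a list display that builds a new object each time it is evaluated. The helper name fresh is made up for this example.

```python
import ast

# An int literal parses to a Constant node holding the value itself.
int_node = ast.parse("42", mode="eval").body

# A list display parses to a List node, an expression that constructs a list.
list_node = ast.parse("[]", mode="eval").body


def fresh():
    # Each evaluation of [] creates a distinct list object.
    return []
```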

That is not enough of a definition. See earlier in this thread. This is not a simple thing and anyone who says “why don’t you just…” is almost certainly missing something big. Simplified semantics like this result in paradoxes.

Can someone please take it on themselves to adopt the deferred expressions proposal, write it up in full, and take charge of defending it? This is NOT MY PROPOSAL. Do you see why I get tired of this? Every time PEP 671 gets mentioned, someone says “Why don’t you just make deferred expressions instead?” as if it’s something easy and would beautifully solve all of my problems.

stoneleaf · October 31, 2023, 1:39am

“every time bar is called”, with foo as a late-binding argument, would mean whatever foo is bound to at the time of bar’s call is what would be bound to arg.
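That reading can be emulated today with a sentinel, as a rough sketch. _MISSING is a hypothetical sentinel name, and the body of bar is invented for illustration.

```python
_MISSING = object()  # hypothetical sentinel meaning "argument not supplied"

foo = []


def bar(arg=_MISSING):
    if arg is _MISSING:
        arg = foo  # the name foo is looked up at call time
    return arg


first = bar()
foo = [1, 2]
second = bar()
```

Each call picks up whatever foo is bound to at that moment. It does not create a fresh list per call, so two calls made without rebinding foo still share one object.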

bryevdv · October 31, 2023, 4:39am

Yes. Not very useful towards the actual apparent goal of not having mutable defaults, which was my point.

stoneleaf · October 31, 2023, 2:18pm

That is not the goal. The goal is to have mutable defaults that are distinct per function call. (Perhaps a subtle point, but an important one.)

stoneleaf · October 31, 2023, 2:24pm

That’s a good point. If the default was to evaluate every time, then it could (?) be used for the late-binding use-case. If a particular deferred function only wanted to be evaluated once, it could cache its result for future calls.

apalala · October 31, 2023, 3:19pm

That would be a nightmare.

My idea was to devise “something” that works for literal constructors ([], {}), which are the source of unwanted bugs and confusion for newcomers.

apalala · October 31, 2023, 3:30pm

Yes. The magic should make the expression behave like a function call that produces a new value on every invocation.

In this expression:

x = defer []
y = x
assert y is x

The x gets bound to a new instance of [], exactly like in:

x = (lambda : [])()

dataclass already cheats over the result from field(), so it can cheat again with whatever solves the mutable default argument issue.

In fact, defer (or whatever the syntax) could be a special form of lambda that gets evaluated at will, on default argument resolution, on dataclass instantiation, or wherever it’s useful.

In the case of:

def f(x=defer [a, b]):

The a and b would be resolved at declaration time (least surprise), yet each invocation of the function would produce a new list, as if (lambda: [a, b])() is evaluated.
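Note that a plain lambda looks up a and b when it is called, not when it is defined. A sketch of the difference, using the existing default-argument trick to capture the values early (late_names and early_names are illustrative names only):

```python
a, b = 1, 2

late_names = lambda: [a, b]            # a and b are looked up on each call
early_names = lambda a=a, b=b: [a, b]  # current values captured at definition

a = 10

late_result = late_names()
early_result = early_names()
```

early_names matches the behaviour described above: the values are fixed at definition, yet every call builds a new list.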

jamestwebber · October 31, 2023, 3:35pm

Okay but “let’s add some magic” is not a specification or a path toward implementation. Nor is there any attempt at justifying the addition of new syntax for this feature. So it doesn’t seem like this is going anywhere?

Going back to the OP, I don’t see how any of the proposals would improve the type annotation, either.

Jelle · October 31, 2023, 3:39pm

Lists and dicts aren’t the whole problem though. I pretty regularly see functions with defaults like now=time.time(). Then the default is the time the function was imported, not the time the function was called, which is likely what the user wanted. Similarly, objects other than lists and dicts can be mutable and cause problems when used as defaults.
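A short sketch of that pitfall next to the usual None-sentinel workaround. The function names stamp_early and stamp_late are made up for illustration.

```python
import time


def stamp_early(now=time.time()):
    # The default was computed once, when the def statement ran.
    return now


def stamp_late(now=None):
    # The usual workaround: compute the default inside the body.
    if now is None:
        now = time.time()
    return now


early_a = stamp_early()
time.sleep(0.05)
early_b = stamp_early()

late_a = stamp_late()
time.sleep(0.05)
late_b = stamp_late()
```

stamp_early keeps returning the same timestamp, while stamp_late reads the clock on every call that omits the argument.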

apalala · November 7, 2023, 4:41pm

I let my subconscious work on it, and PEP 671 is correct.

The solution is a new assignment symbol that produces deferred/late-bound values.

It would solve the mutable default arguments case, and it would be useful in other cases when late binding is convenient.