Revisit Mutable Default Arguments

It should evaluate every time make_list() is called.

I think that’s what we’re after, even if it’s obvious that there hasn’t been a good-enough solution so far (less so when considering dataclass).

And we’re back to the magic. Somehow it needs to know that the function’s being called, which is a feature independent of deferred evaluation, and which makes the actual deferred evaluation completely irrelevant. Once you have it being re-evaluated for every call, any form of deferral just delays the evaluation from “the function’s being called, construct a new list” to “first use within the function triggers evaluation”.

Additionally, for anything with side effects, this would cause additional unnecessary confusion, either by being called after some other code, or perhaps not even being called at all.

And we’re back to the magic.

Distilling that notion:

foo = []

def bar(arg=foo): pass

What does it mean to “evaluate foo every time bar is called”, exactly? foo is already evaluated.

I agree that some kind of “magic” will be required.

I also agree that defer any_expression is a rabbit hole.

All we need is deferred literals for list, dict, and set, which are the only constructors supported by the current syntax,… along with str

This becomes a problem with deferred values:

def f(x=f'say {hello}!'):

Also this:

def f(x=[a, b, c]):

Could we get away with the semantics of copy.copy() for the semantics of mutable default arguments? The copy() part could be skipped for basic builtin types.
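For illustration, copy.copy() semantics can be approximated today with a decorator. This is only a sketch: the name copy_defaults is hypothetical, and it is not the decorator mentioned elsewhere in this thread.

```python
import copy
import functools
import inspect


def copy_defaults(func):
    """Hypothetical helper: pass a shallow copy of each unused default."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            has_default = param.default is not inspect.Parameter.empty
            if name not in bound.arguments and has_default:
                # The caller omitted this argument: copy the stored default.
                bound.arguments[name] = copy.copy(param.default)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@copy_defaults
def make_list(item, target=[]):
    target.append(item)
    return target


print(make_list(1))  # [1]
print(make_list(2))  # [2]
```

Since copy.copy() is shallow, a default such as [[]] would still share its inner list between calls.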

Yet copy() doesn’t solve this case:

def f(x=SomeObject(a, b, c)):

That’s why I think we shouldn’t go into the rabbit hole, and just provide new syntax (“magic”) that handles the [] and {} cases, which are the only constructors that produce mutable values.

Or just… Don’t? New syntax is expensive. There has to be sufficient justification for it.

In my mind there are two reasons to try to change this: to remove boilerplate code and to make things less confusing for new users.

The first case can be solved without any new syntax. Magic definitely isn’t going to solve the latter case.

How???

Magic definitely isn’t going to solve the latter case.

I must concede to that, because the semantics are already complex, and the fix would likely make them more so.

Still, we should come up with ideas to solve the ugliness in:

def f(x: list[Any] | None=None):

and:

@dataclass
class C:
   x: list[Any] | None = None
   y: list[Any] = field(default_factory=list)

It should be something like:

@dataclass
class C:
   x: list[Any] = ...[]
   y: list[Any] = ...[]

I don’t think that dataclass is an issue, because it already makes provisions for field(), so it could do them for anything else.

And now, this!

In [1]: x = []

In [2]: y = eval('x')

In [3]: y is x
Out[3]: True

With the decorator that I’ve already posted in this thread? That covers the mutable default case and allows for arbitrary default objects, not just a couple literals.

It doesn’t cover the dataclass case but that case already has other tooling. It doesn’t make type annotations nicer but I don’t think any of the proposed syntax changes do either.

This is evaluated when bar is defined, not when it is called. If you have this return its argument, you will see that its id never changes.
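A minimal sketch of that check, reusing the foo and bar names from the earlier snippet, with bar changed to return its argument:

```python
foo = []


def bar(arg=foo):
    # The default was bound to the list object when the def statement ran.
    return arg


first = bar()
first.append(1)
second = bar()

# Rebinding the name foo afterwards does not touch the stored default.
foo = ["something else"]
third = bar()

print(first is second is third)  # True
print(third)                     # [1]
print(bar.__defaults__)          # ([1],)
```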

That’s my point… I am asking how on earth this is supposed to work differently [edit: without magic, which seems like a bad idea].

This is a possible approach to the semantics using magic (syntax pending).

>>> x = ...[]
>>> x.append(2)
AttributeError: blah blah

That means that deferred values are not left-hand-sides for the embedded value.

>>> x = ...[]
>>> y = x + [1]
>>> y
[1]
>>> x = x + [2]  # overridden
>>> x
[2]
>>> z = x + [3]
>>> z
[2, 3]

The magical deferred values are evaluated when used on a right-hand side, and not on a left-hand side.

EDIT:

Python already makes special cases for literals of type int, float, str, dict versus set, etc. A new special case would be OK if it is “easy to explain”, and less complex.

This can be implemented with x = dlambda: [] with the dlambda being evaluated only when it appears on a rhs.

This is where the common colloquial use of “literal” to refer to [] and its friends actually proves itself to be false. These are not actually literals, and they don’t evaluate to specific values the way that int/float/str literals do.

That is not enough of a definition. See earlier in this thread. This is not a simple thing and anyone who says “why don’t you just…” is almost certainly missing something big. Simplified semantics like this result in paradoxes.

Can someone please take it on themselves to adopt the deferred expressions proposal, write it up in full, and take charge of defending it? This is NOT MY PROPOSAL. Do you see why I get tired of this? Every time PEP 671 gets mentioned, someone says “Why don’t you just make deferred expressions instead?” as if it’s something easy and would beautifully solve all of my problems.


“every time bar is called”, with foo as a late-binding argument, would mean whatever foo is bound to at the time of bar’s call is what would be bound to arg.
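That reading can be emulated in current Python with a sentinel. This is only a sketch, and _MISSING is a hypothetical name:

```python
_MISSING = object()  # hypothetical sentinel meaning "no argument was passed"

foo = []


def bar(arg=_MISSING):
    if arg is _MISSING:
        # Look the name foo up now, at call time, rather than at definition time.
        arg = foo
    return arg


before = bar()
foo = [1, 2, 3]  # rebind the name
after = bar()

print(before)          # []
print(after)           # [1, 2, 3]
print(bar() is bar())  # True: same object on every call until foo is rebound
```

Late-binding the name picks up rebinding, but it does not produce a fresh list on each call.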


Yes. Not very useful towards the actual apparent goal of not having mutable defaults, which was my point.

That is not the goal. The goal is to have mutable defaults that are distinct per function call. (Perhaps a subtle point, but an important one.)

That’s a good point. If the default was to evaluate every time, then it could (?) be used for the late-binding use-case. If a particular deferred function only wanted to be evaluated once, it could cache its result for future calls.

That would be a nightmare.

My idea was to devise “something” that works for literal constructors ([], {}), which are the source of unwanted bugs and confusion for newcomers.

Yes. The magic should make the expression behave like a function call that produces a new value on every invocation.

In this expression:

x = defer []
y = x
assert y is x

The x gets bound to a new instance of [], exactly like in:

x = (lambda : [])()

dataclass already cheats over the result from field(), so it can cheat again with whatever solves the mutable default argument issue.

In fact, defer (or whatever the syntax) could be a special form of lambda that gets evaluated at will, on default argument resolution, on dataclass instantiation, or wherever it’s useful.

In the case of:

def f(x=defer [a, b]):

The a and b would be resolved at declaration time (least surprise), yet each invocation of the function would produce a new list, as if (lambda: [a, b])() were evaluated.
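A rough emulation of those semantics in today's Python, as a sketch only: make_default and _MISSING are hypothetical names, and the lambda's own default arguments are what capture a and b at declaration time.

```python
a, b = 1, 2

# The lambda's default arguments freeze the current values of a and b.
make_default = lambda a=a, b=b: [a, b]

_MISSING = object()  # hypothetical sentinel meaning "no argument was passed"


def f(x=_MISSING):
    if x is _MISSING:
        x = make_default()  # a new list on every call
    return x


a = 100  # rebinding a later has no effect on the captured value

first = f()
second = f()
print(first)            # [1, 2]
print(first is second)  # False
```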

Okay but “let’s add some magic” is not a specification or a path toward implementation. Nor is there any attempt at justifying the addition of new syntax for this feature. So it doesn’t seem like this is going anywhere?

Going back to the OP, I don’t see how any of the proposals would improve the type annotation, either.

Lists and dicts aren’t the whole problem though. I pretty regularly see functions with defaults like now=time.time(). Then the default is the time the function was imported, not the time the function was called, which is likely what the user wanted. Similarly, objects other than lists and dicts can be mutable and cause problems when used as defaults.
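A small sketch of the now=time.time() trap next to the usual None workaround. The function names are made up for the example:

```python
import time


def stamp_early(now=time.time()):
    # time.time() ran once, when the def statement was executed.
    return now


def stamp_late(now=None):
    if now is None:
        now = time.time()  # evaluated on every call that omits now
    return now


t1 = stamp_early()
time.sleep(0.05)
t2 = stamp_early()
print(t1 == t2)  # True: both calls return the definition-time timestamp

l1 = stamp_late()
time.sleep(0.05)
l2 = stamp_late()
print(l2 - l1)  # roughly the sleep duration
```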


I let my subconscious work on it, and PEP 671 is correct.

The solution is a new assignment symbol that produces deferred/late-bound values.

It would solve the mutable default arguments case, and it would be useful in other cases when late binding is convenient.
