Revisit PEP-3113

With Python 3, parameter unpacking in function prototypes was removed. The justification given (PEP 3113) was that introspection was difficult. However, in the process, a really expressive capability was lost.

What PEP-3113 didn’t consider* was extending function metadata to support the missing introspection capability. [Note: that introspection capability wasn’t really missing per se, but the CPython implementation depended upon bytecode parsing.]

Given that Python seems to be moving towards more pattern-matching capabilities (with things like the match statement), I wonder if there’s much interest in reinstating destructuring matching for method parameters - together, perhaps, with filling out the missing metadata to continue to support introspection?

(* at least, there’s no mention of it)

7 Likes

I wasn’t around back in 2007 when PEP 3113 was written, but the document indicates that the feature was very rarely used.

Could you elaborate on what was lost? The PEP itself says there was no loss in functionality, since

def f(a, (b, c)):
    return c

could be rewritten

def f(a, b_c):
    (b, c) = b_c
    return c

or other similar equivalents.

With the advent of type annotations, this is arguably even further improved. You can write

def f(a: object, b_c: tuple[object, object]): ...

if that is your intent.

If it is useful, a new discussion is worth having. But you need to prove why it’s worth adding today, as a brand new feature. The reasons that it was removed in the past might not apply anymore. In that case, what has changed such that it is newly important to add as syntax?

match-case is not direct evidence in favor of the feature. Python doesn’t support destructuring arbitrary types in arbitrary contexts – only unpacking in specific places and __match_args__ in case statements. It’s not unrelated, but if this feature is useful it should be arguable on its own merits.

1 Like

I think it’s important to draw a distinction between functionality and expressiveness.

It might be rarely used in the stdlib, but if you are doing anything like computational geometry (which doesn’t show up there) then it’s incredibly useful to manage parameter unpacking.

This isn’t so much “newly important”, just an unfortunate pain that crops up every time. The PEP was driven by the view that introspection is far more important than expressiveness in function declarations. I’d argue that’s probably not actually true; and in any case, the document never mentions, even under “considered and rejected”, the option of actually fixing function metadata to provide the missing capabilities to the introspection module.

As you say, I can write

def f(a: object, b_c: tuple[object, object]):
    (b, c) = b_c

but my intent is actually to write

def f(a: object, (b, c): tuple[object, object]):

which doesn’t have the clutter.

4 Likes

Sure, I can easily agree. But the question of “what are you trying to express?” is still what I’m getting at.

Where and how often does this naturally arise? You say “computational geometry” (which I know nothing about, beyond what the name implies), so are there examples of this in numeric libraries?

We know the stdlib doesn’t cover all domains. But it’s often used as a proxy for a kind of “developer survey” because it’s not feasible to reach out to all Python developers for feedback on each language decision. I’m looking for some evidence that there’s a group of developers for whom this would be valuable – obviously we know of one, but more than one is necessary to justify a feature.

I think I prefer

def f(a: float, (b: float, c: float)):

Or maybe, if the annotation goes after the tuple, it could refer to the elements of the tuple, like (b, c): float, although I guess that only works if they are homogeneously typed.

As for what this is needed for: sometimes you have a lot of functions that pass tuples around, where the first thing every function does is unpack the tuples into new names. I assume the reference to computational geometry just means there are lots of tuples (or unpackables) of things like coordinates being passed around, e.g.:

def distance1(point1: Point2D, point2: Point2D) -> float:
    return ((point1[0] - point2[0]) ** 2 + (point1[1] - point2[1]) ** 2) ** 0.5

def distance2(point1: Point2D, point2: Point2D) -> float:
    ((x1, y1), (x2, y2)) = (point1, point2)
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

def distance3((x1, y1): Point2D, (x2, y2): Point2D) -> float:
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
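
The first two variants can be checked directly today; a minimal runnable sketch, assuming Point2D is just an alias for tuple[float, float] (the proposed distance3 syntax is of course not yet valid):

```python
Point2D = tuple[float, float]

def distance2(point1: Point2D, point2: Point2D) -> float:
    # manual unpacking in the body, as required today
    ((x1, y1), (x2, y2)) = (point1, point2)
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

print(distance2((0.0, 0.0), (3.0, 4.0)))  # → 5.0
```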

10 Likes

I mourned the loss just last month:

14 Likes

I’d be pretty happy to have this functionality back. I don’t need it, but it does more succinctly describe intent in a few cases where a named tuple might be used otherwise (and there are other downsides to a named tuple worth avoiding, it has a bigger api surface than what’s actually desired for any sort of semantic versioning)

4 Likes

That’s useful to see as context, thanks!

Allowing lambda(x, y): ... may conflict with ideas being batted around for using lambda(...) in a more expressive Callable syntax; see PEP 677 (Callable Type Syntax), which aimed for an easier-to-parse and more expressive syntax.
It’s not a deal-breaker, just something to be aware of.

I prefer this as well. The tuple[object, object] example was, I think, for parity between (b, c) and b_c in context.

Requiring type homogeneity is already somewhat messy with *args and **kwargs.

I think the benefits here don’t justify the increased noise in the function signature.

3 Likes

Exactly. It was a definite loss in expressiveness for ad-hoc lambda functions, where a separate unpacking statement is impossible.

To address the introspection concerns, I propose reimplementing tuple parameters by storing each one in co_varnames as a tuple expression enclosed in parentheses, while also storing each name referenced in those tuple expressions in co_varnames, positioned after all parameters but before the other local variables.

So that:

def f(a, (b, c), d):
    e = 1

assert f.__code__.co_varnames == ('a', '(b, c)', 'd', 'b', 'c', 'e')
assert f.__code__.co_argcount == 3
assert f.__code__.co_nlocals == 5

is compiled like:

def f(a, b_c, d):
    (b, c) = b_c
    e = 1

except b_c is never stored as a name.
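For comparison, this is what current CPython reports for the manual-unpacking equivalent, so only the '(b, c)' entry and the nlocals accounting would change under the proposal:

```python
def f(a, b_c, d):
    (b, c) = b_c
    e = 1

# today: parameters first, then other locals in order of first assignment
print(f.__code__.co_varnames)  # → ('a', 'b_c', 'd', 'b', 'c', 'e')
print(f.__code__.co_argcount)  # → 3
print(f.__code__.co_nlocals)   # → 6
```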

When I did my company’s Python 3 migration, I felt I had to make some code uglier because of the loss of this feature. For example, sorting a list of (length, width) tuples would be nicer as sorted(pairs, key=lambda (length, width): width) than sorted(pairs, key=lambda pair: pair[1]). But Python 2 is long gone now and this need doesn’t come up that often. We’re probably not going to add the Python 2 syntax back. Maybe there’s a way to get a more general match-like syntax into function definitions.
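Today’s workarounds for that key function look something like this (the sample data is made up):

```python
from operator import itemgetter

pairs = [(10, 3), (4, 7), (8, 1)]  # hypothetical (length, width) tuples

# index-based lambda: terse but opaque
by_width = sorted(pairs, key=lambda pair: pair[1])

# operator.itemgetter avoids the lambda but still loses the name "width"
assert sorted(pairs, key=itemgetter(1)) == by_width

print(by_width)  # → [(8, 1), (10, 3), (4, 7)]
```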

6 Likes

I think the need doesn’t appear to come up often because people have worked around the loss by designing callbacks in newer APIs with flattened parameters instead, even when structured parameters would make parameter groupings clearer.

I did consider generalizing my proposal above to incorporate a match-like syntax as well, but couldn’t find an elegant way to fit annotations into it. Limiting the syntax to just tuple unpacking, as suggested by @oscarbenjamin, makes annotations a natural fit.

If we wanted match-case for def, I’d think of it in terms of “how to define match-def or case-def”, in which the cases are different suites for the function body. This is aligned with the functional languages which have destructuring matches baked into function definition syntax.

Although I’m not sure how much I like the results yet, playing with this looks interesting to me.

My first thought is to inline match def to mean “a function whose body starts with a match on its parameters”, e.g.,

match def simple_euclidean(
    p0: tuple[float, float],
    p1: tuple[float, float],
) -> float:
    case ((x0, y0), (x1, y1)):
        return math.sqrt(
            (x1 - x0) ** 2 + (y1 - y0) ** 2
        )
    case _:
        raise TypeError(f"Expected two 2-tuples, got: {p0!r}, {p1!r}")

That is practically the same as we can already write today, so it doesn’t seem to me like anyone would be satisfied with it. But perhaps the match itself should be dropped, so that the def line is the pattern match?

case def simple_euclidean(
    (x0: float, y0: float), (x1: float, y1: float)
) -> float:
    return math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2)
case def simple_euclidean(
    (x0: float, y0: float, z0: float), (x1: float, y1: float, z1: float)
) -> float:
    return math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)

That reads very nicely to me[1], but I don’t like the repetition of def simple_euclidean and the return type.
And I wonder about what case _ should mean (or if it should even be supported).

If we go back to match def, but use case on the parameters, some interesting possibilities emerge:

match def simple_euclidean -> float:
    case (
        (x0: float, y0: float), (x1: float, y1: float)
    ):
        return math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2)
    case (
        (x0: float, y0: float, z0: float), (x1: float, y1: float, z1: float)
    ):
        return math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)

What I like about this option is that, with a little tweak to that return type annotation, it allows you to express things unrelated to the unpacking case which currently require typing.overload.

class StringHolder:
    match def get_value:
        case (self, key: str) -> str:
            return self.data[key]
        case (self, key: str, *, converter: Callable[[str], T]) -> T:
            return converter(self.data[key])

I like this last one. It’s the sort of code that I’ve felt subtle pressure to stop writing as type annotations have become part of my day-to-day, since the overload syntax makes it “too costly” to be worth combining APIs where it’s easily avoidable. (Instead I’ll write two methods, get_value and get_and_convert_value, or whatever is appropriate to the problem space.)

As for case _, I omit it because the natural default here would be a TypeError. Arguably, the fallthrough case here would be *args, **kwargs anyway, so we could write things like…

class StringHolder:
    match def get_value:
        case (self, key: str) -> str:
            return self.data[key]
        case (self, key: str, *, converter: Callable[[str], T]) -> T:
            return converter(self.data[key])
        case (self, key: str, *args: Any, converter: Callable[[str], T], **kwargs: Any) -> T:
            warnings.warn(
                f"unrecognized parameters detected: {args!r}, {kwargs!r}"
            )
            return converter(self.data[key])
        case (self, key: str, *args: Any, **kwargs: Any) -> str:
            warnings.warn(
                f"unrecognized parameters detected: {args!r}, {kwargs!r}"
            )
            return self.data[key]

Existing match-case unpacking covers parts of this, but I think the dict unpacking requirements for keyword arguments would be largely new.


  1. I will accept your stones, go ahead and throw them! :grin: ↩︎

2 Likes

Why would well-named, structurally meaningful parameters be considered “noise”?

Having to come up with names for intermediate parameters just so they can be unpacked into useful names in the first statement, or worse, be referenced by index in case of a lambda, is what I would call noise.

3 Likes

Yes, this is the real loss. For full def suites it’s not a big deal, but the effect on lambdas makes some code noticeably less readable. I always thought it was a bad idea to remove this functionality.

1 Like

How would function and code objects need to be redesigned so that introspection of function signatures remains viable with this syntax? And how should names in non-tuple structured patterns (class attributes, dict values, etc.) be annotated?

1 Like

My instinct on this, if we even think it’s worth pursuing, is that there are multiple distinct function objects, bundled in a wrapper (which is not, itself, a function). So introspection provides a tuple of functions. I think that expresses the idea best.

Mixing in type annotations doesn’t seem as difficult to me as turning arbitrary cases into signatures. That’s something for which I don’t have a ready answer.
And I’m not sure if the idea composes well with decorators in the way I’ve phrased it.

Even with these open questions, match-case as a flavor of multiple dispatch strikes me as interesting.

I like the idea of match def. It helps unify the @typing.overload and match syntaxes. I have written overloads similar to this:

from typing import overload

class RGBAColor:
    @overload
    def __init__(self, rgba: int, /) -> None: ...

    @overload
    def __init__(self, hex_str: str, /) -> None: ...

    @overload
    def __init__(self, r: int, g: int, b: int, a: int = 255, /) -> None: ...

    def __init__(self, *args) -> None:
        """Return an RGBAColor instance. Accepts #RRGGBBAA hex code, a hexadecimal 32-bit integer representing that value, or separate color components.
        """
        match args:
            case [int(rgba)]:
                # do something with hex integer
                self._rgba = rgba
            case [str(s)]:
                self._rgba = int(s.removeprefix("#"), base=16)
            case [int(r), int(g), int(b)]:
                self._rgba = r << 24 | g << 16 | b << 8 | 0xFF
            case [int(r), int(g), int(b), int(a)]:
                self._rgba = r << 24 | g << 16 | b << 8 | a
            case _:
                raise TypeError("Unrecognized arguments")

This is a simplified example (I haven’t included the keyword-argument variant yet :melting_face:). When writing function overloads, repeating parameter names and types in both the overloads and the match cases is error-prone and hard to keep in sync. In RGBAColor’s case I end up making these separate factory methods to avoid dancing around within the constructor, but I can imagine use cases (like the geometry mentioned above) in which many different shapes and sizes need to be unpacked. Even if match def probably won’t get the full functionality of match statements (such as guards, case int(foo) if foo > 42:) nor support for many of the special forms of type hints (only the isinstance-able ones, no generics), I think it’s still worth a try.

One thing to note is that doing this will somewhat couple typing constructs with syntax. Also, as noted above, runtime introspection will need to be enhanced; right now we have typing.get_overloads, which returns a sequence of the original function declarations that can be inspected for signatures.

I think the overloads just make it clear that it is often better to have separate functions rather than squeezing them into a single one. When I’ve seen this pattern in functional languages with analogues of match def, it is used to dispatch on values rather than types.

I know this is a toy demo example but I have found myself looking at real code like this and thinking things like: why would anyone need that converter parameter in the first place? You don’t even need get_and_convert_value because it is better for the caller to write

x = convert(obj.get_value(key))

rather than

x = obj.get_value(key, converter=convert)

This is my general experience as well, but when overriding special methods this isn’t possible.

In this case we will need match def to be able to work with values and shapes as well as types.

But there is a problem with this approach: an argument value that does not match should raise ValueError, not TypeError as proposed, just as a too-short input sequence should raise IndexError and a dict missing a certain key should raise KeyError. The current match statement mechanism does not offer a way to distinguish between these mismatches, and a case _: raise TypeError is too confusing if we start matching on things other than types. Using separate functions avoids the problem of confusing exceptions being reported, since each function simply raises its exception of choice.