Expand structural pattern matching-like syntax to assignments/generalized unpacking

blhsing · March 6, 2023, 1:59pm

With the introduction of structural pattern matching (proposed in PEP 622, later superseded by PEPs 634, 635, and 636), we have an elegant and succinct syntax for matching a value against a class while also assigning attribute values of the instance to variables, e.g.:

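from dataclasses import dataclass

@dataclass  # generates __init__ and __match_args__, so Point(0, 1) and positional patterns work
class Point:
    x: int
    y: int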

p = Point(0, 1)
match p:
    case Point(x=0, y=y):
        print(f"Y={y} and the point is on the y-axis.")

Would it not be nice if we could have such a match-capture-and-assign mechanism in regular assignments as well? Think of it as more generalized unpacking.

Instead of:

assert isinstance(p, Point)
x = p.x
y = p.y

we can have:

Point(x, y) = p

or by keyword:

Point(x=x, y=y) = p

Instead of:

assert isinstance(p, Point) and p.x == 0
y = p.y

we can have:

Point(0, y) = p

Instead of:

assert isinstance(p, Point)
y = p.y

we can have:

Point(_, y) = p

Mapping patterns can be supported too, and like structural pattern matching, extra keys can be ignored.

Instead of:

p = {'x': 0, 'y': 1, 'z': 0}
x = p['x']
y = p['y']

we can have:

p = {'x': 0, 'y': 1, 'z': 0}
{'x': x, 'y': y} = p

Unlike a match-case construct, we always have only one matching pattern on the left in an assignment, so would it not be nice if we can also perform type conversions during a match?

Instead of:

import csv

for name, x, y in csv.reader(['top,0,1', 'right,1,0', 'top-right,1,1']):
    x = float(x)
    y = float(y)
    print(f'distance to {name} is {(x * x + y * y) ** .5}')

we can have:

for name, float(x), float(y) in csv.reader(['top,0,1', 'right,1,0', 'top-right,1,1']):
    print(f'distance to {name} is {(x * x + y * y) ** .5}')
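As a point of reference, something close to this can be written in current Python by converting inside a generator expression before unpacking. This is one possible workaround, not the proposed syntax; the names `rows`, `converted`, and `distances` are chosen here for illustration.

```python
import csv

rows = csv.reader(['top,0,1', 'right,1,0', 'top-right,1,1'])

# Convert the numeric fields while the rows stream through, so the loop
# target only ever sees floats.
converted = ((name, float(x), float(y)) for name, x, y in rows)

distances = {name: (x * x + y * y) ** .5 for name, x, y in converted}

for name, distance in distances.items():
    print(f'distance to {name} is {distance}')
```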

Unlike structural pattern matching, an assignment with just classes or literals but no variables on the left should not be allowed:

Point(0, 1) = p # error
{'x': 0, 'y': 1} = p # error

jeanas · March 6, 2023, 2:48pm

The first thing that’s gonna be a problem is _ = x, since that’s currently valid but assigns to the variable _.

You can special-case it. It’s going to be a bit inconsistent, but I won’t express strong opinions.

barry-scott · March 6, 2023, 3:08pm

Using assert seems wrong here. asserts can be compiled out.

I would have expected to see an exception raised: ValueError or TypeError, as appropriate.
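To illustrate the point about asserts being compiled out, here is a small sketch using the `optimize` argument of the built-in `compile()`, which mirrors what `python -O` does. The helper `outcome` and the two source strings are made up for this demonstration, and a tuple check stands in for the `Point` check.

```python
ASSERT_SRC = "assert isinstance(p, tuple)"
RAISE_SRC = (
    "if not isinstance(p, tuple):\n"
    "    raise TypeError('p must be a tuple')"
)


def outcome(source, optimize):
    """Compile and run `source` with p = 42; report which error occurs."""
    code = compile(source, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {"p": 42})
    except (AssertionError, TypeError) as exc:
        return type(exc).__name__
    return "no error"


print(outcome(ASSERT_SRC, 0))  # AssertionError
print(outcome(ASSERT_SRC, 1))  # no error (the assert was stripped)
print(outcome(RAISE_SRC, 1))   # TypeError (an explicit check survives)
```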

NeilGirdhar · March 6, 2023, 6:19pm

Just want to add that this new form would be better for typing. In the original form, x_f = float(x) would need a new name to keep type checkers from rightly complaining.

jeanas · March 6, 2023, 6:29pm

I don’t understand. case float(x): checks the subject for being a float and binds it to x, it doesn’t perform a conversion.

steven.daprano · March 6, 2023, 10:38pm

No, I don’t think it would be nice.

I think the premise here is wrong. The equivalent of this match/case:

match p:
    case Point(x, y):
        ...

is not the assertion that p is a Point. So your assertion here is invalid:

assert isinstance(p, Point)
x = p.x
y = p.y

The actual equivalent would be:

if isinstance(p, Point):
    x = p.x
    y = p.y

Otherwise, we’re not replicating the same behaviour from pattern matching, but inventing a confusingly almost-the-same-but-different behaviour.

match...case gives us the match and case keywords to signal that something special is happening. Point(x, y) = p looks confusingly like p = Point(x, y), and it gives no clue to the reader what the behaviour is when p is not a Point.

You can’t have it both ways: you can’t have this Point(x, y) = p both match on p being a Point, and also do a conversion from some arbitrary object p to a Point.

The syntax can do one, or it can do the other. It can’t do both.

layday · March 7, 2023, 6:52am

Not very different from unpacking vs matching an iterable. Structural assertion vs no-match; type assertion vs no-match.

Rosuav · March 7, 2023, 7:13am

Current unpacking logic is “do the assignment, and if something fails, throw an exception”. But that is completely different from this:

and distinctly different from a match with a single case in it, which will simply do nothing if it doesn’t match. The closest equivalent to an inline assignment would not be the single case clause, but two:

@dataclass
class Point:
    x: int
    y: int

p = Point(0, 1)
match p:
    case Point(x=0, y=y):
        print(f"Y={y} and the point is on the y-axis.")
    case _:
        raise TypeError

These semantics could be seen as broadly equivalent to the assert-and-assign described. But there would be no conversion involved, and IMO there shouldn’t be.

If this sort of syntax were to be added, I would want to see the match keyword used in it somewhere, since the semantics of match/case are not quite identical to sequence unpacking. Otherwise, there’d be this weird disconnect regarding the _ special name (which is just a variable name elsewhere), which is bound to cause very very subtle confusion somewhere, possibly in a project that uses I18n tools. There’s also a notable, though perhaps also subtle, distinction between these two constructs:

[x, y, z] = iterable

vs

match iterable:
    case [x, y, z]: pass
    case _: raise TypeError

in that the first one will accept any iterable whatsoever, attempt to retrieve four values from it, and if it gets precisely three, assigns them; the second will ONLY match a sequence of length three. (Try this using an iterable of (print("Hello, world") for _ in range(3)) and then vary the length; it’ll never match the case statement, but always be attempted for the unpacking.)

layday · March 7, 2023, 11:15am

You were responding to, and I was responding to your response to:

Would it not be nice if we can have such a match-capture-and-assign mechansim in regular assignments as well?

I don’t know why you are quoting the type conversion proposition; I wasn’t commenting on that. What I’m saying is that a hypothetical:

@dataclass
class Point:
    x: int
    y: int

Point(x, y) = some_other_object  # Throws `TypeError`; `some_other_object` is not a `Point`

… has a very obvious parallel:

a_list = [1]
[a, b] = a_list  # Throws `ValueError`; `a_list` is a one-item list

In both cases, an “assertion” is (would be) performed on assignment, whereas a match case would fall through. The argument that we would be “inventing a confusingly almost-the-same-but-different behaviour” may be valid but it is also already the case for iterables. Please correct me if I misunderstood your point.

Edit: ahem, might’ve got you and Steven mixed up.

Rosuav · March 7, 2023, 11:31am

layday:

What I’m saying is that a hypothetical:

@dataclass
class Point:
    x: int
    y: int

Point(x, y) = some_other_object  # Throws `TypeError`; `some_other_object` is not a `Point`

… has a very obvious parallel:

a_list = [1]
[a, b] = a_list  # Throws `ValueError`; `a_list` is a one-item list

In both cases, an “assertion” is (would be) performed on assignment, whereas a match case would fall through. The argument that we would be “inventing a confusingly almost-the-same-but-different behaviour” may be valid but it is also already the case for iterables. Please correct me if I misunderstood your point.

Yes, but they’re still going to be confusingly similar, since the syntax in a match statement is NOT the same as the almost identical syntax in unpacking assignment. It’s true that they behave identically in the example you gave, but consider this:

>>> a_thing = iter([1, 2])
>>> [a, b] = a_thing # works
>>> a_thing = iter([1, 2]) # reinitialize since it was consumed
>>> match a_thing:
...     case [a, b]:
...         print(a, b)  # nope, doesn't match
...

When you do “[a, b] = thing”, Python does a two-element unpack of thing (which iterates over it three steps, and will fail if either it stops short of two or if it yields a third item), then assigns them. It will succeed or fail based on the actual results of iteration.

But “case [a, b]:” in a match statement is specifically a sequence unpack. It first queries the sequence to see if it has length 2. If it does, it THEN unpacks the sequence and assigns it. So it will never match something that is iterable but isn’t a sequence.

This is the sort of subtle difference that means that arbitrarily extending assignment to support types of match/case structures is going to create weird edge cases. And that’s why, in my opinion, it would be better to use the match keyword in the one-line assignment, such as “match [a, b] = some_seq” - that way, you guarantee that it’s using match/case semantics rather than regular unpacking.

Ah, yeah, we’re practically the same person. Happens all the time.

steven.daprano · March 8, 2023, 1:51am

Sure, but that’s not the behaviour of the match...case statement. Since this proposal is being explicitly described as pattern matching generalised to assignments, it is relevant that the behaviour is not the same as pattern matching.

I think it also hurts your case to keep mentioning “assertions” since the critical feature of assertions is that they can be turned off. This would have to include an implicit type check, not an assertion.

So rather than modelling Point(x, y) = p as equivalent to

match p:
    case Point(x, y): pass

we’re modelling it as:

if isinstance(p, Point):
    x = p.x
    y = p.y
else:
    raise TypeError

(This can be re-written as a match with two cases, but not one case.)

One problem I see is that iterable unpacking has many obvious applications for builtin types, but it’s harder to see useful applications for this when it comes to builtins. What are we going to do, write something like this:

float(x) = y

That’s just isinstance and raise in disguise.

Sure, it works in match statements, but taken out of the context of multiple cases, it’s a bit too implicit and not useful enough.

I don’t know, I’m slightly warming to the idea, but I don’t know that I’m warm enough to support adding yet more syntax to the language.

Rosuav · March 8, 2023, 2:08am

Steven D'Aprano:

One problem I see is that iterable unpacking has many obvious applications for builtin types, but it’s harder to see useful applications for this when it comes to builtins. What are we going to do, write something like this:
float(x) = y
That’s just isinstance and raise in disguise.

True, that’s a fairly weak example. But what about this? (I’m adding the match keyword because IMO it needs to be there due to the semantic differences.)

data = json.loads(some_message)
match {"cmd": command, "msg": message} = data

Dictionary unpacking gets requested periodically, and while I’m hardly convinced that the best way to do dict unpacking is to implement all of match/case’s power in an assignment statement, it is certainly a viable way to do it.