Pattern Matching | Add Support for Set Literals

Abstract:

The current implementation of Python’s structural pattern matching does not support direct matching against set literals, limiting concise and expressive pattern matching with sets. This proposal suggests enhancing pattern matching to natively support set literals, including patterns that capture elements, while acknowledging the inherent limitations of working with unordered collections like sets.

Background:

Structural pattern matching, introduced in Python 3.10, allows for expressive and readable code when dealing with complex data structures. However, it currently does not accommodate direct matching with set literals or patterns that capture elements within sets. Sets are unordered collections, which introduces specific constraints on how pattern matching could work with them.

Problem Statement:

Attempting to use pattern matching with set literals and capture patterns results in a SyntaxError. For example:

example = {1, 2, 3, 4}

match example:
    case {1}: # SyntaxError
        print("Contains only 1")
    case {1, other}:
        print(f"Contains 1 and another element: {other}")
    case {1, 2, *other}:
        print(f"Contains 1, 2, and other elements: {other}")
    case _:
        print("Other set")

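This code fails to compile with a SyntaxError at the first case clause, because braces in a pattern are currently parsed as a mapping pattern, which requires key: value pairs.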

Given that sets do not maintain any specific order, certain pattern matching syntaxes that work with ordered collections (such as lists or tuples) cannot be directly applied to sets. Specifically:

  • Positional matching with multiple placeholders (e.g., case {a, 2, 3, b} or {a, 2, 3, *b}) is not valid for sets because there is no defined order in sets. In this case, it is unclear what element should be assigned to the positional variable a and what should be placed into the catch-all variable *b. This ambiguity arises due to the unordered nature of sets.
  • Type checking can still be used in patterns, like case {int(a), 2, 3}:, but only in patterns that contain a single capturing variable (i.e., not combined with further captures or with a *args-like catch-all).
  • The catch-all pattern (*other) can appear at most once in a set pattern and cannot be combined with other capture patterns: with no guaranteed order, there is no way to decide which elements go to a named capture and which to the catch-all, so sets cannot be “unpacked” in the same way as ordered collections like lists or tuples.
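To make the ambiguity concrete, here is a minimal sketch that enumerates the bindings a hypothetical pattern like {a, 2, 3, b} would allow. The helper candidate_bindings and its parameters are invented for illustration and are not part of any proposal or of Python itself.

```python
from itertools import permutations


def candidate_bindings(subject, literals, names):
    """List every way to bind `names` to the elements of `subject` not in `literals`.

    Hypothetical helper, only meant to illustrate the ambiguity.
    """
    if not literals <= subject:
        return []  # a required literal is missing, so nothing matches
    leftover = subject - literals
    if len(leftover) != len(names):
        return []  # wrong number of elements for the captures
    # Sort by repr only to make the output order reproducible.
    ordered = sorted(leftover, key=repr)
    return [dict(zip(names, perm)) for perm in permutations(ordered)]


# Hypothetical pattern {a, 2, 3, b} against the subject {1, 2, 3, 4}
bindings = candidate_bindings({1, 2, 3, 4}, {2, 3}, ("a", "b"))
print(bindings)  # [{'a': 1, 'b': 4}, {'a': 4, 'b': 1}]
```

Two equally valid bindings come out, and nothing in the set says which one to pick. With a single capture there is exactly one candidate, which is why a pattern like {1, other} stays well defined.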

Proposal:

Extend the pattern matching syntax to support set literals and capture patterns, enabling expressions such as:

match example:
    case {1}:
        ...  # Handle case where example is {1}
    case {1, other}:
        ...  # Handle case where example contains 1 and another element
    case {1, 2, *rest}:
        ...  # Handle case where example contains 1, 2, and other elements
    case _:
        ...  # Handle all other cases

However, the following limitations should be clearly noted:

  • Positional pattern matching for multiple placeholders (e.g., {a, b, c}) is not valid for sets. The unordered nature of sets means there is no guarantee of the positions of elements. Therefore, it is unclear where to place each element in the pattern, making positional matching impossible.
  • The “catch-all” pattern (*other) can be used at most once and not together with other capture patterns, as there is no defined order of elements in sets. This limitation ensures there is no ambiguity about how the elements are matched.
  • Type checking can be applied, e.g., case {int(a), 2, 3}:, but without the full flexibility that ordered collections (like lists or tuples) offer. Open for discussion.

Benefits:

  • Conciseness: Eliminates the need to convert sets to other data types for pattern matching.
  • Readability: Provides clear and direct expressions of intent when working with sets.
  • Consistency: Aligns pattern matching capabilities across different collection types, even if sets introduce some specific limitations due to their unordered nature.

Reference:

Discussion:

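Implementing this feature requires careful consideration of backward compatibility and potential parsing ambiguities. Braces already introduce mapping patterns, and case {}: is currently valid and matches any mapping, so it could not be reinterpreted as an empty-set pattern without changing existing behavior. Since sets do not guarantee order, it is also important to ensure that the semantics of pattern matching align with the characteristics of sets.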

Conclusion:

Enhancing Python’s pattern matching to support set literals and capture patterns will improve the language’s expressiveness and consistency, benefiting developers working with set data structures, while acknowledging the limitations due to the unordered nature of sets.


Thanks for a well-explained proposal. The biggest thing it’s missing is a justification for the feature - you’ve taken the position that having pattern matching on sets is obviously something that’s worth having, and I don’t think that assumption is valid.

Can you give some examples, ideally from real-world code, where pattern matching on sets is a significant improvement? The limitations imposed by the unordered nature of sets make many of the destructuring benefits of general match statements unavailable, so you’re left with a weak “consistency” argument for preferring a match statement over the equivalent chain of if statements.

Furthermore, for the destructuring that is still possible, I’m not sure how practically useful it is. For example,

case {1,2,other}:

is basically saying “if a set contains 3 elements, two of which are 1 and 2, give me the other one”. I can’t think of a case where I’ve ever needed to do anything like that, to the point where I don’t even know how I’d write it in current Python code. Something like:

if {1,2} < example:
    other, = example.difference({1,2}) # Exception if there's more than 3 elements

maybe? That’s pretty clumsy[1], and while that argues that destructuring would be useful to replace it, it also argues that no-one needs to do this, as otherwise we’d have had demands for something better than what we have right now for ages…
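For anyone who wants to try it, the snippet above can be wrapped in a small function. This is just a sketch; the name other_element and the known parameter are made up here.

```python
def other_element(example, known=frozenset({1, 2})):
    """Return the single element of `example` outside `known`.

    Returns None when `known` is not a proper subset of `example`.
    """
    if known < example:  # proper subset: at least one extra element exists
        # Unpacking raises ValueError unless exactly one element remains.
        (other,) = example - known
        return other
    return None


print(other_element({1, 2, 3}))  # 3
print(other_element({1, 2}))     # None
```

Calling other_element({1, 2, 3, 4}) raises ValueError, which is the exception mentioned in the comment in the original snippet.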


  1. as well as being non-obvious - my first attempt was way less clean!


+1, I think that would be a useful addition, since sets currently fall through the cracks of match only supporting sequences and mappings, and sets being neither of those.

To Paul’s points:

  • Dictionaries don’t have order either (well, in theory at least; they are ordered now in Python), and match does support mappings without looking for a particular order.

  • The *rest use case (spelled **rest in mapping patterns) is actually very common when using match with dictionaries: you essentially say that any mapping containing the given key-value pairs matches, and that you want all the remaining entries placed into the capturing variable rest.
