Relax requirement for sequence pattern matches in PEP 622

Currently, the implementation requires the subject to be an instance of collections.abc.Sequence for sequence pattern matching to work. That seems excessive, since all it needs is unpacking.

For example, NumPy’s arrays cannot be used in sequence patterns, although they do support unpacking, which was surprising to me as a user (see “ndarray should derive from collections.abc.Sequence?”, Issue #2776, numpy/numpy on GitHub).

Perhaps a special attribute (like __match_args__) could be added to allow classes to declare support for sequence pattern matching?

The answer is in the thread you posted. Only some NumPy arrays are sequences, so it is correct for them not to match sequence patterns. (Specifically, consider a zero-dimensional array.)

Also, the new array API doesn’t implement the sequence interface (it doesn’t expose __len__).

You should read the thread you linked if you want to understand better.

Unfortunately, we also require __len__ and indexing by non-negative integers. For example, after checking whether isinstance(subject, collections.abc.Sequence) is true, a pattern like [*_, last]:

  • checks that the length is greater than or equal to 1
  • binds last = subject[len(subject) - 1]
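The steps above can be sketched as an ordinary function. This is only a rough illustration of the logic, not the interpreter's actual implementation. The name `match_star_last` is made up for this example, and the exclusion of str, bytes, and bytearray is taken from the pattern matching specification rather than from this thread.

```python
from collections.abc import Sequence


def match_star_last(subject):
    """Rough equivalent of `case [*_, last]:`; returns (matched, last)."""
    # Sequence check; str, bytes, and bytearray never match sequence patterns.
    if not isinstance(subject, Sequence) or isinstance(subject, (str, bytes, bytearray)):
        return False, None
    # The pattern needs at least one element for `last`.
    if len(subject) < 1:
        return False, None
    # Bind `last` by indexing; no iteration over the subject is needed.
    return True, subject[len(subject) - 1]


print(match_star_last([1, 2, 3]))
print(match_star_last(range(10**12)))  # cheap: uses len() and one index
print(match_star_last(iter([1, 2, 3])))
```

Because only `len()` and a single index are used, a huge `range` is handled in constant time, while an iterator is rejected up front.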

Now imagine if the subject were some huge (or infinite!) iterable. It would take much longer to match this simple pattern. Matching iterators would also be much more complicated under the hood, since you need to “remember” values that have already been yielded when matching against multiple patterns (and with multiple nested patterns, it’s prohibitively difficult).

This is why simply being iterable isn’t enough (besides the fact that mappings like dictionaries are iterable, and would match sequence patterns).

Well, we need a length, iterability, and indexing by non-negative integers. Sounds like a Sequence to me! :wink:

My overall understanding is that ndarray cannot be a subclass of collections.abc.Sequence in general, because (a) it’s not clear what API promise users expect, since there are multiple approaches to representing an ndarray as a sequence, and (b) as you say, a 0D array cannot be iterated over.

However, for this request the expectations are not general, but specific. Please correct me if I’m wrong, but 1+D arrays seem to satisfy the requirements outlined by @brandtbucher.

Assuming the requirement for subclassing is lifted, do you see why 1+D arrays couldn’t be unpacked for sequence pattern matching?

As far as I know, there is currently no way for an object to conditionally implement an interface based on its value. And yet this seems to be what you’re asking.

Anyway, I think it’s bad to think of arrays as sequences since the array API doesn’t even come close to implementing it.

Why not just match against ndarray?

In my case I was working with 3x3 matrices and wanted to do something like this (plenty of similar ops):

matrix: ndarray = ...
match matrix:
    case [_, _, (0, 1, 0)]:
        ...

I guess I can write a wrapper around ndarray that would implement “Sequence” for sequence matching to work.

“As far as I know, there is currently no way for an object to conditionally implement an interface based on its value.”

Say Python devs rewrite the checking code for sequence matching to look at the value of the _sequence_matchable_ attribute rather than checking isinstance(..., collections.abc.Sequence). With that, ndarray could implement this attribute to return True for 1+D and False for 0D.

Are there objections against supporting sequence matching other than that ndarray cannot be a subclass of collections.abc.Sequence?

That’s a very reasonable solution IMO. If you ensure that ndim > 0, it’s perfectly reasonable to expose the sequence interface.

Yeah, so isinstance(x, T) would answer differently than issubclass(type(x), T), which I think would be a huge change.
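A toy illustration of that divergence, using a metaclass with a value-dependent `__instancecheck__`. All names here (`ValueDependentMeta`, `MatchableSequence`, `FakeArray`) are hypothetical, and this is not how the match statement or NumPy works; it only shows what "isinstance depends on the value" would look like.

```python
class ValueDependentMeta(type):
    def __instancecheck__(cls, obj):
        # The answer depends on the object's value, not on its type.
        return getattr(obj, "ndim", 0) > 0


class MatchableSequence(metaclass=ValueDependentMeta):
    """Hypothetical interface whose membership is decided per object."""


class FakeArray:
    """Hypothetical stand-in for an array with a dimension count."""

    def __init__(self, ndim):
        self.ndim = ndim


one_d = FakeArray(ndim=1)
zero_d = FakeArray(ndim=0)

print(isinstance(one_d, MatchableSequence))         # value-based answer
print(isinstance(zero_d, MatchableSequence))        # same type, different answer
print(issubclass(type(one_d), MatchableSequence))   # type-based answer
```

Two objects of the same type get different `isinstance` answers, and neither answer can be derived from `issubclass` on the type.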

I think it’s bad because it’s not true in general, and an object-based instance check seems wrong. Your subclass idea is better IMO.