I’m in the process of trying to reimplement structural pattern matching for Cython. I’m just trying to understand some of the implications of the specification.
Let’s suppose we have a sequence pattern. As I understand it, the implementation can look up the length once and cache it:
- “The length of the subject sequence is obtained using the builtin len() function (i.e., via the __len__ protocol). However, the interpreter may cache this value in a similar manner as described for value patterns.”
- The description for value patterns says: “the interpreter may cache the first value found and reuse it, rather than repeat the same lookup. (To clarify, this cache is strictly tied to a given execution of a given match statement.)” Note that it says “match statement” rather than “case statement”.
However, guards are explicitly allowed to have side effects.
Therefore, if we have code like this:
    def f(x):
        match x:
            case [1, _] if (hasattr(x, "pop") and x.pop()):
                return 1
            case [1, _]:
                return 2
            case _:
                return 3

    f([1, 0])
What do we consider an acceptable outcome? I can see three options:
- return 3: the second case fails because the length is no longer 2. This is what CPython currently appears to do.
- IndexError on case 2. The interpreter uses the cached length and then fails to read the second element.
- Hard crash on case 2. The interpreter uses the cached length and directly accesses the memory of the second element.
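To make the first two options concrete, here is a rough hand-desugared sketch of `f` in plain Python. It is not how CPython or Cython actually compiles a match statement: `simulate` and `cache_len` are hypothetical names, the sequence-type check is omitted, and the sketch assumes the implementation reads every position of the pattern, including the one matched by the wildcard.

```python
def simulate(subject, cache_len):
    """Hand-written approximation of f(); all names here are hypothetical."""
    cached = len(subject)  # length looked up once, at the start of the match

    def length():
        # Either reuse the value cached for the whole match statement,
        # or ask the subject again for every case.
        return cached if cache_len else len(subject)

    # case [1, _] if (hasattr(x, "pop") and x.pop()):
    if length() == 2 and subject[0] == 1:
        _ = subject[1]  # assumption: the wildcard position is still read
        if hasattr(subject, "pop") and subject.pop():  # guard with a side effect
            return 1

    # case [1, _]:
    if length() == 2 and subject[0] == 1:
        _ = subject[1]  # raises IndexError if the length used above is stale
        return 2

    # case _:
    return 3


print(simulate([1, 0], cache_len=False))  # length re-checked per case

try:
    simulate([1, 0], cache_len=True)  # stale cached length
except IndexError as exc:
    print("IndexError:", exc)
```

With `cache_len=False` the second case sees the shortened list and falls through to the wildcard case; with `cache_len=True` the stale length lets the second case proceed to an out-of-range read. The third option (a hard crash) corresponds to replacing that bounds-checked read with direct memory access, which plain Python cannot express.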