I’m in the process of trying to reimplement structural pattern matching for Cython. I’m just trying to understand some of the implications of the specification.
Let’s suppose we have a sequence pattern. As I understand it, the implementation can look up the length once and cache it:
- “The length of the subject sequence is obtained using the builtin `len()` function (i.e., via the `__len__` protocol). However, the interpreter may cache this value in a similar manner as described for value patterns.”
- The description for value patterns: “the interpreter may cache the first value found and reuse it, rather than repeat the same lookup. (To clarify, this cache is strictly tied to a given execution of a given match statement.)” I’m noting that it says “match statement” rather than “case statement”.
However, guards are explicitly allowed to have side effects.
Therefore, if we have code like this:
def f(x):
    match x:
        case [1, _] if (hasattr(x, "pop") and x.pop()):
            return 1
        case [1, _]:
            return 2
        case _:
            return 3

f([1, 0])
What do we consider an acceptable outcome? I can see three options:
- `return 3`: the second case fails because the length is no longer 2. This is what CPython currently appears to do.
- `IndexError` on case 2. The interpreter uses the cached length and then fails to read the second element.
- Hard crash on case 2. The interpreter uses the cached length, and directly accesses the memory of the second element.
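The first two outcomes can be imitated in plain Python by desugaring the match by hand. This is only a sketch under my own assumptions: `f_recheck` and `f_cached` are hypothetical names, the sequence type check is omitted, and I assume the implementation reads every element of the subject, including the one bound to the wildcard.

```python
def f_recheck(x):
    """First outcome: every case looks up len(x) again."""
    # case [1, _] if (hasattr(x, "pop") and x.pop())
    if len(x) == 2 and x[0] == 1:
        if hasattr(x, "pop") and x.pop():
            return 1
    # case [1, _]: the length is re-read, so the shrunken list no longer matches
    if len(x) == 2 and x[0] == 1:
        return 2
    # case _
    return 3


def f_cached(x):
    """Second outcome: len(x) is looked up once and trusted by later cases."""
    n = len(x)  # cached for the whole "match statement"
    # case [1, _] if (hasattr(x, "pop") and x.pop())
    if n == 2 and x[0] == 1:
        if hasattr(x, "pop") and x.pop():
            return 1
    # case [1, _]: the stale length still says 2
    if n == 2 and x[0] == 1:
        _ = x[1]  # bounds-checked read of the second element
        return 2
    # case _
    return 3


print(f_recheck([1, 0]))
try:
    f_cached([1, 0])
except IndexError as exc:
    print("IndexError:", exc)
```

The third outcome is what `f_cached` turns into if the read of `x[1]` skips the bounds check, which is the kind of shortcut generated C code could take but which cannot be shown safely in pure Python.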