We hit a fascinating regression in Werkzeug today and you’ll never believe the fix!
This really seems like a gotcha (if not an outright bug?) in the language. It seems like an object’s behavior should remain exactly the same when a subclass implements a method that merely delegates to super()’s implementation. Fast-path optimizations should not cause entirely different results.
Would anyone with expertise in dict internals be able to chime in on this?
Historically, dict subclasses always used the “fast-path”. To fix the issue, dict subclasses now use the “fast-path” only when they don’t override __iter__.
Maybe we need to check for an overridden __getitem__ too, or use the fast-path only for exact dict instances, not subclasses.
But I don’t see why this is a regression. Dict subclasses that don’t override __iter__ use the fast-path both before and after the fix, so there is no behavior change for them.
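To make the options above concrete, here is a rough pure-Python model of the three possible fast-path conditions. This is only a sketch: the real check lives in CPython's C code, and the helper names (overrides, fast_path_current, fast_path_also_getitem, fast_path_exact_only) and the FirstValue class are invented for illustration.

```python
def overrides(obj, name):
    """True if type(obj) resolves `name` to something other than dict's own."""
    return getattr(type(obj), name) is not getattr(dict, name)


def fast_path_current(obj):
    # Behaviour described above: any dict subclass that keeps dict's __iter__.
    return isinstance(obj, dict) and not overrides(obj, "__iter__")


def fast_path_also_getitem(obj):
    # First suggestion: additionally require dict's own __getitem__.
    return fast_path_current(obj) and not overrides(obj, "__getitem__")


def fast_path_exact_only(obj):
    # Second suggestion: exact dict instances only, never subclasses.
    return type(obj) is dict


class FirstValue(dict):
    """Hypothetical subclass that overrides __getitem__ but not __iter__."""

    def __getitem__(self, key):
        return super().__getitem__(key)[0]


sample = FirstValue({1: ["one", "uno"]})
checks = {
    "current": fast_path_current(sample),
    "also_getitem": fast_path_also_getitem(sample),
    "exact_only": fast_path_exact_only(sample),
}
print(checks)
```

Under this model, only the current rule would send a subclass like FirstValue down the fast path; either suggested alternative would route it through its overridden __getitem__.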
Please see https://repl.it/@jab/dict-subclass-copy-surprise for a more minimal reproduction of this issue (pasted below as well). If you run as-is, you’ll see the unexpected behavior trigger an AssertionError. If you change SHOW_BUG to False, you won’t. Is it intended that toggling the value of SHOW_BUG in this code causes different results?
Thanks,
Josh
class Parent:
    def __init__(self, value):
        self._value = value

    def method(self):
        return self._value

class Child1(Parent):
    pass

c1 = Child1(42)
result = c1.method()
assert result == 42, result

class Child2(Parent):
    def method(self):
        return super().method()

c2 = Child2(42)
result = c2.method()
assert result == 42, result

# You might think that for all "Parent" classes,
# for all "method"s, Child1.method should return
# the same result as Child2.method.
# But when "Parent" is "dict" and method is "__iter__",
# that is not the case:
SHOW_BUG = True
class ChildDict1(dict):
    """Simplification of werkzeug.datastructures.MultiDict."""

    def __init__(self):
        pass

    if not SHOW_BUG:
        def __iter__(self):
            return super().__iter__()

    def add(self, key, value):
        self.setdefault(key, []).append(value)

    def __setitem__(self, key, value):
        """Like add, but removes any existing key first."""
        super().__setitem__(key, [value])

    def getall(self, key) -> list:
        return super().__getitem__(key)

    def __getitem__(self, key):
        """Return the first value for this key."""
        return self.getall(key)[0]

    def items(self, multi=False):
        for (key, values) in super().items():
            if multi:
                yield from ((key, value) for value in values)
            else:
                yield key, values[0]

    def values(self):
        return (values[0] for values in super().values())

    # Remaining overridden implementations of methods
    # inherited from dict are elided for brevity.

cd1 = ChildDict1()
assert dict(cd1) == {}
cd1[1] = "one"
cd1.add(1, "uno")
assert cd1.getall(1) == ["one", "uno"]
assert list(cd1.items()) == [(1, "one")]
assert list(cd1.values()) == ["one"]
assert dict(cd1) == {1: "one"}, cd1 # this line triggers the bug
The “regression” refers to Werkzeug, not Python, and occurs in the Werkzeug v2 release candidate. The issue did not occur in the stable releases of Werkzeug because the data structures that subclassed dict had their own implementation of __iter__.
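As a small illustration of why having its own __iter__ mattered, here is a minimal sketch. The classes Shouting and ShoutingWithIter are made up for this example and are not Werkzeug code; the results noted in the comments assume a CPython version that includes the fast-path fix discussed in this thread.

```python
class Shouting(dict):
    """Hypothetical subclass: overrides __getitem__ only."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


class ShoutingWithIter(Shouting):
    """Same behaviour, plus an __iter__ that merely delegates to dict."""

    def __iter__(self):
        return super().__iter__()


# No __iter__ override: dict() copies the underlying storage directly,
# so the overridden __getitem__ is never consulted.
fast = dict(Shouting(a="x"))

# With an __iter__ override: dict() goes through keys() and __getitem__,
# so the copy reflects the subclass's view of its values.
slow = dict(ShoutingWithIter(a="x"))

print(fast)  # {'a': 'x'}
print(slow)  # {'a': 'X'}
```

So a subclass that defines __iter__, even one that only delegates, is copied through its own __getitem__, which seems to match what the stable Werkzeug data structures were relying on.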