The docs no longer tell the truth about what happens when a mutable sequence changes length during iteration, and what actually happens now varies across the specific mutable sequence types.
There’s no guessing exactly what the docs’ “for the most part” is supposed to mean, but bytearray probably acts closest to the intent:
>>> xs = bytearray([1, 2])
>>> xs.extend(xs)
>>> xs
bytearray(b'\x01\x02\x01\x02')
>>> xs = bytearray([1, 2])
>>> xs.extend(iter(xs))
>>> xs
bytearray(b'\x01\x02\x01\x02')
So in both cases the length of the bytearray is doubled. It’s very different if you use a list (or array.array) instead. Then extending a list with itself again doubles its length, but extending with iter(xs) goes on and on until you run out of RAM.
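For concreteness, here’s a small runnable sketch of the list behavior. The unbounded iter(xs) case can’t be run as-is, so islice is used to cap it and show the mechanism safely:

```python
from itertools import islice

xs = [1, 2]
xs.extend(xs)              # extending with the list itself: doubles
assert xs == [1, 2, 1, 2]

# xs.extend(iter(xs)) would never end: list.extend appends as it
# consumes the iterator, so the iterator keeps finding new elements.
# Capping it with islice shows the mechanism safely:
xs = [1, 2]
xs.extend(islice(iter(xs), 6))   # stop after 6 items are drawn
assert xs == [1, 2, 1, 2, 1, 2, 1, 2]
```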
That’s surprising by itself, and then it’s doubly surprising that bytearray acts differently from the original mutable sequence types (list and array.array).
The list behavior was certainly intended (I was there at the time), but the brief text about it (in the Reference manual’s explanation of the “for” statement) appears to have been purged when the iterator protocol was introduced, replaced with fuzzy hand-waving.
If it seems surprising that code could rely on length-changing mutations during iteration, this is a quite common way to do a breadth-first search:
for node in frontier:
    for s in successors_of(node):
        if is_interesting(s):
            frontier.append(s)
It inherently relies on iteration reliably discovering elements appended during iteration. Nothing wrong with it. And this spelling works regardless of whether list or bytearray is the type of “frontier”. That bytearray captures the length just once at the start is (seemingly) unique to bytearray.extend().
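A self-contained version of that pattern, with a made-up toy graph (the names graph, successors_of, and is_interesting are assumptions for the sketch; a “seen” set is added so shared successors aren’t appended twice):

```python
# Hypothetical tiny graph: node -> list of successor nodes.
graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: []}

def successors_of(node):
    return graph[node]

def is_interesting(s):
    return True  # accept everything in this toy example

seen = {0}
frontier = [0]
for node in frontier:              # sees elements appended below
    for s in successors_of(node):
        if is_interesting(s) and s not in seen:
            seen.add(s)
            frontier.append(s)

assert frontier == [0, 1, 2, 3, 4]  # nodes in breadth-first order
```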
For a concrete real example where it matters, this is an elegant way to construct all the divisors of an int n, given its prime factorization as a dict mapping primes to their exponents:
def divisors(pe):
    import math
    from itertools import islice
    expect = math.prod(e + 1 for e in pe.values())
    ds = [1]
    for p, e in pe.items():
        ds.extend(d * p for d in islice(ds, e * len(ds)))
    assert len(ds) == expect
    return ds
Then, e.g.,
>>> divisors({2: 2, 5: 2}) # divisors of 100
[1, 2, 4, 5, 10, 20, 25, 50, 100]
This is worth studying: It’s a remarkably easy way to construct the result, with no shenanigans involving exponentiation and/or itertools.product. Just one multiply per final divisor.
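To see why the self-extending islice works with a list, trace the first prime by hand. list.extend appends each product as the generator yields it, so the later slots islice reads were created by the earlier appends:

```python
from itertools import islice

ds = [1]
# For p=2, e=2: islice(ds, e * len(ds)) asks for 2 items from ds.
# The second item it yields (2) only exists because the first
# append (1*2 == 2) already happened during the extend.
ds.extend(d * 2 for d in islice(ds, 2))
assert ds == [1, 2, 4]   # 1, then 1*2, then 2*2
```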
Initialize ds to bytearray([1]) instead, and it dies with an assertion error. Comment that out, and it only constructs the divisors 1, 2, 5, and 10. The islice argument is materialized before extend begins appending, not at all the intent.
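That failure mode is easy to reproduce. A sketch (divisors_bytearray is a hypothetical name; it’s the algorithm above with ds a bytearray and the assert dropped):

```python
import math
from itertools import islice

def divisors_bytearray(pe):
    ds = bytearray([1])
    for p, e in pe.items():
        # bytearray.extend() materializes the generator before appending,
        # so the islice sees only divisors that existed before this line.
        ds.extend(d * p for d in islice(ds, e * len(ds)))
    return list(ds)

assert divisors_bytearray({2: 2, 5: 2}) == [1, 2, 5, 10]  # most divisors missing
```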
Which is, in fact, how I stumbled into this, when changing some very old code to use bytearrays “to save RAM”.
I don’t believe any of this can be deduced from the current docs. But I don’t think anything about behavior can be changed now.
bytearray has acted this way since its introduction. I consider it to be a design error, but too late to change now. Or do you think it could be changed? list has acted this way forever, and mounds of code relies on it.