Strange behavior of sys.getsizeof

Hi! I expected sys.getsizeof to return the same result in the following two scripts, but it returns different results. Why?

Script 1

import sys
a = []
a += [0]
print(a)
print(sys.getsizeof(a))

Output of script 1

[0]
72

Script 2

import sys
a = []
a.append(0)
print(a)
print(sys.getsizeof(a))

Output of script 2

[0]
88

sys.getsizeof is applied to the same object in both cases, but returns different results! Why?

When a new element is added to a list, the list's underlying storage may need to grow. Python slightly overallocates in some cases to avoid resizing for every added element.
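To make the effect of overallocation visible, here is a minimal sketch that counts how often the size reported by sys.getsizeof changes while appending. The helper name count_resizes is made up for this illustration, and the exact count depends on the Python version and build.

```python
import sys

def count_resizes(n):
    """Append n items and count how often the reported size changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:  # the list's storage grew
            resizes += 1
            last_size = size
    return resizes

# Far fewer size changes than appends: most appends reuse spare space.
print(count_resizes(1000))
```

If the list grew by exactly one slot per append, the count would equal the number of appends.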

In your first example, the code path in cpython/Objects/listobject.c at commit 0c26dbd16e9dd71a52d3ebd43d692f0cd88a3a37 (python/cpython on GitHub) is used, but in the second example the code path in cpython/Objects/listobject.c at commit 208d06fd515119af49f844c7781e1eb2be8a8add is taken. So in each script a single element is added, but in the second one the list is overallocated a bit.

Their first case also overallocates.

Yes. Applying getsizeof to the empty list returns 56 (not 72), and the lists I applied getsizeof to in both cases were not empty (each had one element, ‘0’), so it looks like overallocation occurs in both cases. But why does it add different numbers of extra cells?

Here is an additional experiment showing that overallocation occurred in both cases:

import sys
a = []

for _ in range(5):
    a += [0]
    print(sys.getsizeof(a), end = ' ')

This prints ‘72 72 120 120 120’. The empty list takes 56 bytes, so after ‘a += [0]’ ran for the first time the list had 16 extra bytes, which is room for 2 cells at 8 bytes each: the one in use plus 1 spare.

After replacing ‘a += [0]’ with ‘a.append(0)’ in this fragment, we get ‘88 88 88 88 120’. So after ‘a.append(0)’ ran for the first time the list had 32 extra bytes, which is room for 4 cells: the one in use plus 3 spare.

So we get overallocations in both cases, but we get different numbers of extra cells.
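The byte counts can be turned into cell counts directly. The following is a sketch under two assumptions: each list slot is one pointer wide, and sys.getsizeof([]) is the size of a list with no slots allocated. All helper names (spare_slots, spare_after_each_step, with_iadd, with_append) are invented for this example.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # bytes per list slot (8 on 64-bit builds)
EMPTY_SIZE = sys.getsizeof([])       # assumed: list header with no slots allocated

def spare_slots(lst):
    """Slots allocated beyond len(lst), inferred from sys.getsizeof."""
    allocated = (sys.getsizeof(lst) - EMPTY_SIZE) // POINTER_SIZE
    return allocated - len(lst)

def spare_after_each_step(add_one, steps=5):
    """Grow an empty list one element at a time and record the spare slots."""
    lst = []
    result = []
    for _ in range(steps):
        lst = add_one(lst)
        result.append(spare_slots(lst))
    return result

def with_iadd(lst):
    lst += [0]
    return lst

def with_append(lst):
    lst.append(0)
    return lst

print("+=    :", spare_after_each_step(with_iadd))
print("append:", spare_after_each_step(with_append))
```

With the sizes reported above (56 bytes empty, 8-byte slots), the += run corresponds to 1, 0, 5, 4, 3 spare cells and the append run to 3, 2, 1, 0, 3. Other Python versions or builds may print different numbers.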

Well, apparently it’s assumed that append will be done often, so you really want overallocation.

And apparently it’s assumed that a += will be the only extension, or that the next extension will be large enough to need resizing anyway, so it only calls list_preallocate_exact, which overallocates by at most a single element, and only for alignment reasons. See the explanation in its source code.
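For illustration, here is a Python transcription of the two sizing rules as I read them in recent versions of listobject.c. Treat both formulas as assumptions: they have changed between CPython versions, the function names below are invented, and the sketch leaves out the check that skips reallocation when the current storage is already big enough.

```python
def preallocate_exact_capacity(size):
    """Assumed rule of list_preallocate_exact: round up to an even slot count."""
    return (size + 1) & ~1

def resize_capacity(newsize, current_len=0):
    """Assumed growth rule of list_resize, the path that append takes."""
    if newsize == 0:
        return 0
    # Grow by roughly one eighth plus a small constant, rounded down to a multiple of 4.
    new_allocated = (newsize + (newsize >> 3) + 6) & ~3
    # If the request adds more elements than the headroom gained,
    # just round the requested size up to a multiple of 4 instead.
    if newsize - current_len > new_allocated - newsize:
        new_allocated = (newsize + 3) & ~3
    return new_allocated

print(preallocate_exact_capacity(1))  # first `a += [0]` on an empty list
print(resize_capacity(1))             # first `a.append(0)` on an empty list
```

Under these assumptions the first call gives 2 slots and the second gives 4. With a 56-byte empty list and 8-byte slots that is 56 + 16 = 72 and 56 + 32 = 88, which matches the numbers in this thread.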

Now I understand. Thank you, Stefan!