Why is array slower to index than list?

Programming FAQ for Python2 and Programming FAQ for Python3:

The array module also provides methods for creating arrays of fixed types with compact representations, but they are slower to index than lists.

To get an element of a list, we need an additional dereferencing; whereas the elements of array are arranged much more compactly than list and they are fixed type, why is it slower instead?

Arrays are indeed more compact, but the problem is that to make it so, it just stores the raw values. When you index, the array needs to create a new integer/float object to return the value to you. It wouldn’t be a significant cost, but it is greater than list, which already has all the objects (since it just stores references to those).

3 Likes

That doesn’t seem to be the whole reason, though. With lst = list(range(2560)) and arr = array('i', lst) I get these times in ns:

11 12 12 12 12 lst[420]
50 50 51 51 51 arr[420]
33 34 34 34 34 -lst[420]

Note that -lst[420] also creates a new object and evaluates the negation operation and is still much faster than the array access.

Script
from timeit import timeit

setup = '''
from array import array

lst = list(range(2560))
arr = array('i', lst)
'''

codes = ['lst[420]', 'arr[420]', '-lst[420]']

for i in range(6):
    times = {c: [] for c in codes}

    for c in codes * 100:
        t = timeit((c+';')*10, setup, number=10**4) * 1e4
        times[c].append(t)

    if i >= 3:
        for c in codes:
            print(*map(round, sorted(times[c])[:5]), c)
        print()

Attempt This Online!

1 Like

I don’t know. Is this still the case? Have you measured it? (I have
not.)

I might speculate that lists have been optimised to accept Pythin ints
efficiently, and that effort hasn’t been expended for arrays. But it’s
only speculation.

Maybe there’s also a cost to promote the low level machine ints in an
array into a Python int object for every object returned: that would
accrue.

A list:

 L = [1, 2, 3]
 x = L[1]

This just accesses L and binds its second element to x.

An array:

 A = array('L', [1,2,3])
 x = A[1]

This has to fetch the second value from the array and make a new int
to contain it, then bind that to x.

I can imagine that to be a bit more expensive.

But really, I’d want to measure this stuff to see where the costs lie.

1 Like