Add option to predefine the size of a list

Just like in C or Java, it would be useful to have an option to predefine the size and type of a list.
In Java, if we need to predefine the size of an array, we can do this:
Java: int[] arr = new int[10]; (creates a new array object of type int and size 10)
To achieve something similar in Python, we have to use the following code.
Python:
arr: list[int]
arr = [None]*10
To avoid this, we could implement a feature that lets us define the size of a list at declaration.
Something like: arr: list[int, 10], with the size being an optional argument.

That won’t work. Type annotations are ignored at run time:

arr: list[int] = 1  # OK
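To illustrate the point, here is a small sketch (the function name make_values is made up for this example, and list[int] assumes Python 3.9 or later). It shows that the interpreter records annotations but never enforces them, which is why an annotation alone cannot allocate anything:

```python
def make_values() -> list[int]:
    # The annotation says list[int], but nothing checks it at run time.
    values: list[int] = "not a list"
    return values


result = make_values()
print(type(result).__name__)

# The annotation is only recorded as metadata on the function.
print(make_values.__annotations__)
```

A static type checker would flag the assignment, but the interpreter runs it without complaint.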

Yes, but there must be some way we can achieve this. I mean defining the size, if not the type, at declaration.

You should allow None in the list:

arr: list[int | None] = [None] * 10

List is a mutable type. It by definition cannot have a static size. Perhaps you’re thinking of tuples?

Currently, to type a size-1 tuple you write tuple[T], for size 2 tuple[T, T], and so on; you have to hardcode the size into the type parameters. I admit it would be nice to type a size-n tuple as tuple[T, n]. That said, I’m not sure what practical benefits this would bring, so I’m not particularly enthusiastic about this idea.


The problem is that None consumes 16 bytes of space. Let’s say I want to make an integer array of size 10: I would need approximately 40 bytes (int = 4 bytes). But if I have to predefine the size of my array, I would use None, which would eat up 160 bytes (None = 16 bytes).

Then use 0 or -1 as a sentinel?

arr: list[int] = [-1] * 10

Eh, None uses fewer bytes than an int:

>>> import sys
>>> sys.getsizeof(None)  # 16
>>> sys.getsizeof(1)  # 28
>>> sys.getsizeof(10 ** 100)  # 72

Python integers have a variable size.


Do we move this to help?

I agree. I just moved it.


No, it works completely differently from how you expect.

First off, there is no “array”. We call them lists. There are many things available in Python that can be called “array”, but none of them is what you make with code like [], and they are quite different from each other.

Second, lists - just like any other native Python container - don’t store the values directly. They’re referenced. No matter what things you put in a list (and they do not have to be the same type), the list will use a fixed amount of memory per element. (It can automatically “reserve” space for additional elements, so that resizing is less frequent). It is not at all like a Java int[] or String[] etc. It is more like java.util.ArrayList, or C++ std::vector<void*>. (In fact, in the C implementation, there is a base type PyObject for everything, so actually PyObject * are stored; and of course it is C not C++, so all the “vector” logic is implemented manually.) And, you know, this is why it usually is pointless to pre-size the list: because normally we don’t write logic that requires a specific size. If you need to do that kind of work a lot, especially for number-crunching, please consider NumPy.
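The "fixed amount of memory per element" claim can be checked with a short sketch. It assumes CPython, where sys.getsizeof reports only the list's own storage; the variable names are illustrative:

```python
import sys

# Three pre-sized lists whose elements differ greatly in size.
nones = [None] * 10
big_ints = [10 ** 100] * 10
long_strs = ["x" * 1000] * 10

# sys.getsizeof is shallow: it counts the list's own storage (the slots),
# not the objects that the slots refer to.
list_sizes = {sys.getsizeof(lst) for lst in (nones, big_ints, long_strs)}
print(list_sizes)
```

All three lists report the same size, because each slot holds only a reference regardless of what it refers to.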

Third, there is only ever a single None object, and a list that stores None multiple times just points at it repeatedly. It will not create new objects.
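A quick way to see this, as a minimal sketch using only built-ins:

```python
slots = [None] * 10

# Every slot refers to the very same object, so there is only one identity.
unique_objects = {id(item) for item in slots}
print(len(unique_objects))  # 1

# Identity, not just equality, holds for each element.
all_same = all(item is None for item in slots)
print(all_same)  # True
```

So the cost of filling a list with None is the ten references, not ten copies of None.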

Fourth, while Python can also reuse objects for integers, they are still Python objects. And on a 64-bit implementation, they use at least 28 bytes each.

In recent versions, on 64-bit, the list of 10 objects will take 136 bytes if you pre-size it: 56 for the list object itself (== 16 to be included in a double-linked list, + 16 for the standard “header” of Python objects, + a word each for the size, capacity and pointer to the allocation) plus 80 bytes for the actual allocated storage (10 pointers). If you build it with a loop, the allocation will end up with a current capacity of 16 elements, so another 48 bytes used. The integers that you store will probably have already existed, and None certainly will.
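The arithmetic above can be reproduced rather than taken on trust. The sketch below assumes CPython, since sys.getsizeof and the exact allocation behaviour are implementation details. It derives the pointer size from struct instead of hardcoding 8:

```python
import struct
import sys

pointer_size = struct.calcsize("P")  # size of a C pointer on this build
empty_size = sys.getsizeof([])       # list object with no element storage
presized = [None] * 10

# A pre-sized list should cost the bare list object plus one pointer per slot.
expected = empty_size + 10 * pointer_size
print(sys.getsizeof(presized), expected)
```

On a 64-bit build the two printed numbers should match the 56 + 80 breakdown given above.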


There is of course an array, internally to CPython, and it is sized and resized for you (just like a Java ArrayList). But that’s an implementation detail, and there is no way I know of to hint to a list the allocation it should make apart from some variant of the [x]*N trick.

@ammar26627 : I feel sure what you want is: the array module, or perhaps to install NumPy (as Karl suggests) if you have a lot of numerical work.
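For reference, a minimal sketch of the standard-library array module, which is the closest built-in match to a Java int[]. The type code "i" is a signed C int, whose width is platform-dependent (commonly 4 bytes), so the sketch prints the item size instead of assuming it:

```python
from array import array

# Ten zero-initialised C ints stored directly, not as references.
arr = array("i", [0]) * 10

print(len(arr), arr.itemsize, len(arr) * arr.itemsize)

# Unlike a list, the array rejects values of the wrong type.
try:
    arr[0] = "text"
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

An array can still grow with append, so the size is an initial size, not a fixed one.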

Constructing a list from anything whose length is known will work. For example, list(range(10)) does the trick - even though a range doesn’t store all the elements, it can determine in O(1) how many elements there should be. However, this doesn’t work with a list comprehension (yet?).
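The difference can be observed with a small sketch. It assumes CPython, and the exact byte counts vary between versions, so only the two sizes are printed for comparison:

```python
import sys

# Built from an iterable with a known length: the list can be allocated up front.
from_range = list(range(10))

# Built element by element: the list grows as it goes and may over-allocate.
from_comprehension = [i for i in range(10)]

same_contents = from_range == from_comprehension
print(same_contents)  # True
print(sys.getsizeof(from_range), sys.getsizeof(from_comprehension))
```

Both lists hold the same values; any difference in the printed sizes is spare capacity left over from growing the second one incrementally.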