C API: Each reference should be backed by a pointer?

encukou · August 15, 2022, 1:26pm

Hi,
When reviewing PEP 683, I realized that CPython probably relies (accidentally?) on the following assumption/rule to keep the refcount field from overflowing (currently, into negative numbers):

For each strong reference (i.e. between incref & decref), the referenced object should be stored in at least one PyObject* pointer that isn’t shared with other references.

If that rule is followed, the refcount can’t overflow: every strong reference takes up at least one pointer’s worth of memory, and there is not enough addressable memory to hold that many pointers.
I think this is followed in practice – there’s nothing preventing you from increfing 1000 times if you then decref 1000 times, but who’d do that?
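
To make the counting argument concrete, here is a rough sketch in plain C (no Python.h needed). It assumes the refcount is a signed, pointer-sized integer, as CPython's `Py_ssize_t` refcount field is, and uses `PTRDIFF_MAX` as a stand-in for `PY_SSIZE_T_MAX`. It also assumes a flat address space whose size fits in `size_t`. The names `REFCOUNT_MAX`, `max_pointer_slots`, `refcount_cannot_overflow` and `check_refcount_bound` are made up for illustration and are not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for PY_SSIZE_T_MAX, the largest value a signed,
   pointer-sized refcount can hold. */
#define REFCOUNT_MAX PTRDIFF_MAX

/* Upper bound on the number of distinct pointer-sized slots that could
   exist if all addressable memory held nothing but pointers. */
uintmax_t max_pointer_slots(void)
{
    return (uintmax_t)SIZE_MAX / sizeof(void *);
}

/* Returns 1 if one reference per pointer slot can never exceed the
   refcount maximum, 0 otherwise. */
int refcount_cannot_overflow(void)
{
    return max_pointer_slots() <= (uintmax_t)REFCOUNT_MAX;
}

/* Aborts (when assertions are enabled) if the bound does not hold on
   the platform this is compiled for. */
void check_refcount_bound(void)
{
    assert(refcount_cannot_overflow());
}
```

On a typical 64-bit build that is roughly 2^61 possible pointers against a maximum refcount of 2^63 - 1, so the bound holds with room to spare. The sketch only covers references that follow the rule; the "incref 1000 times, then decref 1000 times" pattern is exactly what it does not account for.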

I’m planning to clarify refcounting docs, and I’d like to include this in a new explanation of what a “reference” is. With friendlier wording of course – maybe even go with “a reference is a pointer (with special rules)” to make the tutorial start with something more “tangible”.
Formalizing it would also make it easier to reason about refcounting changes like PEP 683 – its current “If that were an actual problem then we would have heard about it” is actually fine, but “cannot happen with well-behaved code” would be better :)