The logic of `solid_base`

In CPython's `typeobject.c` there is a function called `solid_base`. "Solid base" (probably meaning the same thing) is also an important concept when discussing the acceptable bases of a type as it is being defined.

The code has always puzzled me a bit: see `Objects/typeobject.c` in python/cpython at commit a89ee4b9c2a87d9bdf105883f834cda9d943d541.

Can someone confirm my understanding please? I think:

The “solid base” of a type T is a type S that is the most distant ancestor of T along the chain of bases of T, including T itself, that defines the same representation as T in memory for its instances.

I notice that recent versions of this function call `shape_differs` where they used to call `extra_ivars`, which I think reflects a simplification of what we mean by "same representation". However, it has always seemed to me that the code chases the base chain needlessly all the way to `object`, then walks back to the first point where `shape_differs` returns true, and reports the type whose shape differs from that of its base.

It bugged me enough to verify this in a little Python model.

Why is it not sufficient to chase base pointers only until the condition `shape_differs(type, base)` is met (or `object` is reached)?
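A minimal sketch of the kind of model I mean. It assumes that "shape" is just the pair (basic size, item size), which is how I read `shape_differs`; `FakeType`, `OBJECT`, the sizes, and the two function names are made up for illustration and are not CPython API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakeType:
    """Hypothetical stand-in for a type object: only the fields the model needs."""
    name: str
    basicsize: int
    itemsize: int = 0
    base: Optional["FakeType"] = None  # plays the role of tp_base


OBJECT = FakeType("object", 16)  # the size is arbitrary in this model


def shape_differs(t1, t2):
    # Assumed reading of the C helper: compare the two sizes only.
    return t1.basicsize != t2.basicsize or t1.itemsize != t2.itemsize


def solid_base_recursive(t):
    # Recurse to the root first, then decide on the way back.
    base = solid_base_recursive(t.base) if t.base is not None else OBJECT
    return t if shape_differs(t, base) else base


def solid_base_early_stop(t):
    # The alternative I am asking about: stop at the first change of shape.
    while t.base is not None and not shape_differs(t, t.base):
        t = t.base
    return t


# A single chain in which the shape is extended at B and again at D.
A = FakeType("A", 16, base=OBJECT)
B = FakeType("B", 24, base=A)
C = FakeType("C", 24, base=B)
D = FakeType("D", 32, base=C)
E = FakeType("E", 32, base=D)

results = {
    t.name: (solid_base_recursive(t).name, solid_base_early_stop(t).name)
    for t in (OBJECT, A, B, C, D, E)
}
```

On this chain both functions pick the same type at every level. That proves nothing about the older `extra_ivars` logic, which this model does not attempt to reproduce.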

As far as I know (and I don't claim to know much), solid bases are classes that themselves define their memory layout (somewhat exposed via `__slots__`; see PEP 800 – Disjoint bases in the type system) in a way that cannot be changed later on. That means that if I have `class C(A, B): ...`, and both A and B are disjoint bases / solid bases, this code will not work, as the memory layout of A would be changed by B.

The basic idea behind solid bases is that any class C cannot have more than one solid base B as a base class.
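A small sketch of that conflict in pure Python, assuming CPython behaviour: two unrelated classes that each add a slot both define their own layout, so combining them is rejected. The class names are just for illustration.

```python
class A:
    __slots__ = ("a",)  # A adds storage of its own to the instance layout


class B:
    __slots__ = ("b",)  # so does B, independently of A


try:
    class C(A, B):  # two unrelated layouts cannot both be honoured
        pass
except TypeError as exc:
    conflict = exc  # CPython reports an instance layout conflict here
else:
    conflict = None
```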

I agree that this is not really well documented, not even in the source code. Therefore I'd be in favor of updating the docs to say more about solid bases and the idea behind them.

I noticed PEP 800 and was glad they chose another term, although it is very close as a concept.

If the layout of instances of B extends the layout of instances of A, then `class C(A, B)` would be allowed as far as layout is concerned. Let's say there are several classes between A and B in the hierarchy. `solid_base` is a humble local function that finds the class between A and B where the (last) extension happens, so it's the one that defines B's shape. Methods on A are OK handling an instance of B because they only look at the A-shaped part.
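Here is a sketch of that case with `__slots__`, assuming CPython. The names `Mid` and `A2` are made up for the example. I list the bases as `(B, A)` because, with B a subclass of A, the order `(A, B)` fails the separate MRO consistency check regardless of layout.

```python
class A:
    __slots__ = ("a",)


class Mid(A):
    __slots__ = ()  # adds nothing, so it keeps A's shape


class B(Mid):
    __slots__ = ("b",)  # the extension happens here; B defines its own shape


class A2(A):
    __slots__ = ()  # another class that still has A's shape


class C(B, A):  # accepted: B's layout extends A's
    pass


class D(A2, B):  # also accepted, although the A-shaped base is listed first
    pass


c = C()
c.a, c.b = 1, 2  # both the A-shaped part and B's extension are usable
```

In both cases `__base__` ends up being B, which is how I understand the "best base" choice that `type.__new__` makes from the solid bases.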

It is used in some more complex reasoning in the definition of type.__new__.

I think I follow what it does, but the obscure implementation makes me think I might be missing some subtlety.
