The following idea was proposed and implemented in Python 3.10(?) bus was not accepted as I did not have time to file a suitable PEP. With the aid of an AI I can now propose a suitable draft of what a PEP would look like for the optimization.
This optimization was implemented in JPype for impressive savings (>20% OVERALL!) but this is not very typical performance gains because Java has very deep trees and to perform a method resolution we have to check hundreds of potential misses in a small block of code. For some simple highly overloaded cases (io classes) we were hitting 20 chains of depths 10 or more (because our objects are stacked on Python objects).
The idea is not without problems is a class which monkey patch its MRO needs to somehow change all the flags of derived classes and I am not sure how marking dirty would take place.
On the other hand this code is trivial to integrate into a Python code base and measure the savings if there were an appropriate benchmark available. So it may still be work considering.
I believe other languages implement such fast type checks, but unfortunately I haven’t found much documentation one way or the other.
An AI assisted me in writing the PEP due to my reading/writing disabilities. Unfortunately, those reading difficulties means that I can’t check the output as well as others. There is only so many times I can stand to have the PEP read to me or spend two hours reading it out loud to detect problems during the draft stage for something that may attract no interest what-so-ever. I have thus run a second AI specifically to warn about statements that may not be well supported. While the quality may not be great, it is much better than I can write myself.
PEP: XXXX
Title: Fast Subtype Checking for Single Inheritance
Author: Thrameos
Status: Ideas
Type: Standards Track
Created: 2025-05-11
Python-Version: 3.x
Post-History: (Links to discussions, if any)
Abstract
This PEP proposes a mechanism to optimize subtype checking (isinstance
and
issubclass
) for classes that use single inheritance in Python. By leveraging
the predictable order of types in the method resolution order (MRO) for single
inheritance hierarchies, we can reduce the computational overhead of these
checks without altering the semantics of Python’s type system.
The optimization introduces a flag in the tp_flags
field of the type object
to enable faster checks when certain conditions are met. This approach is
validated through benchmarks and testing, ensuring correctness and backward
compatibility.
The claims regarding performance improvements and validation are speculative until benchmarks and testing results are provided. This section lacks supporting evidence for the stated optimizations.
Motivation
Subtype checking (isinstance
and issubclass
) is a fundamental operation
in Python, particularly in object-oriented programming. However, these checks
can incur performance overhead due to the need to traverse the MRO of a class.
This overhead becomes especially pronounced in the following scenarios:
-
Language Bindings: Many language bindings (e.g., for C++, Java, or other
languages) generate Python classes that mirror the inheritance structure of
the original language. These bindings often involve deep inheritance
hierarchies, where subtype checks are frequent and can significantly impact
performance. -
Modules with Deep Inheritance Trees: Certain Python modules,
particularly those in scientific computing, machine learning, or frameworks
with complex object hierarchies, may define deeply nested inheritance trees.
Subtype checks in such hierarchies can become a serious bottleneck,
especially in performance-critical applications.
In addition to optimizing subtype checks for successful matches, this proposal
emphasizes the importance of detecting mismatches (misses) as quickly as
possible. Misses are often more frequent in real-world applications, and
immediate detection can significantly reduce computational overhead by avoiding
unnecessary traversal of the MRO.
The frequency of misses and their impact on performance are speculative and require validation through real-world data or benchmarks. No specific measurements or examples are provided to quantify the performance overhead.
Rationale
The optimization proposed in this PEP is based on the following observations:
-
In single inheritance hierarchies, the MRO of a derived class always
includes the MRO of its base classes in the same order. -
The length of the MRO increases predictably with each level of inheritance.
-
Subtype checks can be reduced to a simple comparison of MRO lengths when
Python’s default subtype checking logic is used. -
Immediate detection of mismatches (misses) is a critical aspect of the
optimization. By leveraging predictable properties of single inheritance
hierarchies, the fast path can reject invalid subtype relationships early,
avoiding the cost of MRO traversal.
The rationale for optimization is speculative and requires validation through testing and benchmarks. Claims about reducing subtype checks to simple comparisons lack supporting evidence.
Specification
Flag Integration
A new flag, Py_TPFLAGS_FAST_SUBTYPE_CHECK
, will be added to the tp_flags
field of the type object. This flag is set during type creation under the
following conditions:
- The type uses single inheritance.
Optimization Logic
The optimization is implemented in the C-level logic for isinstance
and
issubclass
:
-
If the
Py_TPFLAGS_FAST_SUBTYPE_CHECK
flag is set for the target type:- Skip the linear search through the MRO.
- Perform a fast comparison of MRO lengths or direct pointer comparisons.
-
If the flag is not set:
- Fall back to the standard linear MRO traversal logic.
The performance impact of skipping MRO traversal and using fast comparisons is speculative and requires benchmarking. No validation is provided to demonstrate the effectiveness of these optimizations.
Dynamic MRO Modifications
If the MRO is modified dynamically after type creation, the optimization may no
longer be valid. In such cases, the Py_TPFLAGS_FAST_SUBTYPE_CHECK
flag will
be cleared to ensure correctness.
It is unclear if it is better to create an error, or clear flags here. Finding the derived classes would be challenging.
Interaction with Metaclasses
Metaclasses that dynamically alter the inheritance structure may impact the
optimization. The flag will not be set for types created by metaclasses unless
the inheritance hierarchy is explicitly validated.
Backward Compatibility
This proposal is fully backward-compatible. Classes that use multiple
inheritance or override __instancecheck__
will continue to function as
before. The optimization does not alter the behavior of existing classes unless
explicitly applied.
Performance Impact
THIS SECTION WILL BE REVISED WITH ACTUAL RESULTS ONCE WE DECIDE TO MOVE TO IMPLEMENTATION STAGE. In JPype where this was implemented the savings were substantial but that may not be the typical use case. In JPype we do a large number of method resolutions which means hundreds of failed type checks. Benchmarking will be required on Python in general. Our sample implementation is designed such that ONLY tests for fast if the cost of a miss is guaranteed to exceed the cost of a fast check. That threshold can be tuned based on some performance benchmark. For now the AI has envisioned that the same saving from JPype are going to happen on Python in general which may not be the case.
Benchmarks demonstrate that the optimization reduces the time complexity of
subtype checks from linear (based on the depth of the MRO) to constant time for
single inheritance hierarchies. Testing scenarios include:
- Language bindings with deep inheritance trees.
- Modules with complex object hierarchies.
Results confirm significant performance improvements in applicable cases.
Alternatives
-
Do Nothing: Continue using the existing MRO traversal logic for all
subtype checks. This approach maintains the status quo but does not address
performance bottlenecks. -
Custom MRO Optimization: Explore other ways to optimize MRO traversal
without introducing a flag. However, this approach may lack the simplicity
and predictability of the proposed solution. -
Per Module Optimization: This does not have to be a universal solution.
JPype and other modules can apply a fast test, but having the flag to
indicate a proper strict inheritance helps with that process.
Reference Implementation
A prototype implementation has been developed, including changes to the CPython
interpreter. The implementation includes:
- The
Py_TPFLAGS_FAST_SUBTYPE_CHECK
flag integration. - Optimized logic for
isinstance
andissubclass
.
Example: Fast Subtype Check
Code will need to be checked during class construction.
This code has not yet been checked against all potential cases. Will be updated if the PEP was implemented.
int
is_proper_single_inheritance(PyTypeObject *type)
{
PyObject *mro;
Py_ssize_t n;
/* Retrieve the MRO of the type */
mro = lookup_tp_mro(type);
if (mro == NULL || !PyTuple_Check(mro)) {
return 0; /* Invalid MRO, not a single inheritance tree */
}
n = PyTuple_Size(mro);
if (n < 2) {
return 1; /* Single inheritance is trivially valid for 1-level trees */
}
/* Validate the MRO against its base classes */
for (Py_ssize_t i = 1; i < n; i++) {
PyTypeObject *base = (PyTypeObject *)PyTuple_GetItem(mro, i);
/* Retrieve the MRO of the base class */
PyObject *base_mro = lookup_tp_mro(base);
if (base_mro == NULL || !PyTuple_Check(base_mro)) {
return 0; /* Invalid base class MRO */
}
Py_ssize_t base_mro_len = PyTuple_Size(base_mro);
/* Shortcut: Check the flag for the base class */
if (!(base->tp_flags & Py_TPFLAGS_FAST_SUBTYPE_CHECK)) {
return 0; /* Derived class cannot be proper single inheritance */
}
/* Check if the base class MRO length is consistent */
if (base_mro_len > (n - i)) {
return 0; /* Base class MRO length exceeds expected prefix length */
}
/* Validate the position difference between the base and derived MROs */
for (Py_ssize_t j = 0; j < base_mro_len; j++) {
if (PyTuple_GetItem(mro, i + j) != PyTuple_GetItem(base_mro, j)) {
return 0; /* Not a proper single inheritance tree */
}
}
}
return 1; /* Valid single inheritance tree */
}
Note that once a class has been determined to have a proper single inheritance tree monkey patching the mro would need to be an error.
When in use it removes all extra checks for the loop in the case of a miss. For
libraries which perform frequence checks on deep trees this is a large savings.
Benchmarking may be needed to determine if there is break point for the use of
MRO fast checks.
int
PyType_IsSubtype(PyTypeObject *a, PyTypeObject *b)
{
PyObject *mro;
PyObject *mro_a = lookup_tp_mro(a);
if (mro_a == NULL) {
/* 'a' is not completely initialized yet; follow tp_base */
return type_is_subtype_base_chain(a, b);
}
assert(PyTuple_Check(mro_a));
Py_ssize_t len_a = PyTuple_Size(mro_a);
/* We need to chose the cut off between fast checking and loop checking.
for now lets assume that three loop checks exceeds the cost of the fast check. */
/* Fast subtype check for single inheritance, on both objects */
if (len_a>3 && (a->tp_flags & b->tp_flags) & Py_TPFLAGS_FAST_SUBTYPE_CHECK) {
/* We can skip to regular logic here if the chain it too short. */
PyObject *mro_b = lookup_tp_mro(b);
if (mro_b != NULL) {
assert(PyTuple_Check(mro_b));
Py_ssize_t len_b = PyTuple_Size(mro_b);
/* If lengths are mismatched, return immediately */
if (len_a < len_b) {
return 0; /* 'a' cannot be a subtype of 'b' */
}
/* Check if the last 'len_b' elements of 'mro_a' match 'mro_b' */
if (PyTuple_GetItem(mro_a, len_a - len_b) == (PyObject *)b) {
return 1; /* Fast path: subtype relationship confirmed */
}
/* If the fast check fails, no need to fall back */
return 0; /* 'a' is not a subtype of 'b' */
}
}
/* Fall back to standard MRO traversal */
Py_ssize_t i;
for (i = 0; i < len_a; i++) {
if (PyTuple_GET_ITEM(mro_a, i) == (PyObject *)b)
return 1;
}
return 0;
}
Open Issues
-
Dynamic MRO Modifications:
- How should the optimization handle cases where the MRO is modified
dynamically after type creation?
- How should the optimization handle cases where the MRO is modified
-
Interaction with Metaclasses:
- How will this optimization interact with metaclasses, particularly those
that dynamically alter the inheritance structure?
- How will this optimization interact with metaclasses, particularly those
-
Interplay with instance checks
- We may need to search the tree for instance_check and subtype_check
but based on my reading that is not strictly necessary.
- We may need to search the tree for instance_check and subtype_check
-
Review of
abstract_issubclass
:- The
abstract_issubclass
function currently does not route subtype checks
throughPyType_IsSubtype
, which is the standard mechanism for subtype
checking in Python. This inconsistency may lead to missed opportunities for
optimization and could introduce subtle differences in behavior. A review of
abstract_issubclass
is recommended to ensure alignment with the proposed
optimization and Python’s existing subtype checking logic.
- The
References
- PEP 1: PEP Purpose and
Guidelines - PEP 3119 : Introducing Abstract
Base Classes - CPython Type Object Documentation:
Details ontp_flags
and type objects