Make MRO topological sorted

Python MRO is a linear ordering of the base classes that is used to decide the order in which inherited methods are overridden. As mentioned in @guido’s blog:

Basically, the idea behind C3 is that if you write down all of the ordering rules imposed by inheritance relationships in a complex class hierarchy, the algorithm will determine a monotonic ordering of the classes that satisfies all of them. If such an ordering can not be determined, the algorithm will fail.

Unfortunately, sometimes even when there is a consistent monotonic ordering, the current MRO implementation might be failed to find it. For example, the following code will raise an error:

class A:
    pass

class B:
    pass

class A0(A):
    pass

class A1(A):
    pass

class A0_B(A0, B):
    pass

class B_A1(B, A1):
    pass

class A0_B_A1(A0_B, B_A1):
    pass
TypeError: Cannot create a consistent method resolution
order (MRO) for bases A, B

A workaround it to add more hints to the MRO. For example, we can change class A0_B(A0, B) to class A0_B(A0, B, A):

class A:
    pass


class B:
    pass


class A0(A):
    pass


class A1(A):
    pass


# Changed from `class A0_B(A0, B):`
class A0_B(A0, B, A):
    pass


class B_A1(B, A1):
    pass


class A0_B_A1(A0_B, B_A1):
    pass

However, the workaround is not always possible, because the user of A0_B and B_A1 might not be able to change A0_B’s implementation.

I think the best solution is to perform a proper topological sorting when determining the MRO of a class, instead of merging the MROs of the base classes.

This is not a hint, though - it’s an override. You’re changing the resolution order. And since there are multiple conflicting ways that this can be done, it MUST be done explicitly. Without it, A would come before B; you’re overriding that to specify that B comes before A.

Do you have an actual use-case for this, or is it just for toy examples? I’ve never come across any real-world hierarchy in which this linearization has been a problem.

1 Like

Another example here:

from abc import ABC, abstractmethod
from typing import Generic, TypeVar, final
from uuid import UUID

ElementType = TypeVar('ElementType')

class ToReprStr(Generic[ElementType]):
    @final
    def to_repr_str(self, element: ElementType):
        return repr(element)

class LoggingMixin(ToReprStr[UUID], ABC):
    @abstractmethod
    def run(self) -> UUID:
        ...

    @final
    def logged_run(self, source: bytes):
        result = self.run()
        print(self.to_repr_str(result))
        return result

class ToStr(Generic[ElementType]):
    @final
    def to_str(self, element: ElementType):
        return str(element)

class AbstractFromStr(ABC):
    @abstractmethod
    def from_str(self, source: str):
        ...

class UUIDFromStr(AbstractFromStr):
    @final
    def from_str(self, source: str):
        return UUID(source)

class UUIDFromBytes(UUIDFromStr, ToStr[bytes]):
    @final
    def uuid_from_bytes(self, source: bytes):
        return self.from_str(self.to_str(source))

class RunParsingUUID(UUIDFromBytes, LoggingMixin):
    @final
    def run(self):
        return self.uuid_from_bytes(b"id")

Suppose ToReprStr and LoggingMixin are from a library to schedule some task in run and log the result. ToStr, AbstractFromStr, UUIDFromStr, and UUIDFromBytes are classes from another serialization/deserialization library.

Now you want to create RunParsingUUID using both libraries, and you got the error:

TypeError: Cannot create a consistent method resolution
order (MRO) for bases ABC, Generic
1 Like

This is still looking very artificial and contrived. I ask again: Do you have an actual use-case for this?

I would expect to see this sort of code in an enterprisey Java codebase.

Yes it’s an artificial example. I have an actual example but I cannot share it here right now. I am happy to see that you can get the point that there could be some code similar to this.

@davidism Please don’t move this thread to Python Help. It’s not a post to ask help. Instead, it aims to discuss a design issue in C3 linearization. Especially, our current implementation of MRO does not actually fulfill @guido 's idea:

Basically, the idea behind C3 is that if you write down all of the ordering rules imposed by inheritance relationships in a complex class hierarchy, the algorithm will determine a monotonic ordering of the classes that satisfies all of them. If such an ordering can not be determined, the algorithm will fail.

The example clearly shows that the current MRO implementation sometimes can fail to find a monotonic ordering even when the monotonic ordering exists.

Also the issue can be fixed by properly implementing topological sorting in the MRO implementation

So far, all I’ve seen is that you can create toy examples, which can be done with anything. Are you able to create a cut-down version of the actual example?

What do you mean? There is no such ordering with what you have given, and you have to override the normal rules in order to create one. Comparing your example with the one you say adds a “hint”:

>>> A0_B.__mro__
(<class '__main__.A0_B'>, <class '__main__.A0'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>)
>>> class A0_B(A0, B, A):
...     pass
... 
>>> A0_B.__mro__
(<class '__main__.A0_B'>, <class '__main__.A0'>, <class '__main__.B'>, <class '__main__.A'>, <class 'object'>)

Note that the second version places B before A. This is a violation of expectation; the obvious linearization of A0(A) followed by B would place A before B. Python allows you to disagree explicitly, but this isn’t something that will happen automatically.

What do you mean? There is no such ordering with what you have given, and you have to override the normal rules in order to create one. Comparing your example with the one you say adds a “hint”:

The purpose of linearization is to let the Python users to specify dependencies between classes, and the Python runtime will find a monotonic ordering. A proper implementation will require the MRO to be delayed determined by topological sorting, for each concrete class at least. For example, A0_B(A0, B) added the following restrictions in MRO:

A0_B -> A0
A0 -> B

Because of A0(A), we also have:

A0 -> A

Note that the user did not specify the relationship between A and B, therefore, ideally the Python runtime should have the flexibility to determine the order between A and B according to @guido 's idea.

Unfortunately, the current implementation is based on merging MROs, not topological sorting, as a result, the order between A and B is not late-bound, resulting the error.

A better linearization implementation should be based on topological sorting.

What is worse is when A is ABC and B is Generic. Because ABC and Generic are independent from each other, users basically have no control to their ordering, and they can create conflicting MRO easily as shown in the RunParsingUUID example.

Another problem of MRO merging is that it violates the assumption of backward-compatibility.

People usually assume that extracting parts of your class into an interface would not break backward-compatibility. Unfortunately, because of the current MRO implementation, the assumption is false.

For example, the following code would not result in an error:

from abc import ABC, abstractmethod
from typing import Generic, TypeVar, final
from uuid import UUID

ElementType = TypeVar('ElementType')

class ToReprStr(Generic[ElementType]):
    @final
    def to_repr_str(self, element: ElementType):
        return repr(element)

class LoggingMixin(ToReprStr[UUID], ABC):
    @abstractmethod
    def run(self) -> UUID:
        ...

    @final
    def logged_run(self, source: bytes):
        result = self.run()
        print(self.to_repr_str(result))
        return result

class ToStr(Generic[ElementType]):
    @final
    def to_str(self, element: ElementType):
        return str(element)

class UUIDFromStr:
    @final
    def from_str(self, source: str):
        return UUID(source)

class UUIDFromBytes(UUIDFromStr, ToStr[bytes]):
    @final
    def uuid_from_bytes(self, source: bytes):
        return self.from_str(self.to_str(source))

class RunParsingUUID(UUIDFromBytes, LoggingMixin):
    @final
    def run(self):
        return self.uuid_from_bytes(b"id")

However, if the author of the serialization library lets UUIDFromStr implement an interface, RunParsingUUID will be broken:

class AbstractFromStr(ABC):
    @abstractmethod
    def from_str(self, source: str):
        ...

class UUIDFromStr(AbstractFromStr):
    @final
    def from_str(self, source: str):
        return UUID(source)

But that’s not MRO breaking compatibility, it’s the people who made a bad assumption about what would happen?

In contrast, changing the behavior from MRO would be massive breaking change.

There are commits in CPython that already broke the backward compatibility.
For example, in this commit, _GeneratorContextManagerBase is introduced, and the order between AbstractContextManager and ContextDecorator are changed, therefore, it is a backward incompatible change.

In fact, I guess the reason why AbstractContextManager and ContextDecorator were swapped is to work around the inconsistent MRO error in tests when introducing _GeneratorContextManagerBase. From my own experience, I sometimes need to randomly adjust the order of super classes in order to pass MRO check, even when the super classes look independent.

Unfortunately, randomly changing the order like the commit did would create more breakage in users’ code.

1 Like

Those changes may have broken compatibility, sure. My point is that’s not possible for the current method of MRO to “break compatibility”. It defines compatibility.

Perhaps what you meant is that a different method would allow changing inheritance in a way that would currently break compatibility. That’s different though.

2 Likes

This thread was created in the Idea board and improperly got moved to Python Help board. The motivation of this thread is to discuss the issues in the current MRO implementation. One of the issues is that the MRO implementation defines compatibility in a counterintuitive way, preventing library authors from keeping backward compatibility when adding features.

1 Like

I understand, but changing the method resolution order is also a backwards-incompatible change, and it affects everyone. Somewhat improving the usability for new code can’t take precedence over that. So changing the method is clearly a non-starter.

Perhaps there could be improved documentation or better tooling to help people avoid those issues.

I think it is possible to change the MRO implementation in a backward-compatible way. Because current MRO implementation is too strict, a looser implementation would not break backward-compatibility as long as it produces the same result for existing working cases.

1 Like

I don’t see how that disqualify is this as a legitimate example. This seems like a completely legitimate way to structure libraries and components within an application.

As for the idea, is it always the case that the current linear MRO is a strict subset of the topologically-sorted MRO? If that’s true, then I don’t see why this would be a breaking change for users.

3 Likes

Inheriting from classes to manage formatting and logging? That isn’t something I see in most Python codebases. And the rules of Java are VERY different. (I’m not going to say whether it’s good or bad in a Java context, although I will say that “enterprisey” is not exactly a compliment.) Hence my repeated call for an actual Python use-case.

I think you are essentially saying “I don’t use multiple inheritance, so any issues in MRO don’t bother me, and any usage of multiple inheritance is not Pythonic to me”.

IMHO, Python is a programming language that allows people to use it in different ways. In particular, built-in classes like ABC and Generic are commonly used in multiple inheritance in modern Python code. I believe the current MRO implementation would prevent ABC and Generic from being used in sophisticated use cases.

That is most definitely not true. Please don’t read into my words what I didn’t put there. I have used MI in many languages, including Python, and I have not had any problem with Python’s MRO rules.