Reduce metaclass conflicts with less eager proclamation, automatic metaclass merges

Motivation

In an ideal OOP world, classes should behave like building blocks: put two of them together, and as long as they are shaped to fit each other, you get something larger with the functionality of both.

While that’s generally true with Python’s multiple inheritance, where developers are trusted to design base classes that cooperate with others (by calling super() and handling the return value appropriately, for example), it is unfortunately not the case for base classes that involve multiple metaclasses. There Python treats developers with kid gloves and is eager to declare a “metaclass conflict”, even when the developers know full well that there is no actual conflict among the metaclasses involved, and even when those metaclasses are written to cooperate with others.

The solution to a metaclass conflict is well known: create a new metaclass that inherits from all the metaclasses involved. Some have even created recipes (such as this and that) to automate the monotonous metaclass merging process.
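As a minimal sketch of that manual workaround (MetaA, MetaB and MetaAB are hypothetical names used only for illustration, not taken from any of the recipes):

```python
class MetaA(type):
    """Hypothetical metaclass of the first base class."""


class MetaB(type):
    """Hypothetical, unrelated metaclass of the second base class."""


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Python rejects this: neither metaclass is a subclass of the other.
try:
    class C(A, B):
        pass
except TypeError as exc:
    conflict_message = str(exc)

# The usual fix: merge the metaclasses by hand and name the result explicitly.
class MetaAB(MetaA, MetaB):
    pass


class C(A, B, metaclass=MetaAB):
    pass


print(conflict_message)
print(type(C).__name__)
```

The hand-written MetaAB is exactly the boilerplate the recipes try to generate automatically.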

But to me, all of the workarounds and the resulting code clutter would’ve been unnecessary had Python not been so eager to declare a “metaclass conflict” prematurely in the first place, or had Python built in a conflict resolution mechanism. To me, the only legitimate “metaclass conflict” that Python should enforce is between metaclasses that prepare different custom namespaces. That’s it. All other uses of multiple metaclasses should work just like multiple inheritance does, where developers are expected to take responsibility for making classes cooperative.

This would solve the problem in a more generic fashion than my other recent proposal to Make abc.ABC a regular class by making __instancecheck__ and __subclasscheck__ class methods, which aimed only at a specific (although the biggest) use case of abc.ABC.

The Proposal

Python should automatically create a merged metaclass with which to create a new class when there are multiple non-type metaclasses in the base classes in the class definition.

Prepared namespaces are merged in the order of MRO, and Python should raise a TypeError only if there are multiple non-dict types of namespaces prepared by the metaclasses involved.

The implementation of the automatic metaclass merging process should be something logically similar to (somewhat adapted from this recipe):

class CombineMeta:
    def __prepare__(self, name, bases, **kwargs):
        namespace = {}
        for metaclass in self._get_most_derived_metaclasses(bases):
            ns = metaclass.__prepare__(name, bases, **kwargs)
            if type(ns) in (dict, type(namespace)):
                namespace.update(ns)
            else:
                if type(namespace) is not dict:
                    raise TypeError('metaclass conflict: '
                                    'multiple custom namespaces defined.')
                ns.update(namespace)
                namespace = ns
        return namespace

    def __call__(self, name, bases, namespace, **kwargs):
        metaclasses = self._get_most_derived_metaclasses(bases)
        if len(metaclasses) > 1:
            merged_name = '__'.join(meta.__name__ for meta in metaclasses)
            ns = self.__prepare__(merged_name, metaclasses)
            metaclass = self(merged_name, tuple(metaclasses), ns, **kwargs)
        else:
            metaclass, = metaclasses or (type,)
        return metaclass(name, bases, namespace, **kwargs)

    @staticmethod
    def _get_most_derived_metaclasses(bases):
        metaclasses = []
        for metaclass in map(type, bases):
            if metaclass is not type:
                metaclasses = [other for other in metaclasses
                        if not issubclass(metaclass, other)]
                if not any(issubclass(other, metaclass)
                        for other in metaclasses):
                    metaclasses.append(metaclass)
        return metaclasses

combine_meta = CombineMeta()

so that:

from abc import ABC
from enum import Enum

class Foo(ABC, Enum, metaclass=combine_meta):
    BAR = 1

Foo.register(int)
print(issubclass(int, Foo)) # outputs True
print(Foo.BAR.value) # outputs 1
print(type(Foo)) # outputs <class '__main__.ABCMeta_EnumType'>

But with the mechanism above built into the language, one can then simply write:

class Foo(ABC, Enum):
    BAR = 1

and the code would work out of the box like putting together building blocks.

Some purists may find the implicit name of the merged metaclass irksome, but there’s precedent in the implicit attribute names created through name mangling. As long as the behavior is documented I don’t think it’s a problem.

Backward Compatibility

There should be no backward compatibility issues because inheriting from multiple classes with distinct metaclasses currently produces a TypeError.

Performance Impact

The automatic metaclass merging process will incur a small overhead when creating classes that have base classes. Since class creation usually occurs only during module loading, the impact on overall performance should be minimal.

Yes…

… but no. What you describe in your opening isn’t what I would call inheritance. You don’t just arbitrarily say that this thing is both a meteorite and a floating-point number and expect that it will magically behave like a combination of them. Composition makes far more sense when you’re taking arbitrary classes and expecting them to work together.

OK, forget about the bad analogy. The demand for multiple inheritance involving multiple metaclasses is real, however. Let’s solve the actual issue and make mixins intuitive to use.

Trouble is, it’s not just an analogy - it’s a summary of the need. And that’s a need that I usually only see when people are trying to use inheritance to combine things that really shouldn’t be inheriting. You give an example of something that is simultaneously an abstract base class AND an enumeration - what is the meaningful concept that this represents? What kind of a thing is both of these? My counter-example of a thing that is both a physical object and a number is similarly nonsensical, and was deliberately chosen.

The links you gave of people trying to solve this problem are entirely abstract and arbitrary. What is the actual real use-case where you have conflicting metaclasses and a useful, meaningful subclass of both of them? In all my years with Python, I have seen a grand total of zero such examples.

No, you have a difference for classes that use metaclasses where one inherits from the other.

Before your change:

class M1(type):
    pass


class M2(M1):
    pass


class A(metaclass=M1):
    pass


class B(metaclass=M2):
    pass


class C(A, B):
    pass


assert type(C) is M2

The assertion is true.

After your change it raises an MRO error when you try to create a new metaclass that inherits from both M1 and M2.

If I use class C(B, A) instead of class C(A, B) the MRO works, but then the assertion is false.
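A minimal sketch of that failure, assuming the merge simply calls type(...) with the metaclasses in base order (the merged names here are made up for illustration):

```python
class M1(type):
    pass


class M2(M1):
    pass


# Merging in the order (M1, M2) puts a base class ahead of its own subclass,
# so no consistent linearization exists.
try:
    type('M1__M2', (M1, M2), {})
except TypeError as exc:
    mro_error = str(exc)
else:
    mro_error = None

# The reverse order linearizes without trouble.
merged = type('M2__M1', (M2, M1), {})

print(mro_error)
print([cls.__name__ for cls in merged.__mro__])
```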

Again, the fact that abc.ABC is designed to create mixin classes and is a metaclass itself makes real-world examples easy to imagine, but not necessarily easy to find in the real world precisely because the risk of metaclass conflicts makes developers ditch the good OOD principle of designing mixin classes with abstract methods.

Consider the example I brought up in my other proposal, where I pointed to the helper mixin class I created to simplify the logic of making a class a context manager, leaving __with__ as an abstract method for a subclass to implement:

from abc import ABC, abstractmethod
from contextlib import contextmanager

class ContextManagerMixin(ABC):
    def __init__(self):
        self._contexts = []

    def __init_subclass__(cls):
        cls.__with__ = contextmanager(cls.__with__)

    @abstractmethod
    def __with__(self): ...

    def __enter__(self):
        context = self.__with__()
        self._contexts.append(context)
        return context.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        return self._contexts.pop().__exit__(exc_type, exc_value, traceback)

Now suppose a developer wants to create a base class for a general resource manager that uses a metaclass for a dynamic class factory pattern, and wants to make the base class a context manager. They can either implement __enter__ and __exit__ themselves or use the helper mixin class. With Python proclaiming a metaclass conflict as soon as they try the helper mixin class, they resort to implementing __enter__ and __exit__ on their own, missing out on reusing a good piece of code.

Multiple inheritance isn’t just about composition, making multiple "is a"s true, but more often about using mixins to incorporate multiple distinct features into a class to maximize code reusability. With abc.ABC being a metaclass often standing in the way, this proposal can make abstract class patterns common again.

Great catch. I’ve updated my reference implementation with the following filter to keep only the most derived metaclasses then:

            metaclasses = [other for other in metaclasses
                if not issubclass(meta, other)]
            if not any(issubclass(other, meta) for other in metaclasses):
                metaclasses.append(meta)

Demo of the code passing the assertion: kpd4pi - Online Python3 Interpreter & Debugging Tool - Ideone.com
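For anyone who wants to check the filter in isolation, here is a small self-contained sketch. The standalone function name most_derived_metaclasses is only for illustration; the logic is the same as the snippet above, applied to the M1/M2 example:

```python
def most_derived_metaclasses(bases):
    """Collect the metaclasses of the bases, dropping any that are
    superclasses of another collected metaclass."""
    metaclasses = []
    for meta in map(type, bases):
        if meta is type:
            continue
        # Drop anything already collected that the new metaclass derives from.
        metaclasses = [other for other in metaclasses
                       if not issubclass(meta, other)]
        # Only add the new metaclass if nothing collected already derives from it.
        if not any(issubclass(other, meta) for other in metaclasses):
            metaclasses.append(meta)
    return metaclasses


class M1(type):
    pass


class M2(M1):
    pass


class A(metaclass=M1):
    pass


class B(metaclass=M2):
    pass


print(most_derived_metaclasses((A, B)))
print(most_derived_metaclasses((B, A)))
```

Either base order leaves only M2, so no merged metaclass needs to be created and type(C) stays M2.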

Using ABCs for this is not the standard Python approach, though. Protocols like this are typically (and correctly, IMO) defined using duck typing, simply by omitting the abstract method from the mixin, and not making it an ABC.

To me, this example simply shows that (Python’s approach to) ABCs aren’t appropriate for this use case, not that there is a problem with how the ABC class (or the meta class mechanism) works. Other languages use ABCs differently, and much more commonly - often combined with an “everything is a class and subtyping is pervasive” approach. But that’s not the case with Python.
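A rough sketch of what the duck-typed version could look like, assuming the mixin simply drops ABC and the abstract method. FactoryMeta and Resource are hypothetical names standing in for the metaclass-based resource manager described earlier:

```python
from contextlib import contextmanager


class ContextManagerMixin:
    """Same helper as before, minus ABC: subclasses are simply expected
    to provide a generator method named __with__."""

    def __init__(self):
        self._contexts = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap __with__ only where a subclass defines it itself.
        if '__with__' in vars(cls):
            cls.__with__ = contextmanager(vars(cls)['__with__'])

    def __enter__(self):
        context = self.__with__()
        self._contexts.append(context)
        return context.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        return self._contexts.pop().__exit__(exc_type, exc_value, traceback)


class FactoryMeta(type):
    """Hypothetical metaclass of the resource manager."""


class Resource(ContextManagerMixin, metaclass=FactoryMeta):
    def __init__(self):
        super().__init__()
        self.events = []

    def __with__(self):
        self.events.append('acquire')
        try:
            yield self
        finally:
            self.events.append('release')


resource = Resource()
with resource as entered:
    inside = list(resource.events)
print(inside, resource.events)
```

Because the mixin's metaclass is plain type, combining it with FactoryMeta raises no conflict; the trade-off is that a missing __with__ only surfaces when the object is first used in a with statement.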

Protocols have the same issue. They also have a metaclass.
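A quick way to see this for yourself (Describable and Color are hypothetical names used only for this sketch):

```python
from abc import ABCMeta
from enum import Enum
from typing import Protocol


class Describable(Protocol):
    """Hypothetical protocol that also carries a reusable method."""

    def describe(self) -> str:
        return f'<{self!r}>'


# Protocol classes are created by a metaclass derived from ABCMeta.
protocol_meta = type(Describable)

try:
    class Color(Describable, Enum):
        RED = 1
except TypeError as exc:
    protocol_conflict = str(exc)
else:
    protocol_conflict = None

print(protocol_meta.__mro__)
print(protocol_conflict)
```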

Concretely, I think the goal in this instance is to:

  • create an enum
  • inherit methods from an ABC

This seems like a not unreasonable goal? After all, inheriting from an ABC is the recommended solution to implement many protocols, which contain many methods but only need a couple to be implemented directly. It would work fine if that ABC didn’t have a metaclass, i.e. if the ABCs weren’t trying to do two orthogonal things.

Is it? I’ve never done it. An enumeration is a collection of predefined things. In what circumstances do they need to be given methods based on an ABC? I asked for a concrete example, an actual use-case. I mean, to be fair, English doesn’t always have a very concrete use-case for its words (why do we have a word for “throwing someone out of the window” but not for “the day after tomorrow”??), but still, it would help.

Here is also an IMO perfectly valid example where the current system breaks without using ABC (since that seems to be controversial):

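from dataclasses import dataclass, is_dataclass
from enum import Enum
from typing import Literal, dataclass_transform

@dataclass_transform()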
class SlotDataclassMeta(type):
    def __new__(mcs, name, bases, attrs):
        # Don't automatically transform children further down the line.
        # This should probably be a smarter check.
        if any(is_dataclass(base) for base in bases):
            return super().__new__(mcs, name, bases, attrs)
        # Yes, this is too simplistic, but could be extended to work properly 
        attrs["__slots__"] = tuple(attrs.get('__annotations__', {}))
        cls = super().__new__(mcs, name, bases, attrs)
        cls = dataclass(cls)
        return cls


class Card(metaclass=SlotDataclassMeta):
    value: Literal[2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K', 'A']
    suite: Literal['heart', 'diamond', 'club', 'spade']

Here we define a new dataclass_transform that correctly sets the __slots__ of the class without creating a subclass. AFAIK, this isn’t possible without a metaclass since we need to take action before the class is created.

class PlayingCards(Card, Enum):
    TWO_OF_SPADES = 2, 'spade'
    TWO_OF_DIAMOND = 2, 'diamond'
    TWO_OF_CLUB = 2, 'club'
    TWO_OF_HEART = 2, 'heart'
    THREE_OF_SPADES = 3, 'spade'
    THREE_OF_DIAMOND = 3, 'diamond'
    THREE_OF_CLUB = 3, 'club'
    THREE_OF_HEART = 3, 'heart'
    # ...

Now if we try to use it like this, we run into a metaclass conflict. This last definition does work when Card uses @dataclass instead, and in fact the enum library has special support for it to create more structured enum values. It would be nice if this could work no matter how our dataclass transform is implemented (be it decorator, base class or metaclass), but the current system heavily discourages metaclass usage because of these conflicts.

The final enumeration is, sure. But you can have a base, abstract, mixin class to combine with an enum (or just make an AbstractEnumMeta).

The usual answer – when they want to make sure a subclass implements an otherwise useless method.

I just realized that the automatic metaclass merger function object needs to have a __prepare__ method itself, or the actual class body would get executed inside the namespace of a regular dict before getting passed to the __prepare__ method of the base metaclasses, by which time it’s too late for a custom namespace to be fed with the original class body.
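To illustrate the difference, here is a small sketch. RecordingDict, function_merger and ObjectMerger are made-up names, and both mergers just fall back to type to build the class:

```python
class RecordingDict(dict):
    """Hypothetical custom namespace, standing in for e.g. the one enum prepares."""


seen = []


def function_merger(name, bases, namespace, **kwargs):
    # A plain function has no __prepare__, so the class body has
    # already run in an ordinary dict by the time we get here.
    seen.append(type(namespace))
    return type(name, bases, dict(namespace), **kwargs)


class ObjectMerger:
    def __prepare__(self, name, bases, **kwargs):
        # Looked up on the instance and called before the class body executes.
        return RecordingDict()

    def __call__(self, name, bases, namespace, **kwargs):
        seen.append(type(namespace))
        return type(name, bases, dict(namespace), **kwargs)


class ViaFunction(metaclass=function_merger):
    pass


class ViaObject(metaclass=ObjectMerger()):
    pass


print(seen)
```

Only the callable object gets to choose the namespace the class body runs in, which is why the merger cannot stay a plain function.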

So I’ve rewritten the automatic metaclass merger as a callable singleton instead. Also made the callable recursive so that it would resolve conflicts of metaclasses’ metaclasses if someone wants to use the code today. The original hope was that once this proposal gets accepted and becomes a built-in mechanism, there would be no need for recursion since any conflicting metaclasses of metaclasses would get merged first before the subsequent conflicting metaclasses are defined.

class CombineMeta:
    def __prepare__(self, name, bases, **kwargs):
        namespace = {}
        for metaclass in self._get_most_derived_metaclasses(bases):
            ns = metaclass.__prepare__(name, bases, **kwargs)
            if type(ns) in (dict, type(namespace)):
                namespace.update(ns)
            else:
                if type(namespace) is not dict:
                    raise TypeError('metaclass conflict: '
                                    'multiple custom namespaces defined.')
                ns.update(namespace)
                namespace = ns
        return namespace

    def __call__(self, name, bases, namespace, **kwargs):
        metaclasses = self._get_most_derived_metaclasses(bases)
        if len(metaclasses) > 1:
            merged_name = '__'.join(meta.__name__ for meta in metaclasses)
            ns = self.__prepare__(merged_name, metaclasses)
            metaclass = self(merged_name, tuple(metaclasses), ns, **kwargs)
        else:
            metaclass, = metaclasses or (type,)
        return metaclass(name, bases, namespace, **kwargs)

    @staticmethod
    def _get_most_derived_metaclasses(bases):
        metaclasses = []
        for metaclass in map(type, bases):
            if metaclass is not type:
                metaclasses = [other for other in metaclasses
                        if not issubclass(metaclass, other)]
                if not any(issubclass(other, metaclass)
                        for other in metaclasses):
                    metaclasses.append(metaclass)
        return metaclasses

combine_meta = CombineMeta()

Good example, although to be fair, this can be implemented without a metaclass since we now have __init_subclass__:

from enum import Enum
from dataclasses import dataclass
from typing import dataclass_transform, Literal

@dataclass_transform()
class SlotDataclass:
    def __init_subclass__(cls):
        cls.__slots__ = tuple(vars(cls).get('__annotations__', {}))
        dataclass(cls)

class Card(SlotDataclass):
    value_: Literal[2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K', 'A']
    suite: Literal['heart', 'diamond', 'club', 'spade']

class PlayingCards(Card, Enum):
    TWO_OF_HEART = 2, 'heart'
    THREE_OF_SPADES = 3, 'spade'

so that:

print(list(PlayingCards))

outputs:

[PlayingCards(value_=2, suite='heart'), PlayingCards(value_=3, suite='spade')]

Demo here

Note that Enum members already have a value attribute so the value of a Card has to be named something else.

No, that doesn’t behave correctly. Setting __slots__ after the class has been created is a no-op. You can still set other attributes on the instances, and the attributes aren’t stored as slots, defeating the point of a slotted dataclass.
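A minimal sketch of the difference, using two throwaway classes (Late and Early are hypothetical names):

```python
class Late:
    pass


# Assigning __slots__ after the class exists only adds a class attribute;
# instances still get a __dict__.
Late.__slots__ = ('x',)
late = Late()
late.y = 1  # accepted

class Early:
    # Declared in the class body, so it takes effect at class creation.
    __slots__ = ('x',)


early = Early()
try:
    early.y = 1
except AttributeError:
    early_rejected = True
else:
    early_rejected = False

print(late.__dict__, early_rejected)
```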

Ah, my bad. You’re absolutely right. You have a perfectly valid case then.

Well, I only just noticed that since Python 3.10 there’s a slots option built into dataclass already, so at least your specific use case has been taken care of:

from enum import Enum
from dataclasses import dataclass
from typing import Literal

@dataclass(slots=True)
class Card:
    value_: Literal[2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K', 'A']
    suite: Literal['heart', 'diamond', 'club', 'spade']

class PlayingCards(Card, Enum):
    TWO_OF_HEART = 2, 'heart'
    THREE_OF_SPADES = 3, 'spade'

print(PlayingCards.__slots__) # outputs ('value_', 'suite')
PlayingCards.TWO_OF_HEART.value.name = 'foo' # raises AttributeError

However, it has to be said that there never was a good reason IMHO for dataclass to be a class decorator to begin with if not for fear of a metaclass conflict. People have to use awkward workarounds to get __init_subclass__ to see dataclass fields, for example, only because dataclass is a decorator and not a metaclass-driven class.
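A small sketch of that ordering problem (Base and Point are hypothetical names): the hook runs when the class statement finishes, which is before the decorator is applied.

```python
from dataclasses import dataclass, fields

seen_in_hook = {}


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Record whether dataclass has already processed the class.
        seen_in_hook[cls.__name__] = '__dataclass_fields__' in vars(cls)


@dataclass
class Point(Base):
    x: int
    y: int


field_names = [f.name for f in fields(Point)]
print(seen_in_hook, field_names)
```

The hook sees no fields yet, even though the finished class has two.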

Bottom Line

Reducing metaclass conflicts with a built-in resolution/merging mechanism will encourage developers to make better design choices without fear of affecting user experience.

Of course, the @dataclass(slots=True) system doesn’t play well with other parts of metaprogramming:

from collections import defaultdict
from dataclasses import dataclass

class RegistryClass:
    registry = defaultdict(list)

    def __init_subclass__(cls, **kwargs):
        if (key := kwargs.pop('key', None)) is not None:
            cls.registry[key].append(cls)
        super().__init_subclass__(**kwargs)


@dataclass(slots=True)
class SlottedDataclass(RegistryClass, key="slotted"):
    a: int


print(RegistryClass.registry['slotted'][0] is SlottedDataclass)

results in the wrong SlottedDataclass object inside of RegistryClass.registry, since dataclass is creating a second one (and not passing along any kwargs, because the dataclass doesn’t have access to those).

Yes absolutely. More reason to free developers from fear of using metaclasses.

By using the metaclass wrapper I posted in the Help forum, it is able to pass your test case since it calls __init_subclass__ to register the class object only after the dataclass decorator returns the new class.

As for passing on kwargs it’s certainly a problem as well, and I have to work around it by storing it in a _dataclass_kwargs attribute to pass it on:

_dataclass_keywords = set((code := dataclass.__code__).co_varnames[
    code.co_argcount:code.co_argcount + code.co_kwonlyargcount])

class DataclassMeta(type):
    def __new__(metacls, name, bases, namespace, **kwargs):
        class InitSubclassBlocker:
            def __init_subclass__(cls, **kwargs):
                pass
        dataclass_kwargs = namespace.get('_dataclass_kwargs') or {
            key: kwargs[key] for key in kwargs.keys() & _dataclass_keywords}
        configured_dataclass = dataclass(**dataclass_kwargs)
        cls = super().__new__(
            metacls, name, (InitSubclassBlocker,) + bases, namespace, **kwargs)
        if classcell := namespace.get('__classcell__'):
            cls.__classcell__ = classcell
        cls._dataclass_kwargs = dataclass_kwargs
        if 'slots' not in dataclass_kwargs or '__slots__' not in namespace:
            cls = configured_dataclass(cls)
        if hasattr(cls, '__classcell__'):
            del cls.__classcell__
        del InitSubclassBlocker.__init_subclass__
        super(cls, cls).__init_subclass__(**kwargs)
        return cls

and with the helper class Dataclass now inheriting from a base class that defines a __init_subclass__ to remove known dataclass keywords:

class DataclassBase:
    def __init_subclass__(cls, **kwargs):
        for key in kwargs.keys() & _dataclass_keywords:
            del kwargs[key]
        super().__init_subclass__(**kwargs)

@dataclass_transform()
class Dataclass(DataclassBase, metaclass=DataclassMeta):
    pass

Your test case would then output True:

class RegistryClass:
    registry = defaultdict(list)

    def __init_subclass__(cls, **kwargs):
        if (key := kwargs.pop('key', None)) is not None:
            cls.registry[key].append(cls)
        super().__init_subclass__(**kwargs)

class SlottedDataclass(RegistryClass, Dataclass, key="slotted", slots=True):
    a: int

print(RegistryClass.registry['slotted'][0] is SlottedDataclass) # outputs True

Demo here

And a full demo of your PlayingCards test case mixing RegistryClass, Dataclass and Enum with the proposed combine_meta here

class Card(RegistryClass, Dataclass, key="slotted", slots=True):
    value_: Literal[2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K', 'A']
    suite: Literal['heart', 'diamond', 'club', 'spade']

class PlayingCards(Card, Enum, metaclass=combine_meta):
    TWO_OF_HEART = 2, 'heart'
    THREE_OF_SPADES = 3, 'spade'

print(list(PlayingCards)) # outputs [<PlayingCards.TWO_OF_HEART: value_=2, suite='heart'>, <PlayingCards.THREE_OF_SPADES: value_=3, suite='spade'>]

print(PlayingCards.__slots__) # outputs ('value_', 'suite')
print(PlayingCards.registry['slotted']) # outputs [<class '__main__.Card'>]

Things should just work like putting together building blocks. :grinning: