Make name lookups follow MRO within a class definition

Motivation

In Python, attribute lookups on both a class and an instance follow the MRO, a behavior that is widely considered both DRY and intuitive:

class Base:
    foo = 'bar'

class Child(Base):
    pass

print(Child.foo) # outputs bar
print(Child().foo) # outputs bar

And yet, within the definition of the child class, it often annoys and surprises me, and many others, that there is no equivalent name resolution order:

class Base:
    foo = 'bar'

class Child(Base):
    bar = foo # raises NameError

The typical workaround is to access the attribute explicitly through the base class that defines it:

class Base:
    foo = 'bar'

class Child(Base):
    bar = Base.foo # bar = 'bar'

But having to repeat the name of the base class everywhere in the subclass where an attribute of the base class is needed feels inelegant in terms of DRY.

Many Python users also find the behavior counter-intuitive and the workaround inelegant, so here are my proposals to improve the ergonomics and maintainability of child class definitions:

Proposal #1

Make type.__prepare__ return a custom namespace that falls back to a ChainMap of the __dict__s of the base classes when a given name is not found in the current dict. The behavior is enabled only when a new keyword argument, inherit_namespace, is true.

So the logic of the new type would be similar to:

from collections import ChainMap

class InheritedNamespace(dict):
    """Class-body namespace that falls back to inherited attributes."""

    def __init__(self, bases):
        # Walk each base's MRO so attributes of grandparents resolve too
        # (an approximation of the eventual C3 linearization).
        mro = dict.fromkeys(c for base in bases for c in base.__mro__)
        self.chainmap = ChainMap(*(vars(c) for c in mro))

    def __getitem__(self, name):
        try:
            return super().__getitem__(name)
        except KeyError:
            # Fall back to the attributes inherited from the bases.
            return self.chainmap[name]

class NewType(type):
    @classmethod
    def __prepare__(metacls, name, bases, inherit_namespace=False):
        if inherit_namespace:
            return InheritedNamespace(bases)
        return {}

    def __new__(cls, name, bases, namespace, inherit_namespace=False):
        # Strip the custom keyword argument before delegating to type.
        return super().__new__(cls, name, bases, namespace)

so that:

class Base(metaclass=NewType):
    foo = 'bar'

class Child(Base, inherit_namespace=True):
    bar = foo

print(Child.bar) # outputs bar

Currently, IDEs and type checkers would complain that foo is undefined when it is referenced in the body of Child above, but they should no longer do so once the change is made official.

Backward Compatibility

Existing code will continue to work, since the new behavior is opt-in. However, it should be documented that, because this feature is implemented with a custom namespace class, InheritedNamespace, libraries whose metaclasses use custom namespaces are advised to derive their namespace class from this new built-in one when inherit_namespace is true, so that their metaclasses remain mergeable with other metaclasses.
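
For illustration, a minimal sketch of what that guidance might look like; LibraryNamespace, LibraryMeta and the validation rule are hypothetical:

class LibraryNamespace(InheritedNamespace):
    # A library-specific rule layered on top of the proposed fallback.
    def __setitem__(self, name, value):
        if name.startswith('_reserved'):
            raise TypeError(f'{name!r} is reserved by the library')
        super().__setitem__(name, value)

class LibraryMeta(NewType):
    @classmethod
    def __prepare__(metacls, name, bases, inherit_namespace=False):
        if inherit_namespace:
            return LibraryNamespace(bases)
        return {}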

Performance Impact

There should not be any performance impact on existing code, since name lookups continue to be performed on a regular dict. When inherit_namespace is true, there will be a small performance cost for every name that is not found in the class namespace, because a KeyError must be raised and caught before falling back to the ChainMap.
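
A rough way to gauge that cost, assuming Base and NewType from the snippets above are already defined at module level (the iteration count is arbitrary):

import timeit

plain = timeit.timeit(
    'class C(Base): x = 1',
    globals=globals(), number=10_000)
chained = timeit.timeit(
    'class C(Base, inherit_namespace=True): x = foo',
    globals=globals(), number=10_000)
print(plain, chained)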

But WAIT!

On second thought, what meaningful downside could there possibly be in making an MRO-based fallback the default behavior for all name lookups within all class definitions? A name that fails to resolve inside a class body cannot appear in any working code base, so applying the proposed behavior to all class definitions would only enable what previously raised a NameError.

So here is an even better, less disruptive…

Proposal #2

Introduce a wrapper namespace over the namespace prepared by the metaclass (which defaults to an empty dict prepared by type). The wrapper delegates all accesses to the actual namespace, but falls back to a ChainMap of the __dict__s of the base classes when a name is not found there:

from collections import ChainMap

class InheritedNamespace(dict):
    """Wrapper that delegates to the real namespace, with inherited fallback."""

    def __init__(self, bases, namespace):
        # Walk each base's MRO so attributes of grandparents resolve too.
        mro = dict.fromkeys(c for base in bases for c in base.__mro__)
        self.chainmap = ChainMap(*(vars(c) for c in mro))
        self.namespace = namespace

    def __getitem__(self, name):
        try:
            return self.namespace[name]
        except KeyError:
            # Fall back to the attributes inherited from the bases.
            return self.chainmap[name]

    def __setitem__(self, name, value):
        self.namespace[name] = value

    def __delitem__(self, name):
        # Needed so 'del name' in a class body reaches the real namespace.
        del self.namespace[name]

And re-implement __build_class__ so that it wraps the prepared namespace with the wrapper above for the execution of the class body. The original namespace instance prepared by the metaclass is then passed to the metaclass to build the actual class, making the new namespace buildup completely transparent to the metaclass.

A reference implementation with no error handling for brevity:

import builtins
from types import resolve_bases, _calculate_meta  # _calculate_meta is a private helper

def build_class(func, name, *bases, metaclass=type, **kwargs):
    metaclass = _calculate_meta(metaclass, bases)
    namespace = metaclass.__prepare__(name, bases, **kwargs)
    if (resolved_bases := resolve_bases(bases)) is not bases:
        namespace['__orig_bases__'] = bases
    # Run the class body in the wrapper, using the body function's own globals
    # (zero-argument super() and the __class__ cell are ignored for brevity).
    exec(func.__code__, func.__globals__,
         InheritedNamespace(resolved_bases, namespace))
    return metaclass(name, resolved_bases, namespace, **kwargs)

builtins.__build_class__ = build_class

so that:

class Base:
    foo = 'bar'

class Child(Base):
    bar = foo

print(Child.bar) # outputs bar

Demo here

Backward Compatibility

Unlike Proposal #1, this proposal completely eliminates the need for library maintainers to refactor their custom namespaces to account for a new built-in namespace class.

However, existing code using the following (presumably unlikely) pattern, where the same name is bound both in the base class and outside of it, may break and would have to be refactored. Alternatively, the types module could offer an implementation of __build_class__ that behaves in the old way, to allow an easy override and ease the transition (a sketch follows the example below):

foo = 'baz'

class Base:
    foo = 'bar'

class Child(Base):
    bar = foo # bar is currently 'baz', and would become 'bar' with the proposal
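
The opt-out could then be as simple as this; the name types.legacy_build_class is hypothetical, assuming the old builder were exposed there:

import builtins
import types

# Restore the old, purely lexical scoping for class bodies.
builtins.__build_class__ = types.legacy_build_class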

Performance Impact

There will be a minuscule performance impact on name lookups during the execution of a class body, due to the namespace wrapper's delegation.

Actually, I think my mistake was misunderstanding what Default was in the original example I was writing. I assumed (from capitalization conventions) that it was a class.

Ah, you mean that you thought Default was defined as a class outside EnumParam? Then I misunderstood you, and I may have misunderstood the OP of that topic as well.

Nevertheless, for illustration purposes, let's pretend you did mean that Default is supposed to be inherited from the base class EnumParam. Thanks! :upside_down_face:

1 Like

I’ve updated my original post with what I think is an even better and less disruptive proposal. If you’ve read my first proposal, please help review and consider the second. :slight_smile:

This feature looks specific to EnumParam, so why not implement it in EnumParam? You could use it immediately instead of waiting several years until a Python version with the feature becomes dominant. You could also tune the feature to your needs and quickly fix any issues that come up.

1 Like

No, this feature is not at all specific to EnumParam.

In fact, it has been a highly demanded feature among Python users, many of whom also find having to explicitly reference the base class for a supposedly inherited name in the body of a child class counter-intuitive.

A list of StackOverflow questions asking for this very feature, many of them highly voted:

I'm going to update my original post with links to these SO questions instead, now that I've realized I misunderstood @alicederyn.

This seems like a really bad idea, to be honest, and will probably cause some subtle and confusing bugs. The scope should never be invisible to you; if you reach into a scope that is potentially outside your current file, it should be made explicit (self, super(), import, …).

Inside a class body, something as dynamic as super() doesn't really make much sense, because the lookup happens the moment the class body is evaluated, so it is static for all intents and purposes. At best it would save you the trouble of having to remember which base class an attribute was defined on in a multiple-inheritance scenario. But I also think it's dangerous, since it fools people into thinking super() at class level does the same thing as it does at instance level.

The only real way to defer that lookup so that dependency injection works correctly is to use a classmethod. You could maybe generalize it to a few additional scenarios using a custom descriptor, but I don't think you could make it behave sanely in all circumstances. So I think making it more cumbersome is actually a good thing here, since you have to think about these things.
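
A minimal illustration of the deferred lookup being described (the names are made up):

class Base:
    foo = 'bar'

    @classmethod
    def get_foo(cls):
        # Resolved when called, so subclass overrides are honored.
        return cls.foo

class Child(Base):
    foo = 'baz'

print(Base.get_foo()) # outputs bar
print(Child.get_foo()) # outputs baz, because the lookup was deferred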

7 Likes

If users don't find the “invisible” attribute foo of Child confusing, thanks to their understanding of inheritance:

class Base:
    foo = 'bar'

class Child(Base):
    pass

print(Child.foo) # resolves even though Child does not have foo

Why should they find the “invisible” name foo from inheritance confusing?

class Base:
    foo = 'bar'

class Child(Base):
    bar = foo

As usual, some may find the proposed data model “confusing” at first, only because this has not been the design since day 1. But as the SO questions show, this design would have been intuitive to many.

Because they're not the same thing. foo doesn't actually exist on Child; it exists on Base. It's just that when you make an attribute lookup on Child, the lookup takes the MRO into account and also visits Base. Name lookups inside the class body should not be magically different. You can make an argument for requiring something equivalent to self. or super(). inside the class body, but not for a magical scope extension with no hint to the user.
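
The distinction is easy to verify:

class Base:
    foo = 'bar'

class Child(Base):
    pass

print('foo' in Child.__dict__) # False, foo does not exist on Child
print('foo' in Base.__dict__) # True
print(Child.foo) # outputs bar, because attribute lookup walks the MRO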

But either one is probably a bad idea for the reasons I outlined.

7 Likes

They are the same thing. foo only exists in Base in both cases above, so it's only consistent to resolve the same foo for Child both in the class body and on the class object of Child.

You're missing the point. The fact that the class reference is written <something>.foo signals that it's looked up at runtime via <something>, and therefore that the runtime attribute lookup rules apply. But for an assignment to a bare name, foo = <value>, the rules are Python's normal scoping rules, which most importantly are lexical: they depend only on the names in the lexical context of the assignment, not on runtime state like the MRO.

Lexical scoping for assignments is fundamental to Python’s design, and isn’t going to change (nor should it, IMO).
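
To make the lexical rule concrete, a bare name in a class body resolves through the enclosing module, never through the bases:

foo = 'module'

class Base:
    foo = 'base'

class Child(Base):
    bar = foo # lexical lookup finds the module-level binding

print(Child.bar) # outputs module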

14 Likes

With all due respect, nothing outside the laws of physics and mathematically proven facts is so fundamental to anything that it can't be changed, as long as there are good reasons for the change.

Lexical rules and the data model are only designed to be a means to an end, and that end is user experience. I love Python because I feel it is the programming language that strives hardest to offer the best user experience. And by a good user experience I mean one that follows the Principle of Least Astonishment, i.e. “People are part of the system. The design should match the user's experience, expectations, and mental models.” Python is often regarded as the best first programming language to learn precisely because its design adheres to this principle more than most.

But no programming language is perfectly designed from day 1, which is why I appreciate a developer community like this one that listens to user feedback and evolves the language for the better.

Now back to the topic. I believe that Python's fundamental scoping rule does not match the user's experience, expectations, and mental models when it comes to a class. Currently, Python's scoping rule treats a class scope the same way as a function scope and a module scope. But the user's expectation of a class comes from the common OOD paradigm, in which a subclass by default inherits all properties from its base class (with no visibility restrictions in Python), while a function and a module inherit nothing. That expectation is why Python was designed to let one access an attribute of the base class through a child class object in the first place:

class Base:
    foo = 'bar'

class Child(Base):
    pass

print(Child.foo) # outputs bar

Now, tell me, from a user experience’s point of view, just why shouldn’t one expect to access an attribute of the base class from a child class definition? Why shouldn’t the scope of a class include that of its base class?

class Base:
    foo = 'bar'

class Child(Base):
    bar = foo # foo is naturally expected from inheritance

And this works in Ruby. Why can’t Python?

class Base
	@@foo = 'bar'
end

class Child < Base
	@@bar = @@foo
end

puts Child.class_variable_get(:@@bar) # outputs bar

I've already demonstrated that, with a few tweaks to the current rules and implementation, it can work in Python too. So why treat the current lexical rules and data model as an end in themselves, as a reason not to pursue a better user experience? Rules can evolve, and implementations can be tweaked, as long as the result improves user experience at the end of the day.

Since the numerous SO posts I've linked to have already shown what users expect on this issue, how the rules can evolve and how the implementation can be tweaked to meet that expectation is what I believe we should be discussing here.

1 Like

Ben, with all due respect, you need to slow down. You are all over the place, suggesting a huge change in one place, then moving on to a completely different area and suggesting another huge change, then again, and again. You have 13 different idea topics. In all of them, you strongly insist that the change is essential, ignoring feedback. But you never follow through with any of them. If you think these things are essential, start focusing on getting one through the process rather than creating more. Actually take the time to find a core dev sponsor, write a PEP, submit it to the steering council, and then implement it if it's accepted. PEP 1 – PEP Purpose and Guidelines | peps.python.org Otherwise, you're wasting core devs' and the community's time and attention by continually making them evaluate huge changes and then insisting they're wrong when they tell you they don't like the idea. It's exhausting. You'll notice that no core devs are really interacting positively with your ideas anymore, which should be an indicator that maybe you should reevaluate your approach, since they're the ones you need to convince.

7 Likes

I strongly disagree with that claim. I only insist that the change is essential in this topic and in the metaclass merger one. And I don't ignore feedback.

But you're right that I should start submitting PEPs for the ideas that saw enough support. Thanks.

You can strongly disagree with it, but that doesn't change the fact that, regardless of your intentions, you've left me and others with that impression during your time interacting on these forums. So you can either ignore the feedback, or try to reflect on why we might have that impression and how you can change it.

3 Likes

With all due respect, you appear to have a tinted filter that is hard-wired to ignore all my recent responses to feedback from other users, all of them positive:

Why not judge a proposal by its merit instead?

No, you should first find a core dev who would sponsor a PEP. I've only been skimming this thread, and as @davidism says, this is all over the place and makes huge demands. For this to become a serious proposal, you first have to convince at least one core dev (or other valid sponsor) that it is worth going ahead with. Without even a single core dev interested, your proposal is just yelling into an empty jar: you can get a lovely echo out of it, but you're not convincing anyone of anything, and certainly not getting a language change.

Alternatively, fork Python, make your change, and use it for your own personal use. If you really think this is so much better, you're welcome to make your very own personal build of the language that behaves the way you want it to.

2 Likes

Setting slow mode, so that people take more time to consider the proposal and previous responses before adding their own.

Feedback has been provided to the user about their interactions, but we don’t need to keep hammering it anymore right now. Don’t continue discussion of that feedback here anymore. If this thread (or others) is getting off topic or unproductive in the future, please flag it.

7 Likes

Back to the main topic of the base proposition: I think this would be a bad idea, because I had to wrestle with this exact situation some time ago, and the proposed “fix” would have prevented me from solving my problem.

I was tweaking some code that defined a subclass of random.Random and wrapped its methods. Of course, the file itself imported the random module. The method-wrapping lines looked something like this:

# this is inside the class block
randrange = thewrapper(random.randrange)
sample = thewrapper(random.sample)
random = thewrapper(random.random)

And I needed to add the new binomialvariate method (it wasn't actually that one, but it works for the example). So I added its line, giving the following:

# still inside the class block
randrange = thewrapper(random.randrange)
sample = thewrapper(random.sample)
random = thewrapper(random.random)
binomialvariate = thewrapper(random.binomialvariate)

And it failed. Did you guess why? Because random was defined as a method just above, so it wasn't the random module anymore. The fix for that bug was to move the random = line to the bottom.
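
That is, the working version had to be reordered so the module is still visible where it is used:

# still inside the class block
randrange = thewrapper(random.randrange)
sample = thewrapper(random.sample)
binomialvariate = thewrapper(random.binomialvariate)
random = thewrapper(random.random) # shadows the module only from here on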

All this to say, existing code relies on variables defined outside the class block being readable inside it, even when they share a name with inherited members or methods. What if the variables defined outside the block were to take precedence over your new magic scope extension, you may ask? (That may be what you're proposing with ChainMaps.)

In that case, a given variable name may hold three different meanings inside the class block, with the following order of precedence (unless you want to break compatibility):

  • a method or member you just defined inside the class block
  • if not, a value defined outside the class block
  • if not, a value found in one of the parent classes.

Applied to my earlier example, that gives:
import random

from ... import RandomSubclass # subclass of random.Random

class MyRandom(RandomSubclass):
    randrange = thewrapper(randrange)
    # randrange here would be scoped from the parents and be RandomSubclass's

    # Suppose I want to skip RandomSubclass.sample and directly wrap the native one.
    sample = thewrapper(random.sample)
    # random here is scoped outside the class body because it exists there.

    random = thewrapper(RandomSubclass.random)
    # needs to be accessed that way to avoid collisions with the module defined outside

    # ...and now "random" evaluates to the method!

That's way too much magic in my opinion, and in the rare cases where it would be used, it would probably cause more confusion and mistakes than it would prevent in examples like the ones you gave.

There are things like that in Python which you could consider bad choices. A similar example is the fact that default values for function parameters are evaluated once at definition time rather than at each call, which I find to be a terrible design in the language as it is today; but when you take compatibility into account, it just becomes impossible to change.
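
The classic demonstration of that behavior:

def append_to(item, bucket=[]): # the list is created once, at definition time
    bucket.append(item)
    return bucket

print(append_to(1)) # outputs [1]
print(append_to(2)) # outputs [1, 2], the same list persists across calls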


Also:

That's not true, if by that you mean the scope when writing variables in a class block, in a def block, and at module level respectively. All three have major differences. And even though the function and class scopes look alike, with separate local and global scopes, one interesting difference is that the function already exists when its code gets executed, whereas a class does not yet exist while its body is being executed. That makes implementing your proposition very tricky, if not impossible, once you take into account the MRO rules, which the metaclass can meddle with.


Lastly, I think there could be value in allowing super() inside a class block to generate a sort of sentinel object, a bit like enum.auto(). super().randrange, for example, would hold a sentinel value which, once the class block is fully executed and the class is being built, is resolved the way Ben would like and replaced by (here) RandomSubclass.randrange. But the usability of such a feature would have to be weighed.
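
A rough sketch of that idea, using an explicit placeholder plus a metaclass instead of real super() syntax (all names here are made up):

class _SuperAttr:
    # Placeholder recorded in the class body, resolved once bases are known.
    def __init__(self, name):
        self.name = name

class ResolvingMeta(type):
    def __new__(mcls, clsname, bases, ns):
        cls = super().__new__(mcls, clsname, bases, ns)
        for key, value in list(ns.items()):
            if isinstance(value, _SuperAttr):
                # Look the name up along the MRO, skipping the new class.
                setattr(cls, key, getattr(super(cls, cls), value.name))
        return cls

class Base(metaclass=ResolvingMeta):
    foo = 'bar'

class Child(Base):
    bar = _SuperAttr('foo') # resolved to Base.foo after the body runs

print(Child.bar) # outputs bar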

2 Likes

If I write a symbol in my file in Python, I know exactly what it resolves to. Furthermore, I know it will always resolve to that, unless I rewrite my file.

If superclasses become silently linked into the lookup, I could unexpectedly pick up any new name defined in an arbitrary file owned by a completely different author, including a private name I have no business accessing and had no intention of accessing!

Other languages which have this feature also have a whole combination of features that work to offset its downsides, like true private names. (And they also still have this problem.)

8 Likes