A massive PEP 649 update, with some major course corrections

Jelle · April 25, 2023, 4:09pm

The compiler could still simply use the literal 1, but user code could use inspect.AnnotationFormat.VALUE and get a nicer repr() and canonical name. IntEnums compare equal to the corresponding integer.
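A minimal sketch of the IntEnum idea above. The class is a local stand-in, not the real `inspect` API: only the member `VALUE` with value 1 comes from the discussion, and the second member is a placeholder assumed for illustration.

```python
import enum


class AnnotationFormat(enum.IntEnum):
    """Local stand-in for the proposed inspect.AnnotationFormat (hypothetical)."""

    VALUE = 1       # the value the compiler would pass as a literal 1
    FORWARDREF = 2  # placeholder member, assumed for illustration only


def describe(fmt):
    """Accept either a plain int or an enum member and return its name."""
    return AnnotationFormat(fmt).name


# A bare integer and the enum member compare equal.
assert AnnotationFormat.VALUE == 1

print(repr(AnnotationFormat.VALUE))  # readable repr instead of a bare 1
print(describe(1))                   # a plain int is accepted as well
```

Because the member is still an `int`, compiler-generated code that passes a literal 1 and user code that passes the enum member would reach the same branch.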

larry · April 25, 2023, 10:24pm

That sounds reasonable. Tell you what: I’ll experiment with it, and if it seems fine I’ll go with that, but I don’t want to commit to it until I’ve run the experiment. I always worry I’m overlooking something important.

larry · April 26, 2023, 8:33pm

I still don’t know who handles this–I’m working on other things–but it kind of makes intuitive sense that you can’t delete “members” defined in C. So maybe this is just stock behavior.

The important thing is that __annotate__ is always defined, so that it doesn’t get inherited, so we don’t have the same inheritance bug we had with __annotations__ on classes (that I fixed in 3.10). As long as it’s impossible to inherit __annotate__, I’m pretty easygoing about the specific semantics, and if the usual behavior for del o.__x__ when o is implemented in C and __x__ is a “member” is to set it to NULL internally (and None from Python’s perspective) that’s fine by me.
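For readers who have not run into the `__annotations__` inheritance problem mentioned above, here is a small sketch of it. The class names are made up; the version check reflects the 3.10 fix referred to in the post.

```python
import sys


class Base:
    x: int


class Derived(Base):
    pass  # defines no annotations of its own


base_ann = Base.__annotations__
derived_ann = Derived.__annotations__

print(base_ann)
print(derived_ann)

if sys.version_info >= (3, 10):
    # Since 3.10 the attribute is always "defined" per class, so the
    # lookup never falls through to Base.
    assert derived_ann == {}
```

On older interpreters the second lookup fell through to `Base` and returned its dict, which is the behaviour an always-defined `__annotate__` is meant to rule out.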

encukou · April 27, 2023, 8:19am

It uses T_OBJECT. Look at type, not flags.
And yes, that’s the old one – it’s the function object after all. Will you go for consistency, or learn from past mistakes? Doesn’t really matter much, I was just wondering what’s the difference between None and the internal initial value.
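As a point of comparison, an existing function attribute that is a `T_OBJECT` member in CPython, `__doc__`, shows the "NULL internally, None from Python" behaviour being discussed. This is a sketch of observed CPython behaviour, not a statement about how `__annotate__` will end up working.

```python
def documented():
    """Some docstring."""


assert documented.__doc__ == "Some docstring."

# Deleting a T_OBJECT member clears the C-level slot rather than removing
# the attribute, so reading it afterwards yields None instead of raising.
del documented.__doc__
after_delete = documented.__doc__

print(after_delete)
```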

larry · April 27, 2023, 12:08pm

When you put it that way, “learn from past mistakes” sounds pretty good.

The internal initial value will be NULL, and… it’s not __locals__ anymore. At the sprints Wednesday the table I was at (with Jelle, Brandt, and Carl) realized that setting the “slow locals” on a frame that also had “fast locals” was going to lead to problems. Apparently there’s some debug-time automation where you can inspect the frame from inside a debugger and it back-propagates the “fast locals” into the “slow locals” dict? Anyway the approach seemed like it was getting too complicated and invasive for what was intended to be a small, easy, high-leverage change. Reusing locals in this way was elegant and cheap in 3.9 but in 3.12 it no longer seems wise.
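The split between "fast locals" and the "slow locals" dict can be seen from plain Python. The sketch below only illustrates that the two are separate stores; it does not reproduce the debugger behaviour described above.

```python
def demo():
    x = 1                  # stored as a fast local
    namespace = locals()   # dict built from the fast locals
    namespace["x"] = 99    # changes the dict only
    return x, namespace["x"]


fast_value, dict_value = demo()
print(fast_value, dict_value)
```

The fast local keeps its value while the dict holds the new one, so any scheme that stores extra state in the dict has to cope with the two being synchronized behind its back.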

The new approach: I’m adding a new __class_dict__ attribute to the function object (and not the “frame constructor”). __class_dict__ will be read-only, only settable via the constructor, or by a new opcode (keep reading). There will be a new LOAD_CLASS_DICT opcode that works identically to LOAD_NAME, except it uses the frame->f_funcobj->func_class_dict dictionary instead of the frame->f_locals dict. This means we don’t need to make the frame any bigger (which will make Mark Shannon happy). The compiler will bind the class dict of a class to the __class_dict__ of __annotate__ functions defined on member functions of that class, using a SET_CLASS_DICT (intrinsic) bytecode (as opposed to adding a new binary flag to the already-complicated MAKE_FUNCTION oparg). And when generating those functions, the compiler will write LOAD_CLASS_DICT instead of LOAD_NAME.
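A rough Python model of the lookup order described above, assuming the opcode mirrors LOAD_NAME with the class dict in place of the frame locals. The function name and arguments are hypothetical; the real implementation would live in the interpreter loop.

```python
import builtins


def load_class_dict(name, class_dict, global_ns):
    """Model the lookup order: class dict, then globals, then builtins."""
    for namespace in (class_dict, global_ns, vars(builtins)):
        try:
            # Subscripting (rather than .get) lets mapping types with a
            # __missing__ hook take part in the lookup.
            return namespace[name]
        except KeyError:
            continue
    raise NameError(f"name {name!r} is not defined")


class_dict = {"Alias": int}                  # stands in for func.__class_dict__
global_ns = {"Alias": str, "Other": float}   # stands in for func.__globals__

print(load_class_dict("Alias", class_dict, global_ns))  # class dict wins
print(load_class_dict("Other", class_dict, global_ns))  # falls back to globals
print(load_class_dict("len", class_dict, global_ns))    # falls back to builtins
```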

Also, Jelle already knows this as he reviewed the PR, but: I did make his IntEnum change more or less as he suggested it. Finally, I expect to add another special (intrinsic) bytecode to handle the boilerplate at the top of compiler-generated __annotate__ functions, reducing the weight of the boilerplate to two bytes, and likely speeding it up some to boot.

(Brandt described intrinsics as a “dumping ground”, which made them sound enormously appealing!)

Jelle · April 27, 2023, 12:24pm

Just a note that on my PR (gh-103763: Implement PEP 695, python/cpython pull request #103764) I decided to rename the opcode to LOAD_CLASS_OR_GLOBAL, as LOAD_CLASS_DICT sounds to me like we’re loading the class dictionary, while the intended meaning is “look in the class dict first, then the globals”. I am open to bikeshedding on the opcode name, though.

(For context, the implementation of PEP 695 has to solve a very similar problem to Larry’s PEP 649, so we’re planning to merge the underlying mechanism Larry is proposing into 3.12 for these PEP 695 use cases, and hopefully use it again for PEP 649 in 3.13.)

h-vetinari · April 27, 2023, 12:47pm

Has there already been an official pronouncement on the fate of PEP 649 vis-à-vis Python 3.12?

Jelle · April 27, 2023, 12:49pm

It’s very unlikely to make it into 3.12 at this point, but very likely to make it into 3.13 (possibly behind a __future__ import). The SC hasn’t made a final decision yet.

larry · April 27, 2023, 4:52pm

Like I say in person: I take my marching orders from the SC. Whatever they say, goes. If they say they want it in 3.12, by golly I’ll try. If they say they want it in 3.13, I’ll get to relax. If they say they want a future gate, I’ll put one in. (And if they reject the PEP, well, so it shall be.)

larry · April 27, 2023, 4:59pm

I used “class dict” consistently everywhere; the field is __class_dict__, the intrinsic is SET_CLASS_DICT, the opcode is LOAD_CLASS_DICT. I thought the consistency would be of value.

I could see my way to LOAD_CLASS_DICT_OR_GLOBAL. It’s true, the opcode will look in globals. But that’s getting kinda long, and LOAD_NAME doesn’t mention where it’s looking, so that doesn’t seem strictly necessary.

(Then again, maybe the name LOAD_NAME is an anachronism, from the days when it really was the only way to look up a name or something. Hmm.)

Jelle · April 27, 2023, 5:10pm

By sheer coincidence, I just sent you an email suggesting the same name. (Though I put CLASSDICT together, which I think makes it slightly easier to parse: is it “load class, dict, or global” or “load class dict or global”?)

encukou · April 28, 2023, 10:05am

Oh, interesting. But what is __class_dict__? When does it get set? The final dict is made as a class is finalized, but it might be needed before. For an evil example:

from collections import defaultdict
import inspect

class ZeroMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return defaultdict(int, inspect=inspect, print=print)
        
class C(metaclass=ZeroMeta):
    def foo(self, arg: naught = nil):
        return arg
    
    # What is foo.__class_dict__ here?

    print(inspect.get_annotations(foo))

And will the final __class_dict__ be a mappingproxy to protect it from direct modification (bypassing the method cache)?
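For reference, this is the protection that an ordinary class `__dict__` already has today. The sketch only shows existing behaviour and does not say how `__class_dict__` would be exposed.

```python
import types


class C:
    attr = 1


proxy = vars(C)
assert isinstance(proxy, types.MappingProxyType)

try:
    proxy["attr"] = 2        # direct writes through the proxy are rejected
    write_allowed = True
except TypeError:
    write_allowed = False

setattr(C, "attr", 2)        # the supported route, which goes through type.__setattr__

print(write_allowed, C.attr)
```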

steve.dower · April 28, 2023, 10:20am

Would we be better off calling it an “annotations scope dict” rather than “class dict”?

Specifically, I’m thinking of just naming it in a way that makes it specific to handling annotations, and avoid the potential distraction of needing to have it there, not being able to change the behaviour in the future, or people deciding it was intended for other purposes.

Presumably if a function has no annotations, there’s no need for this dict? And we’ll know that at compile time.

larry · April 28, 2023, 4:23pm

If the annotations are on a function defined inside a class, we may need the class’s dict handy to resolve them. So this is that dict, the __dict__ of the class, when we’re generating an __annotate__ for a function defined in that class’s scope.
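A small sketch of why the class's dict is needed. It uses a string annotation and `eval` to stand in for deferred evaluation; the names `C`, `Alias` and `method` are made up for the example.

```python
class C:
    Alias = int  # a name that exists only in the class namespace

    def method(self, arg: "Alias"):  # a string, so nothing is evaluated yet
        return arg


expr = C.method.__annotations__["arg"]  # the string "Alias"

# Without the class namespace, the name cannot be resolved.
try:
    eval(expr, {})
    resolved_without_class = True
except NameError:
    resolved_without_class = False

# With a copy of the class dict as the local namespace, it can.
resolved = eval(expr, {}, dict(vars(C)))

print(resolved_without_class, resolved)
```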

Not being snarky–sure, let’s go ahead and bikeshed the name. Jelle already wanted to remove my underscore; I was calling it class_dict everywhere (__class_dict__, LOAD_CLASS_DICT, etc). He said it was clearer without the underscore, which I didn’t agree with. But I’m not especially in love with the name.

We’ll have a dunder attribute on the function object, and a LOAD_ opcode to load from that (or globals), and a SET_ intrinsic to set it.

I’m not passing judgement on “annotation scope dict” yet, except that I note it’s pretty long, and I don’t look forward to typing __annotation_scope_dict__ and LOAD_ANNOTATION_SCOPE_DICT eight million times as I work on the implementation of 649. Maybe you could come up with something shorter?

Also, I observe that apart from the historic LOAD_NAME, the LOAD_ instructions are consistent about describing the thing being loaded. LOAD_BUILD_CLASS, LOAD_ASSERTION_ERROR. Some incidentally also describe the place it’s being loaded from, LOAD_GLOBAL, LOAD_FAST, but again I think these are more describing the things themselves (a global, a fast local) than the place they’re coming from (globals, fast locals). LOAD_DEREF seems to be describing the unique mechanism itself, I don’t think anyone describes those as “deref variables” or a cell/closure as a “deref”.
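To survey the existing naming, the LOAD_ opcodes of whichever interpreter is running can be listed with `dis`. The exact set differs between CPython versions, so no particular output is assumed here.

```python
import dis

# Collect every opcode name that starts with LOAD_ in this interpreter.
load_ops = sorted(name for name in dis.opmap if name.startswith("LOAD_"))

for name in load_ops:
    print(name)
```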

Anyway, long story short, maybe the instruction should describe the thing we’re loading, and then maybe that will inform us about what to call the storage on the function object for storing that place. Please keep your answers to less than 16 characters, shorter is definitely better. Discuss.

Yes, exactly. If there are no annotations, we won’t need any of this.

Alas, from what I remember of two years ago, beyond that I can only be so smart. My recollection is: I decide whether or not I need this dict based on whether or not I emit a LOAD_NAME opcode for any of the annotation expressions (which, again, we will now change to LOAD_WHATEVER). Unfortunately as I recall the compiler generated LOAD_NAME in places where it was like, *shrug* maybe it’s a global I dunno. So I had more LOAD_NAME instructions than I expected, and I was keeping a reference to the class dict for more __annotate__ functions than I thought should have been necessary.
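The compiler's fallback to LOAD_NAME in class bodies can be checked with `dis`. This sketch compiles a tiny class (the source string and helper name are made up) and lists the names loaded through LOAD_NAME in its body.

```python
import dis
import types


def find_code(code, name):
    """Recursively search co_consts for a nested code object called `name`."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            if const.co_name == name:
                return const
            found = find_code(const, name)
            if found is not None:
                return found
    return None


module_code = compile("class C:\n    value = some_name\n", "<sketch>", "exec")
class_body = find_code(module_code, "C")

# Names the class body loads without knowing whether they are class-level,
# global, or builtin.
loads_by_name = [
    ins.argval
    for ins in dis.get_instructions(class_body)
    if ins.opname == "LOAD_NAME"
]

print(loads_by_name)
```

The compiler cannot tell at compile time where `some_name` lives, so it emits the generic instruction; this is the case that makes it hard to prove that an `__annotate__` function does not need the class dict.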

(Maybe I can do a better job this time around. I already know I had scope bugs in the old implementation.)

steve.dower · May 2, 2023, 9:20am

I’d drop dict immediately - scope adequately implies the type. And agreed it’s better with the underscore.

I’m not sure how we shorten annotation though, without just changing it to hint (or maybe inspect?). And maybe it’s okay for it to be a long name (after you’ve finished typing it all, that is). Basically nobody should be touching it directly anyway, and long names are one way to gently discourage it. [1]

Chances are class_dict and CLASS_DICT are safe enough to do a global find/replace anyway. I only see a few existing instances (in typeobject.c). Or if you’re starting again, ann_scope doesn’t seem to be used anywhere, so you could start with that and replace it all at the end?

And to be clear, I don’t want to bikeshed this just because the name is prettier. I’m genuinely concerned that people will see class_dict in the future and think that it’s a trustworthy way to access the original scope it was defined in. Best to avoid those bugs/complaints by not making it suggest that.

[1] For example, you’re currently feeling discouraged from typing the long name.

larry · May 2, 2023, 9:54am

Drive-by comment: PEP 649 concerns itself with annotations. Type hints are the most popular type of value to set as annotations, but “annotations” and “type hints” are hardly the same thing.

What I was saying all PyCon long: "Annotations are like parameters, and type hints are like arguments."
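To make the distinction concrete: the annotation slot accepts any expression, and a type hint is just one kind of value placed there. The function below is made up for illustration.

```python
def schedule(delay: "seconds", retries: range(1, 5) = 3) -> "job id":
    """None of these annotations is a type, and all of them are legal."""
    return f"job-{delay}-{retries}"


annotations = schedule.__annotations__

for name, value in annotations.items():
    print(name, repr(value))
```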

I was thinking about LOAD_CLASS_VAR or LOAD_CLASS_ATTR, but seeing as how you suggest we avoid the name class_dict, I guess you’re not gonna like those.

All I have left at this point is–go completely another direction with it. How about LOAD_EXTRA_OR_GLOBAL, LOAD_EXTRA_OR_CLOSURE, and extra_dict?

steve.dower · May 2, 2023, 10:02am

Yeah, I’m aware. I also feel like I’ve been losing that battle (not here, but Out In The World), and don’t really mind conceding it if it saves having to clarify every time it comes up.

I have no idea what “extra” means, so it passes the test for “would I use this without reading the docs first”.

Though don’t we already have a co_extra field? Or type or something? I forget what it’s for, in any case.

What about LOAD_DEF_CLOSURE - meaning the closure captured as part of def. I like that “closure” implies that only the necessary names/values are in there.

kknechtel · May 6, 2023, 12:53am

Just a quick question, do I get acknowledgement for the naming?

larry · May 6, 2023, 1:35am

Absolutely! Didn’t I acknowledge you here in the thread?

kknechtel · May 6, 2023, 1:46am

In the text of the PEP, I meant (Acknowledgments section). But I don’t know if there are standard policies for this or anything.
