Explicit Deferred References: An Alternative to PEPs 649 and 563

The idea is simply to create a new type called DefRef that produces classes representing deferred references at run time. Combining the new type with two context managers that produce these classes results in a small, mostly backportable API that allows for consistent evaluation both statically and at runtime.

DefRef is a simple class that tracks how it was created and produces more DefRefs upon attribute access. DefRef classes would be interpreted as themselves everywhere in Python. Static and runtime type checkers, or even the get_type_hints function itself, can then resolve a DefRef to the implied type in annotations. This allows them to be easily used not only in CPython but in other implementations, with consistent and easy-to-comprehend results. No weird depends-on-when-you-look-at-it side effects; just a consistent, real object.
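As a rough sketch of that behavior (simplified here to a plain class rather than a metaclass; attribute names like `parent` and `resolved_to` are illustrative, not the actual WIP API):

```python
class DefRef:
    """Sketch: records how it was created; attribute access yields child DefRefs."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # the DefRef this one was derived from, if any
        self.resolved_to = None   # filled in once the real object exists

    def __getattr__(self, attr):
        # Only called for attributes not found normally, so plain
        # attribute access on a DefRef produces another DefRef.
        return DefRef(attr, parent=self)

    def qualname(self):
        # Reconstruct the dotted path this DefRef stands for.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))

json_ref = DefRef("json")
enc = json_ref.encoder.JSONEncoder
print(enc.qualname())  # json.encoder.JSONEncoder
```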

Simple example

# DefRef is a metaclass
a = DefRef("a", (), {})

class a:
    var: a   # at this point `a` is still the DefRef

class b:
    var: a   # by now `a` is the class defined above

a                         # class a
a.__annotations__["var"]  # DefRef(name="a")
b.__annotations__["var"]  # class a

a is replaced in the module immediately, but class a’s annotation does not get updated; this makes for easy-to-understand behavior. Further, it means Python’s normal scoping rules can be used, and you will not have to mentally keep one set of scopes for annotations and another for normal Python code.

By using context managers to control the creation of the DefRefs, Python’s default scoping rules can be used while producing types that static type checkers can explicitly understand.

The first (and likely main) context manager is defer_imports. This temporarily hijacks the import statement to produce DefRefs instead of actually importing the module.

from defref import defer_imports

with defer_imports(globals()):
    from json.encoder import JSONEncoder as jsonenc
    import json

json is not the module json but instead a DefRef that can be used in annotations. Unlike PEP 649, this will actually resolve the large number of circular import issues that currently exist. Further, it preserves the import information so that it can be used at runtime.
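A minimal sketch of how defer_imports could work by temporarily swapping builtins.__import__, as described later in this post (names here are illustrative; only plain `import` statements are handled, since `from` imports would need extra machinery):

```python
import builtins
from contextlib import contextmanager

class Placeholder:
    """Stand-in for the DefRef type in this sketch."""
    def __init__(self, modname):
        self.modname = modname
    def __repr__(self):
        return f"Placeholder({self.modname!r})"

@contextmanager
def defer_imports(ns):
    # ns mirrors the API described above but is unused in this sketch.
    # Swap out builtins.__import__ so `import x` binds a Placeholder
    # instead of actually importing the module.
    real_import = builtins.__import__
    def fake_import(name, globals=None, locals=None, fromlist=(), level=0):
        return Placeholder(name)
    builtins.__import__ = fake_import
    try:
        yield
    finally:
        builtins.__import__ = real_import

with defer_imports(globals()):
    import json

print(json)  # Placeholder('json')
```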

The second context manager is for representing in-module DefRefs. Currently I am calling it a deferral window context; the function is defer_window.

with defer_window(globals(), locals()) as defer:
    a, b = defer("a", "b")

Ideally, if a and b are NOT defined before __exit__ is called, an exception is raised. The DefRefs a and b would resolve to whatever non-DefRef objects occupy the local names a and b when __exit__ is called.
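That exit-time resolution could be sketched roughly as below (a plain dict stands in for the real globals/locals handling, and `Deferred` stands in for DefRef; this is an illustration of the described semantics, not the WIP implementation):

```python
from contextlib import contextmanager

class Deferred:
    """Stand-in for DefRef in this sketch."""
    def __init__(self, name):
        self.name = name
        self.resolved_to = None

@contextmanager
def defer_window(ns):
    handed_out = []
    def defer(*names):
        refs = tuple(Deferred(n) for n in names)
        handed_out.extend(refs)
        return refs
    yield defer
    # On exit, every handed-out reference must now name a real
    # (non-deferred) object in the namespace, or we raise.
    for ref in handed_out:
        value = ns.get(ref.name, ref)
        if isinstance(value, Deferred):
            raise NameError(f"deferred name {ref.name!r} was never defined")
        ref.resolved_to = value

ns = {}
with defer_window(ns) as defer:
    a, = defer("a")
    ns["a"] = int   # stands in for `a = ...` in the window's scope

print(a.resolved_to)  # <class 'int'>
```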

with defer_window(globals(), locals()) as defer:
    a, = defer("a")

    class c: ...

    class d:
        var: a

    a = c

d.__annotations__["var"]  # DefRef(name="a", resolved_to=<class c>)

This allows for predictable results that do not depend on when inspect or get_type_hints is called, something neither PEP 649 nor PEP 563 can offer. It also works with locally renamed variables.

A full example of code that would run and give deterministic results would be as follows.

from defref import defer_imports, defer_window

with defer_imports(globals()):
    from json.encoder import JSONEncoder as jsonenc
    import json

with defer_window(globals(), locals()) as defer:
    a, b, c = defer("a", "b", "c")

    class a:
        var: b

    class b:
        var: c

    class c:
        var: jsonenc
This is currently a WIP. Currently working: import hijacking, scoping rules, context managers, and resolving non-import deferred references. Import hijacking relies on hackily replacing the __import__ builtin and swapping it back afterwards. I am aware that is documented as something not to do; I am fairly confident any issues with it can be overcome, and it was simply the easiest way to wholesale replace import. I would also like to add some form of caching for the modules so that it does not end up creating a bunch of extra, unneeded objects, and to prevent unnecessary delays in resolving deferred references. It would probably also be beneficial to add some way to distinguish between deferred modules and deferred classes.

One other item I would like to add that would, IMO, greatly assist runtime type checking is the ability to request a callback once a type is resolved. This would remove the need to constantly fire functions that blindly check whether something is resolvable yet, which is how the problem is handled right now. DefRefs are intended to more closely resemble weakrefs than forward refs.
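The callback idea could look something like this (a hypothetical API, loosely modeled on the callback parameter of weakref.ref; `Deferred`, `on_resolve`, and `resolve` are invented names for illustration):

```python
class Deferred:
    """Sketch of a deferred reference that notifies listeners on resolution."""

    def __init__(self, name):
        self.name = name
        self.resolved_to = None
        self._callbacks = []

    def on_resolve(self, func):
        if self.resolved_to is not None:
            func(self.resolved_to)       # already resolved: fire immediately
        else:
            self._callbacks.append(func) # otherwise fire later, exactly once

    def resolve(self, value):
        self.resolved_to = value
        for func in self._callbacks:
            func(value)
        self._callbacks.clear()

d = Deferred("a")
seen = []
d.on_resolve(seen.append)   # registered before resolution: no polling needed
d.resolve(int)
print(seen)  # [<class 'int'>]
```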

There is another benefit to them that I haven’t discussed yet: by restoring normal scoping rules, in-function imports will once again be a tool to prevent unnecessary imports. PEP 649 does not resolve this problem.

with defer_imports(globals()):
    import expensive_module as em

def rarefunction(a: int) -> em.internal:  # annotation is DefRef(expensive_module.internal)
    import expensive_module as em
    em.internal  # not a DefRef

This is also adoptable alongside PEP 649, though I think that would realistically make the code far harder to understand. The code is currently around 150 lines of Python without docstrings; it probably needs another 200 lines or so to cover the remaining features. It requires zero modification to the interpreter and should be nearly fully backportable, which would really help adoption in codebases targeting older Python versions. For a non-CPython implementation, depending on what can be done with __import__, there would be between 0 and 2 methods that need to be replaced, both isolated to a single context manager, making it by far the most compatible solution.

This does, however, have more runtime overhead, as it goes back to using real objects instead of, well, not using real objects. I am really not sure I have a good grasp of how much of an issue that actually is. I would imagine that in the most time-sensitive codebases, command-line programs, the benefit of not importing far outweighs the cost of creating the extra objects.


Hi Zachary,

Some much-belated thoughts:

I think the most basic reason this doesn’t really work as a replacement for PEP 563/649 is that it misses one of their key motivations, which is to reduce/eliminate the import-time and memory overhead of annotations in large codebases that are heavily annotated but don’t introspect most of the annotations at runtime. This cost can be very significant, especially with more complex annotations that aren’t just a single name, but have to resolve a series of subscriptings etc.

I still think the idea is interesting as something that could compose with PEP 649. Specifically, it looks like the upcoming updated draft of 649 will include an enhanced ForwardRef that is pretty similar in concept to your DefRef. If someone asks to introspect PEP 649 annotations in a lenient “hybrid” mode, nonexistent names will implicitly be replaced by ForwardRef objects. I think it’s good to have this option available so that e.g. cyclic dataclasses can work out of the box with minimal boilerplate. But if you are writing a library that introspects annotations, there’s no reason you can’t choose to introspect annotations in the default mode (where all names must exist), and ask your users to instead explicitly create ForwardRef objects in advance as needed (including possibly by using a context manager like defer_imports() instead of putting imports under if TYPE_CHECKING:). This approach can coexist nicely in the same ecosystem, since in the end the ForwardRef objects will look and behave the same, whether created explicitly and manually or implicitly via hybrid-mode introspection.

I think there might need to be a clearer demonstration of widespread interest in this explicit approach (perhaps by adoption of a third-party library, since AFAICT everything you’ve suggested is doable as a third-party library) before it would make sense to include this in the standard library.


Carl: upon reading your reply here, it occurred to me that maybe my proposed inspect.get_annotations should sprout a fourth format. So far I’ve proposed three: “values”, “hybrid”, and “strings”. The fourth would be “ForwardRefs”. I expect the “strings” implementation will internally produce ForwardRefs anyway; it’ll calculate the annotation using a ForwardRef, then extract the string and use that as the annotation value. So this fourth format will add only a negligible amount of code. This is much better than the user trying to turn the string back into a ForwardRef themselves, as they may not have the proper globals / locals / closure handy.

My assumption about the initial three formats (which might not match what you were thinking?) was that “strings” format produces strings for everything, and “hybrid” produces real objects where the names are found and ForwardRefs for names not found. (It occurs to me here that perhaps hybrid is not a great name for that mode either, since it’s not really a hybrid or mix between strings and values at all, it’s its own thing entirely. Maybe forwardrefs or lenientvalues is a better name for it?)

The new fourth mode you are suggesting: how is it different from “hybrid”? Does it produce ForwardRefs for all names, whether or not they are found? Or were you thinking of “hybrid” mode as producing a mix of real values and strings, not a mix of real values and ForwardRefs?

Personally, I’m not aware of use cases for an “everything is always ForwardRef” mode, nor for a “mix of real values and strings” mode. So I would suggest that the only three modes we need are “all real values, else NameError”, “mix of real values and ForwardRef”, and “all strings.” The first two modes serve the “want real values” use cases, giving you the option of either being strict or allowing ForwardRef as needed. And the strings mode serves the “annotations as documentation” use cases.


No, “hybrid” mode would produce real values and ForwardRefs.

I was thinking this fourth mode might be useful for somebody–I admit I don’t have a concrete use case–and we get it essentially for free. In order to produce the strings for “strings” mode, we have to first produce a ForwardRef, then extract the final string. It’s easy to sidestep this final extraction and emit the ForwardRef instead.


Makes sense. I guess the only cost of providing this fourth mode that I can think of is the potential that it might confuse/attract a user who really would be better served by one of the other three. So I guess I’d be slightly inclined to just start with the three we know are useful? If someone comes along later with a real need for the all-forwardrefs option, there wouldn’t be any back-compat issue with adding it later. But I also don’t have any problem with just offering it from the start, since it does fall out so naturally from the implementation.

Another theoretical downside: the implementation might change in the future, in such a way that this feature was no longer almost-free. So now we have a feature with negligible benefit and some unknown, hypothetically unbounded cost.

But this doesn’t seem likely to me. If the implementation supports both “strings” and “hybrid” mode, that means the implementation has to be able to produce both strings representing entire annotation expressions and ForwardRef objects, so it should always be easy to produce ForwardRef objects containing entire annotation expressions. I’m having difficulty imagining an implementation where this fourth format brought with it an appreciable additional cost.

I’ll continue to ponder it though.


Hey Carl, thanks for the response. I do have some questions. I think one thing that is hard to understand in general is the actual overhead of annotations themselves and what is causing it. I haven’t seen any benchmarks for the proposals (maybe there are some floating around). I think it may be helpful to autogenerate a large “code” base that can serve as a benchmark tool to measure performance on. Can you share any info on how the codebases you have seen with issues are laid out? Is it mostly fully inline annotations of types, or are they using a mixture of non-imported files that mostly have Prototypes that are used?

I went into this under the impression that the bulk of the memory and runtime issue was from the imports themselves, along with the objects they pulled in, not from the annotation evaluation itself.

If the primary issue is that annotation subscriptions are expensive, would it make sense to do something along the lines of memoization of the results? I know that at the single-call level they are memoized. But if you memoize the entire annotation as a single object, that would seem to resolve this issue more effectively. While it would technically be a “breaking” change to say that repeated annotations of the same underlying object are expected to evaluate to the same object each time, I don’t think it would be remotely outside the current expected behavior.

Ultra rough example:

class Annotation:
    def __init__(self, *args):
        self.ann = args

    def resolve(self):
        # execute the annotation
        ...

resolvedannotations = {}

def MakeAnnotation(*args):
    try:
        return resolvedannotations[args]
    except KeyError:
        ann = Annotation(*args)
        resolvedannotations[args] = ann
        return ann

Then during bytecode compilation all complex annotations get turned into MakeAnnotation function calls. You could pass in whatever you need to fully reconstruct the function calls; I presume something like a string marker for each operation plus the real objects themselves. (Obviously in real use this would be done in C.)

The concern I have with the lazy annotation evaluation proposed by PEP 649 is the lazy binding of names. I don’t think it is an issue to lazily evaluate annotation subscriptions or function calls.

If there is a serious performance issue for annotations with many subscriptions, I don’t think it makes sense to fix the issue only for annotations that are never evaluated.

@larry Would it be terribly complex to do some kind of automatic name mangling so that the original object references could be preserved via co_annotations, but the annotation itself doesn’t need to evaluate the bytecode? I imagine that would sacrifice some speed, as you would need to set the names somewhere, but from what I am gathering here that wouldn’t be nearly as expensive as evaluating the whole object outright.

I did a pretty simple version of this for my talk at the PyCon typing summit last year. I just generated a big file with 10,000 copies of this class and function:

class Person0:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age

    def as_dict(self) -> dict[str, typing.Any]:
        return {"name": self.name, "age": self.age}

    def __repr__(self) -> str:
        return "Person0(" + self.name + ", " + str(self.age) + ")"

def names_of0(people: list[Person0]) -> list[str]:
    return [p.name for p in people]

class Person1:
    ...  # and so on, for 10,000 copies

Obviously it’s quite synthetic, but I think the annotations there are a roughly reasonable representation of what I typically see.

Not sure what you mean by Prototypes; are you thinking of typing.Protocol? Those are rarely used in the codebases I’m familiar with. It’s mostly just normal inline type annotations.

Both can be an issue. Avoiding unnecessary imports is usually mostly a matter of startup time (and avoiding import cycles), since probably sooner or later all those modules will be imported anyway. But even if you eagerly import everything you use in annotations and don’t use if TYPE_CHECKING to guard imports, PEP 563 saves quite a bit of import-time CPU (and memory) relative to 484. (When I checked last year, 649 was slightly less efficient than 563, as you’d expect, but still much better than 484 behavior.)

Can you clarify more precisely what your concern is? I.e. what sort of code do you anticipate will break or have problems?

It’s not a “serious performance concern” in the sense that it is unusually slow. But it is executing Python code and creating Python objects, and that has a cost, and in a large system that cost adds up to something noticeable. It’s a perfectly reasonable cost if you actually need those Python objects at runtime, because you are introspecting annotations. But there’s no reason to pay that cost for the (e.g.) 95% of annotations that are never introspected.

So I disagree – I think it makes very good sense to find a solution where you only pay for what you use. Which is what PEP 649 does.
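The “pay only when you introspect” model can be illustrated roughly as follows (a simplification for this discussion; the actual PEP 649 mechanism uses compiler support and a `__co_annotations__` function rather than this hand-rolled class):

```python
class LazyAnnotations:
    """Sketch: store a function that computes annotations; call it on first access."""

    def __init__(self, compute):
        self._compute = compute
        self._cache = None

    def get(self):
        if self._cache is None:
            self._cache = self._compute()  # pay the evaluation cost only now
        return self._cache

# The names used in the annotation need not exist at definition time,
# because the lambda is not called until introspection.
lazy = LazyAnnotations(lambda: {"people": list[Person]})

class Person: ...

lazy.get()  # resolved using the now-defined name; subsequent calls hit the cache
```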


Example below

class Bug: ...
class Correct: ...

class CustomType(type):

    def __new__(mcls, name, bases, namespace, /, **kwargs):
        namespace["Correct"] = Bug
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        return cls

class OtherClass(metaclass=CustomType):
    val: Correct

# OtherClass.__annotations__ is {'val': <class 'Bug'>}
# should be {'val': <class 'Correct'>}

This now causes values to bind to anything that modifies the namespace during class creation. It’s fine when it’s predictable, like when __prepare__ modifies class creation, but it is wildly unexpected and completely backwards-incompatible for this to occur during __new__. Further, it is no longer possible to get annotations prior to calling type.__new__, which is an entire can of worms in itself.

I am going to poke at it some more, but I actually think this is also causing the namespace mappings to be kept alive. I confirmed that modifying the class after it is returned from __new__ does not change the results, but that modifying the namespace after does. So the annotations appear to be bound to whatever object __prepare__ returns. That mapping is, as far as I understand it, not normally kept alive. On the surface, that seems like it would have some consequential memory overhead.

This means that any metaclass that modifies the namespace instead of modifying the class has to warn all users that they cannot use a single one of those names as a type hint. Because calling type.__new__ actually has side effects, being forced to call it early requires the metaclass to deal with things it wouldn’t have to normally, namely calling __set_name__ manually.

As an example of where this would quickly explode: a metaclass that adds a method called dict to a class will now, at best, result in a function being assigned as the annotation value.

It appears this actually just outright crashes annotation evaluation and claims it has no method. I presume this is an error bubbling up from the function, which has neither a __class_getitem__ nor a __getitem__ method.

class Bug: ...
class Correct: ...

def makedict(self):
    return dict(self)

class CustomType(type):

    def __new__(mcls, name, bases, namespace, /, **kwargs):
        namespace["dict"] = makedict
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        return cls

class OtherClass(metaclass=CustomType):
    val: dict[str, int]
    # evaluating this annotation now subscripts makedict, not the dict builtin

I wouldn’t exactly call a class with a function named dict rare, either. This is code that runs today but will not run under PEP 649, due to side effects of the lazy binding of names.

And yes, that was exactly what I meant; I have clearly been writing too much JavaScript this year, lol.

I can move these to the git repo for co_annotations, but it seems to me to be a fundamental side effect of how it works and not something that is going to be easily overcome.
