Adding a Proxy type to typing/typing_extensions

What are your thoughts on adding a type Proxy[T]? This would allow LSP autocomplete for members of T while still allowing a distinction between T and the proxy. Relevant issue - Wrapper/Proxy Generic Type · Issue #802 · python/typing · GitHub

My main questions:

  • How difficult would this be to implement?
  • Is the proxy pattern used widely enough that it’s worth the effort?

That GitHub discussion has a variety of different understandings and assumed semantics for the Proxy type. You should clarify exactly what semantics you want and for what use case. (Also, should it have runtime behavior?)

What Guido says is still true: This feature is complex enough to require a PEP, and buy-in from multiple static type checkers. This is quite a lot of work someone has to do.

I think this could be useful, as the issue has come up a number of times, but the idea needs to be more precisely defined.

What exactly is a proxy? What operations are proxied (just attributes or also binary/other operations implemented through dunders)? How should isinstance() be handled by type checkers? If a stub says that a function accepts an int, should I be able to pass a Proxy[int]?


I figured I would create a post here before putting a bunch of work into writing a PEP draft, and see what people thought about the semantics. I did try to get in touch with mypy developers about this to see how much work it would be (and if I could help), but I assume they’re busy people 🙂

the idea needs to be more precisely defined.

Good point! My view is that a proxy is a Python object that forwards attribute access (including methods and dunder methods) to another object. Currently, I’m imagining usage something like this:

class MyProxy(Proxy[SomeClass]):
    def some_method(self) -> None: ...

An instance of MyProxy should still have access to all the attributes/methods of SomeClass, in addition to the methods/attributes defined on itself. In case of a conflict, the method/attribute defined on the proxy itself should be the one used for type checking. The Proxy generic should have no effect at runtime, so the following assertion should still fail:

proxy_a = Proxy[A]()
assert isinstance(proxy_a, A)  # fails

This keeps Proxy[A] a distinct type from A, and type checkers don’t narrow the type of proxy_a to A, which would lose all the methods on the Proxy that are not on the proxied type.
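At runtime, this precedence falls out naturally from __getattr__ semantics, since __getattr__ is only called when normal attribute lookup fails. A minimal hand-rolled sketch (MyProxy and SomeClass here are illustrative stand-ins, not the proposed typing feature):

```python
class SomeClass:
    def some_method(self) -> str:
        return "proxied"

    def other(self) -> str:
        return "other"


class MyProxy:
    def __init__(self, wrapped: SomeClass) -> None:
        self._wrapped = wrapped

    def __getattr__(self, name: str):
        # Only invoked when normal lookup fails, so methods
        # defined on the proxy itself automatically win.
        return getattr(self._wrapped, name)

    def some_method(self) -> str:  # conflicts with SomeClass.some_method
        return "own"


p = MyProxy(SomeClass())
print(p.some_method())  # "own": the proxy's definition wins
print(p.other())        # "other": forwarded to the wrapped object
```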

If a stub says that a function accepts an int, should I be able to pass a Proxy[int]?

In the ideal world, I would say yes, due to the duck-typing nature of the current type system. However, recognizing that it’s probably a lot of work, I’m open to thoughts on this matter.

If you’re adding a new type system feature, you get to define the ideal world. But I’m not sure what the right answer is here.

The current type system is to a large extent nominal, not based on duck-typing. If you pass a Proxy[int] to a standard library function that accepts an int, there’s a good chance it will break. On the other hand, perhaps a Proxy[T] type would be a lot less useful if it wasn’t made assignable to T.
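That breakage is observable today with a naive hand-rolled proxy (IntProxy below is a hypothetical illustration, not a proposed API): __getattr__ forwards ordinary attributes, but CPython looks up special methods on the type and bypasses __getattr__, so the proxy fails where a real int works.

```python
class IntProxy:
    def __init__(self, wrapped: int) -> None:
        self._wrapped = wrapped

    def __getattr__(self, name: str):
        # Forwards ordinary attribute access to the wrapped int.
        return getattr(self._wrapped, name)


p = IntProxy(3)
print(p.bit_length())  # forwarded: 2

try:
    p + 1  # __add__ is looked up on type(p), not via __getattr__
except TypeError:
    print("not a real int")
```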

After some thinking, I don’t think it makes sense for a Proxy[T] to be allowed wherever T is. I would want mypy (or whatever type checker I’m using at the moment) to warn me of something like this:

def foo(x: int) -> None:
    if not isinstance(x, int):
        raise TypeError(...)

foo(Proxy[int]())  # should be a type error

If some code is certain that a Proxy[T] would work, it could always do something like

from typing import cast

def foo(x: int) -> None:
    ...

p = Proxy[int]()
foo(cast(int, p))

Back to the point, not allowing Proxy[T] where T is allowed could lower the usefulness of a Proxy in the short term.
However, in the long term (if it sees widespread adoption) it could also serve as a distinction between “checked” code (code like foo above, where the input is validated) and unchecked code, where a Proxy would be allowed.

Hmm, why not override __instancecheck__? You could give Proxy a custom __instancecheck__ that would lead to isinstance(Proxy[int](), int) being True. It’s unclear to me why a proxy that forwards all the methods some type X has shouldn’t also be treated like X. We similarly have __instancecheck__ defined for some of the collections.abc classes.

I think several concrete examples (even if toy/simplified) are needed. What is your goal of having proxy vs subtype of a class?
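One mechanical note on this: isinstance(obj, int) invokes __instancecheck__ on type(int) (int’s metaclass), not on the proxy, so a Proxy class can’t hook that check from its own side. Runtime proxies such as wrapt instead spoof isinstance through a __class__ property, which isinstance falls back to. A minimal sketch of that trick (IntProxy is hypothetical):

```python
class IntProxy:
    def __init__(self, wrapped: int) -> None:
        self._wrapped = wrapped

    @property
    def __class__(self):
        # isinstance() falls back to obj.__class__ when
        # type(obj) is not a subclass of the target.
        return type(self._wrapped)

    def __getattr__(self, name: str):
        return getattr(self._wrapped, name)


p = IntProxy(3)
print(isinstance(p, int))  # True: isinstance consults __class__
print(type(p).__name__)    # 'IntProxy': type() is not fooled
```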

Is the proposal going to be to create a new Proxy type in the standard library (possibly with special typing support), or to add some typing-only concept of a Proxy that can describe different proxies that exist in popular libraries? If it’s the former, the proposal will have to explain why this concept is so important and generally useful that it needs to be in the standard library. If the latter, the proposal will have to explain what different existing proxies have in common and how a single new typing concept can cover a substantial subset of existing use cases.

I haven’t personally felt a need for some sort of Proxy type in the type system, but I’ve seen that the concept has come up repeatedly as something people have been asking for. However, it’s not clear to me that they actually all want the same thing. It’s easy for many different users to say they need proxies in some abstract sense, but in practice it may turn out that they all want something else out of it, and it’s impossible to design a type system feature that covers all or even most concrete use cases.

As for __instancecheck__, if we go with the approach of adding a new Proxy class to the standard library, we could add such a method. However, overriding __instancecheck__ is a dangerous mechanism that can lead to confusion, and it doesn’t solve cases where the object is passed to a C function that really only accepts an int, not something that pretends to be an int. If we go with the other approach of adding a type that provides a common structure for third-party types, we should look at whether existing third-party proxy implementations tend to override __instancecheck__, and whether their behavior is something we can accommodate in the type system.


Let me explain my use case. Say I have a class Square. square.set_fill(WHITE) should set the fill color of a square, while square.animate.set_fill(WHITE) should animate this action. For this purpose, the .animate property returns a Proxy[Square], and any methods called on it proxy the original square and return another Proxy[Square].
Additionally, the method that actually plays the animation should only accept an AnimationBuilder[Square], since play(sq.animate.set_fill(WHITE)) makes sense but play(sq) doesn’t.
(Here are the docs if you’re interested - note the _AnimationBuilder | Self is needed for accuracy and LSP autocomplete).
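For the record, this is roughly how such an .animate builder can be written today with __getattr__ forwarding. This is a simplified sketch based on the description above, not the library’s actual implementation; the call-recording detail is an assumption.

```python
class AnimationBuilder:
    """Records forwarded method calls instead of applying them."""

    def __init__(self, target) -> None:
        self._target = target
        self.calls = []

    def __getattr__(self, name: str):
        getattr(type(self._target), name)  # raise early if no such method

        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self  # chainable: still a builder, not a Square

        return record


class Square:
    def set_fill(self, color) -> None:
        self.fill = color

    @property
    def animate(self) -> AnimationBuilder:
        return AnimationBuilder(self)


sq = Square()
builder = sq.animate.set_fill("WHITE")
print(builder.calls)  # [('set_fill', ('WHITE',), {})]
```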

Hmm why not override __instancecheck__?

To me, this is bound to cause confusion. For example, as @Jelle mentioned, if you were validating a user-supplied value before passing it to a standard library function (to give a descriptive error), you wouldn’t want isinstance to report True for a proxy. I’m also just not a fan of this modifying runtime behavior.

I would prefer to make this a typing/LSP-only concept, although I’m open to thoughts about making it an abstract class similar to contextlib.AbstractContextManager (I don’t think there’s enough of a need for this though…).

From what I’ve seen, most proxies override __getattr__ in some shape or form (unittest.mock is one module, pandas has an IOWrapper, and celery has its own Proxy class!). I’m sure there are other examples, but I haven’t really looked.
As for implementation, right now I’m thinking of something like

class Proxy(Protocol[T]):
    def __getattr__(self, name: str) -> object: ...

The Proxy[T] would be special cased in LSPs/static type checkers to have all the dunder methods and attributes of T, in addition to any methods/attributes on the proxy itself.
If anyone has any better way of doing it that wouldn’t involve special casing, please feel free to share.

Wanting to preserve the methods/structure of a type, but not the actual type itself in the nominal/isinstance sense, feels a lot like a Protocol. Could we allow Protocol to take an argument meaning “a type that has this structure but forgets the nominal side of it”? Then you could define

class AnimationShape(Protocol[Shape]): ...

I did coincidentally discuss a proxy type as a wrapper at work today. But we opted for a different direction and instead wanted to preserve the nominal type properly. We ended up doing something like

from functools import cache

@cache
def _build_wrapper_type[T: Foo](ty: type[T]) -> type[T]:
    class _WrapperFoo(ty):
        ...  # maybe add extra methods

    return _WrapperFoo

def WrapperFoo[T: Foo](obj: T) -> T:
    wrapper_type = _build_wrapper_type(type(obj))
    return wrapper_type(obj)  # assumes ty can be built from an instance

This is also a proxy-like use case and can pass through all methods, with potential extra ones added. It differs in choosing to keep the nominal type structure. It would be nice if there were a way to describe the result type better (if the proxy adds new methods).

Maybe a better solution would be for type checkers to check that a class inheriting from Proxy[T] either has a __getattr__ or implements all the public methods/dunders of T?
That way you could write something like

class _WrapperFoo(ty, Proxy[ty]): ...

(I did notice a similar pattern in pandas with Indexes and numpy arrays, but haven’t checked if it’s relevant)

This is, in my opinion, too restrictive/opinionated. What if my class proxies __getitem__ calls but does not implement __getattr__?
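For example, a class along these lines forwards only item access and defines no __getattr__ at all, yet is clearly a proxy (a toy sketch):

```python
class ItemProxy:
    """Forwards only __getitem__; there is no __getattr__ here."""

    def __init__(self, wrapped) -> None:
        self._wrapped = wrapped

    def __getitem__(self, key):
        return self._wrapped[key]


p = ItemProxy({"a": 1})
print(p["a"])  # 1
```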

As said in this thread, we need to define what exactly a proxy is and which operations are proxied, so I would definitely explore what was proposed in this comment:

from typing import AttrSpec, Generic, assert_type


class MyClass:
    def meth(self) -> int:
        return 1

    def myclass_meth(self) -> str:
        return 'a'


A = AttrSpec('A')


class MyProxy(Generic[A]):
    def __init__(self, proxied: A) -> None:
        self.proxied = proxied

    def __getattr__(self, name: A.name) -> A.value:
        return getattr(self.proxied, name)

    def meth(self) -> bool:
        return True

    def proxy_meth(self) -> str:
        return 'a'


m = MyProxy(MyClass())

assert_type(m, MyProxy[MyClass])

assert_type(m.meth(), bool)  # overrides proxied object
assert_type(m.myclass_meth(), str)
assert_type(m.proxy_meth(), str)

There’s definitely room for improvement with this solution, but at least users are able to choose which methods are actually implemented by the proxy class.
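For contrast, the closest approximation expressible today is a __getattr__ returning Any, which type-checks but discards all member information; that is exactly the gap AttrSpec aims to fill. A sketch (AnyProxy is a hypothetical name):

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class AnyProxy(Generic[T]):
    def __init__(self, proxied: T) -> None:
        self.proxied = proxied

    def __getattr__(self, name: str) -> Any:
        # Today every forwarded attribute is typed Any, so the
        # int return type of meth() is lost statically.
        return getattr(self.proxied, name)


class MyClass:
    def meth(self) -> int:
        return 1


m = AnyProxy(MyClass())
print(m.meth())  # runs fine, but the revealed type is Any, not int
```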


I spent the weekend thinking about this, and overall I think I like this idea! I just want to clarify a little bit about how AttrSpec would/should work:

class MyProxy(Generic[A]):
    # is this right?
    def __getitem__(self, xyz: A.name) -> A.value: ...
    # or this
    def __getitem__(self, xyz: A.xyz) -> A.value: ...

    # what happens if you use A.name/A.value in non-dunder methods?
    # type error? Or do these just have to match the signature
    # of the method on A?
    def dtype(self) -> A.value: ...
    def compute_something(self, other: A.name) -> A.value: ...

I would also like to suggest the ability to put a bound on the types that can be proxied; that would allow for type errors if you did something like:

class Foo: ...

A = AttrSpec("A", bound=Foo)
class FooProxy(Generic[A]):
    def __init__(self, proxy: A) -> None:
        self.proxied = proxy

    def __getattr__(self, name: A.name) -> A.value:
        # do some stuff assuming A is a subtype of Foo
        ...

FooProxy(3)  # type error

What do you think?