That GitHub discussion has a variety of different understandings and assumed semantics for the Proxy type. You should clarify exactly what semantics you want and for what use case. (Also, should it have runtime behavior?)
What Guido says is still true: This feature is complex enough to require a PEP, and buy-in from multiple static type checkers. This is quite a lot of work someone has to do.
I think this could be useful, as the issue has come up a number of times, but the idea needs to be more precisely defined.
What exactly is a proxy? What operations are proxied (just attributes or also binary/other operations implemented through dunders)? How should isinstance() be handled by type checkers? If a stub says that a function accepts an int, should I be able to pass a Proxy[int]?
> What Guido says is still true: This feature is complex enough to require a PEP, and buy-in from multiple static type checkers. This is quite a lot of work someone has to do.
I figured I would create a post here and see what people thought about the semantics before putting a bunch of work into writing a PEP draft. I did try to get in touch with the mypy developers about this to see how much work it would be (and if I could help), but I assume they're busy people.
> the idea needs to be more precisely defined.
Good point! My view is that a proxy is a Python object that proxies attribute access (including methods and dunder methods) to another object. Currently, I'm imagining a usage something like this:
```python
class MyProxy(Proxy[SomeClass]):
    def some_method(self) -> None: ...
```
An instance of MyProxy should still have access to all the attributes/methods of SomeClass, in addition to the methods/attributes defined on itself. In case of a conflict, the method/attribute defined on the proxy itself should be the one used for type checking. The Proxy generic should have no effect at runtime, so the following assertion should still fail (and type checkers should know that it does):
```python
proxy_a = Proxy[A]()
assert isinstance(proxy_a, A)  # fails
```
This way Proxy stays a distinct type from A, and type checkers don't change the type of proxy_a into A, which would lose all the methods on the proxy that are not on the proxied type.
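To make the conflict rule concrete, here is a hypothetical example (SomeClass and the method names are illustrative, and the reveal_type output is what I'd want a checker to produce, not what any checker does today):

```python
class SomeClass:
    def name(self) -> str: ...
    def size(self) -> int: ...

class MyProxy(Proxy[SomeClass]):
    def name(self) -> bytes: ...  # conflicts with SomeClass.name

p = MyProxy()
reveal_type(p.name())  # bytes: the proxy's own definition wins
reveal_type(p.size())  # int: falls back to the proxied SomeClass
```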
> If a stub says that a function accepts an int, should I be able to pass a Proxy[int]?
In an ideal world, I would say yes, due to the duck-typing nature of the current type system. However, recognizing that it's probably a lot of work, I'm open to thoughts on this matter.
If you’re adding a new type system feature, you get to define the ideal world. But I’m not sure what the right answer is here.
The current type system is to a large extent nominal, not based on duck-typing. If you pass a Proxy[int] to a standard library function that accepts an int, there’s a good chance it will break. On the other hand, perhaps a Proxy[T] type would be a lot less useful if it wasn’t made assignable to T.
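The breakage is easy to demonstrate today with unittest.mock, whose spec'd mocks behave a lot like proxies here: a plain Mock with a spec fakes its __class__ for isinstance(), but doesn't actually implement float's interface at the C level:

```python
import math
from unittest.mock import Mock

m = Mock(spec=float)
print(isinstance(m, float))  # True: the spec makes __class__ report float
math.exp(m)                  # TypeError: must be real number, not Mock
```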
After some thinking, I don’t think it makes sense for a Proxy[T] to be allowed wherever T is. I would want mypy (or whatever type checker I’m using at the moment) to warn me of something like this:
```python
def foo(x: int):
    if not isinstance(x, int):
        raise TypeError(...)

foo(Proxy[int]())  # should be an error
```
If some code is certain that a Proxy[T] would work, it could always do something like:
```python
def foo(x: int):
    ...

p = Proxy[int]()
foo(cast(int, p))
```
Back to the point: disallowing Proxy[T] where T is allowed could lower the usefulness of a Proxy in the short term.
However, in the long term (if it sees widespread adoption), it could also serve as a distinction between "checked" code (code like foo above, where the input is validated) and unchecked code, where a Proxy would be allowed.
Hmm, why not override __instancecheck__? You could give Proxy a custom __instancecheck__ that would lead to isinstance(Proxy[int](), int) being True. It's unclear to me why a proxy that forwards all the methods some type X has shouldn't also be treated like X. We similarly have __instancecheck__ defined for some of the collections.abc classes.
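For what it's worth, here is a runnable sketch of how a runtime proxy can satisfy isinstance() today. (Note that __instancecheck__ on Proxy itself would only affect isinstance(x, Proxy); to make isinstance(proxy, int) succeed, the usual trick, used by libraries like wrapt and unittest.mock, is a __class__ property, which type.__instancecheck__ consults as a fallback. The class name is illustrative.)

```python
class RuntimeProxy:
    """Minimal attribute-forwarding proxy."""

    def __init__(self, wrapped: object) -> None:
        object.__setattr__(self, "_wrapped", wrapped)

    @property
    def __class__(self):  # makes isinstance() report the wrapped class
        return type(object.__getattribute__(self, "_wrapped"))

    def __getattr__(self, name: str):
        return getattr(object.__getattribute__(self, "_wrapped"), name)

p = RuntimeProxy(42)
print(isinstance(p, int))  # True: isinstance falls back to p.__class__
print(type(p).__name__)    # 'RuntimeProxy': the real type is unchanged
```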
I think several concrete examples (even if toy/simplified) are needed. What is your goal in having a proxy as opposed to a subtype of the class?
Is the proposal going to be to create a new Proxy type in the standard library (possibly with special typing support), or to add some typing-only concept of a Proxy that can describe different proxies that exist in popular libraries? If it’s the former, the proposal will have to explain why this concept is so important and generally useful that it needs to be in the standard library. If the latter, the proposal will have to explain what different existing proxies have in common and how a single new typing concept can cover a substantial subset of existing use cases.
I haven't personally felt a need for some sort of Proxy type in the type system, but I've seen that the concept has come up repeatedly as something people have been asking for. However, it's not clear to me that they actually all want the same thing. It's easy for many different users to say they need proxies in some abstract sense, but in practice it may turn out that they each want something different out of it, and it's impossible to design a type system feature that covers all or even most concrete use cases.
As for __instancecheck__, if we go with the approach of adding a new Proxy class to the standard library, we could add such a method. However, overriding __instancecheck__ is a dangerous mechanism that can lead to confusion, and it doesn't solve cases where the object is passed to a C function that really only accepts an int, not something that pretends to be an int. If we go with the other approach of adding a type that provides a common structure for third-party types, we should look at whether existing third-party proxy implementations tend to override __instancecheck__, and whether their behavior is something we can accommodate in the type system.
Let me explain my use case. Say I have a class Square. square.set_fill(WHITE) should set the fill color of a square, while square.animate.set_fill(WHITE) should animate this action. For this purpose, the .animate property returns a Proxy[Square], and any methods called on it proxy the original square and return another Proxy[Square].
Additionally, the method that actually plays the animation should only accept an AnimationBuilder[Square], since play(sq.animate.set_fill(WHITE)) makes sense but play(sq) doesn’t.
(Here are the docs if you’re interested - note the _AnimationBuilder | Self is needed for accuracy and LSP autocomplete).
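To make that concrete, here is a rough sketch of how the .animate pattern could look with the proposed Proxy (simplified stand-in names, not the library's actual API; the chaining return type is itself one of the details a PEP would need to pin down):

```python
WHITE = "#FFFFFF"

class Animation: ...

class Square:
    def set_fill(self, color: str) -> "Square": ...

    @property
    def animate(self) -> "AnimationBuilder": ...

class AnimationBuilder(Proxy[Square]):
    # For chaining, proxied calls would need to return another
    # AnimationBuilder rather than Square, as the proxied signature suggests.
    def build(self) -> Animation: ...

def play(anim: AnimationBuilder) -> None: ...

sq = Square()
play(sq.animate.set_fill(WHITE))  # OK: set_fill is proxied and chains
play(sq)                          # type error: Square is not an AnimationBuilder
```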
> Hmm, why not override __instancecheck__?
To me, this is bound to cause confusion. For example, as @Jelle mentioned, if you were going to validate a user-provided value before passing it to a standard library function (and give a descriptive error), you wouldn't want the isinstance check to pass for a proxy. I'm also just not a fan of this modifying runtime behavior.
I would prefer to make this a typing/LSP-only concept, although I’m open to thoughts about making it an abstract class similar to contextlib.AbstractContextManager (I don’t think there’s enough of a need for this though…).
From what I’ve seen, most proxies override __getattr__ in some shape or form (unittest.mock is one module, pandas has an IOWrapper, and celery has its own Proxy class!). I’m sure there are other examples, but I haven’t really looked.
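For reference, the common denominator of these implementations is roughly the following shape (a simplified sketch in the spirit of celery's lazy Proxy, not any library's actual code):

```python
from typing import Any, Callable

class LazyProxy:
    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._obj: Any = None

    def __getattr__(self, name: str) -> Any:
        # Only called for attributes not found normally, so the _factory
        # and _obj lookups above don't recurse through here.
        if self._obj is None:
            self._obj = self._factory()
        return getattr(self._obj, name)

words = LazyProxy(lambda: "hello world")
print(words.upper())  # 'HELLO WORLD': resolved on first attribute access
```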
As for the implementation, right now I'm thinking of something like:

```python
class Proxy(Protocol[T]):
    def __getattr__(self, name: str) -> object: ...
```
The Proxy[T] would be special cased in LSPs/static type checkers to have all the dunder methods and attributes of T, in addition to any methods/attributes on the proxy itself.
If anyone has any better way of doing it that wouldn’t involve special casing, please feel free to share.
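For example, under that special casing a checker would be expected to produce something like the following (hypothetical reveal_type output; none of this is implemented anywhere today):

```python
p = Proxy[int]()             # as in the earlier examples
reveal_type(p + 1)           # int: int.__add__ is synthesized onto the proxy
reveal_type(p.bit_length())  # int: plain attribute access is proxied too
reveal_type(p)               # Proxy[int], not int: the proxy keeps its own type
```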
Wanting to preserve the methods/structure of a type, but not the actual type itself in the nominal/isinstance sense, feels a lot like a Protocol. Could we allow Protocol to take an argument meaning "a type that has this structure but forgets the nominal side of it"? Then you could define:
```python
class AnimationShape(Protocol[Shape]):
    ...
```
I did coincidentally discuss a proxy type as a wrapper at work today. But we opted for a different direction and instead wanted to preserve the nominal type properly. We ended up doing something like the following sketch.
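A sketch of the pattern (illustrative names; dynamically subclassing the wrapped type so the nominal type is preserved):

```python
from typing import TypeVar

T = TypeVar("T")

def make_wrapper(ty: type[T]) -> type[T]:
    # Dynamically subclass the wrapped type: instances keep the nominal type
    # (isinstance(x, ty) holds), with extra methods layered on top.
    class _Wrapper(ty):  # type: ignore[valid-type, misc]
        def extra_method(self) -> None: ...

    return _Wrapper
```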
This is also a proxy-like use case and can pass through all methods, with potential extra ones added. It differs in choosing to keep the nominal type. It would be nice if there were a way to describe the result type better (if the proxy adds new methods).
Maybe a better solution would be something like type checkers verifying that a class inheriting from Proxy[T] either has a __getattr__ or implements all the public methods/dunders of T?
That way you could write something like:

```python
class _WrapperFoo(ty, Proxy[ty]): ...
```
(I did notice a similar pattern in pandas with Indexes and numpy arrays, but haven’t checked if it’s relevant)
This is in my opinion too restrictive/opinionated. What if my class proxies the __getitem__ calls, but does not implement __getattr__?
As said in this thread, if we need to define what exactly a proxy is – what operations are proxied – I would definitely explore what was proposed in this comment:
There’s definitely room for improvement with this solution, but at least users are able to choose what method is actually implemented by the proxy class.
I spent the weekend thinking about this, and overall I think I like this idea! I just want to clarify a little bit about how AttrSpec would/should work:
```python
class MyProxy(Generic[A]):
    # is this right?
    def __getitem__(self, xyz: A.name) -> A.value: ...
    # or this?
    def __getitem__(self, xyz: A.xyz) -> A.value: ...

    # what happens if you use A.name/A.value in non-dunder methods?
    # a type error? Or do these just have to match the signature
    # of the method on A?
    def dtype(self) -> A.value: ...
    def compute_something(self, other: A.name) -> A.value: ...
```
I would also like to suggest the ability to put a bound on the type of objects that can be proxied; that would allow for type errors if you did something like:
```python
class Foo: ...

A = AttrSpec("A", bound=Foo)

class FooProxy(Generic[A]):
    def __init__(self, proxy: A) -> None:
        self.proxied = proxy

    def __getattr__(self, name: A.name) -> A.value:
        # do some stuff assuming A is a subtype of Foo
        ...

FooProxy(3)  # type error: int is not a subtype of Foo
```
For this we will need to bridge the worlds of nominal and structural typing.
The proposed __getattr__ might work in some cases, but it won't be able to bear the weight of everything that could cross. For example, it wouldn't work in the case of descriptors, or if there's a custom __getattribute__ involved. Then there's also the fact that several dunders aren't routed through __getattr__ or __getattribute__ at all.
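The last point is easy to demonstrate at runtime: implicit special-method lookup goes through the type, bypassing instance-level __getattr__ entirely (a runnable illustration with made-up names):

```python
class Wrapped:
    def __len__(self) -> int:
        return 3

class P:
    def __init__(self, obj: object) -> None:
        self._obj = obj

    def __getattr__(self, name: str):
        return getattr(self._obj, name)

p = P(Wrapped())
print(p.__len__())  # 3: explicit attribute access is forwarded
print(len(p))       # TypeError: len() looks up __len__ on type(p), not the instance
```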
I've been thinking about how to build this bridge for a while now, and here's what I came up with: we explicitly enforce the assumption that something with __class__: type[T] is a subtype of T. So type checkers should narrow if x.__class__ is Spam to x: Spam. With that, the Proxy type could be written as:
```python
@final  # prevent nominal use to side-step the LSP
class Proxy[T](Protocol):  # T is covariant
    @property
    def __class__(self, /) -> type[T]: ...
```
It’s probably safer to make it invariant, though:
```python
@final  # prevent nominal use to side-step the LSP
class Proxy[T](Protocol):  # T is invariant
    @property
    def __class__(self, /) -> type[T]: ...
    @__class__.setter
    def __class__(self, t: type[T], /) -> None: ...
```
Some might recognize the latter as the optype.Just type (docs) that I’ve (successfully) been using to work around the int/float/complex promotion rules by writing e.g. Just[float], which rejects int but accepts float. With this explicit __class__ assumption, something like def f(x: Just[float]) -> float: return x will no longer be reported as an error.
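Spelled out as code, the workaround looks like this (assuming the invariant definition above, or optype's Just, is accepted by the checker as described):

```python
def f(x: Just[float]) -> float:
    return x  # OK under the explicit __class__ assumption

f(1.0)   # accepted: float.__class__ is exactly type[float]
f(1)     # rejected: int is not Just[float], despite the int-to-float promotion
f(True)  # rejected for the same reason
```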
As far as I can tell, there are no type-safety issues with the invariant Proxy (or Just) type. I’m not sure about the covariant one, but it might also be safe (enough).
I'm not sure if this would require a PEP or if a spec change would suffice here.
The definition there can't fully sidestep the LSP issue. All types are subtypes of object, so the definition of __class__ there is a problem under the current definitions, as even protocols have to be compatible with object's definitions.
I'm slowly[1] making progress on a better set of type definitions and specification language for type and object, as well as "acceptable types" for all methods and attributes defined by Python's data model, but there are some ugly cases where the subtyping relation alone won't be enough without forbidding certain things. A prior thread on Hashability and Equality covers some of the more obvious issues, but accurately modeling __class__ within the type system is a particularly interesting issue as well.
As far as I'm concerned, that definition currently has undefined behavior, because the protocol requires ignoring a conflict with the current definition of object.__class__, and it could cease working at any time, or work in one type checker while failing in another.
[1] No ETA; this is something I'm working on as I have the energy and time. It's not a pressing concern or a blocker for anything at my day job, etc.
If a Protocol can only be used for structural typing, then the usual nominal type-safety concerns are no longer an issue. I can't think of any example where a @final Protocol (that has no other Protocol as a base class) would cause type-safety issues this way.
I'd like to think of such purely structural protocols as propositions (in terms of assignability): logical statements describing their inhabitants in terms of their structure. For instance, for the covariant Proxy[T] that would be "some value x such that x.__class__: type[T]".
That sounds like a good idea, nice. And yeah, I also noticed that at the moment there are some subtyping issues related to descriptors and the reflected arithmetic dunder methods, which are far from trivial. Even seemingly simple things like self don't fit well with the traditional subtyping rules (i.e. the LSP).
Technically speaking, yes. But in this case, I can't think of any other way for type checkers to behave than how they currently do. Some things just "make sense" because there are no obvious alternatives, even if there's no typing spec that explicitly requires it. I at least can't think of any reason why type checkers would want to suddenly start treating it differently. But perhaps I'm being naive here; I guess time will tell.