The post I quoted talks about user defined subclasses of str. It is with reference to such classes that “you” (a user, authoring such a subclass) have complete control.
Sure, but I would say that any addition to the data model fits user defined types. I don’t think this is really an argument for rcontains.
This thread is already 60 posts deep, in only a week. Having read most of it, and skimmed what I didn’t read carefully, I remain genuinely confused about what problem is being solved here.
“Theoretically I could define a type which makes use of this.” isn’t a use case.
The problem is simply that the operator should be generically two-way overloadable, like most other operators, so that users are free to write an expression with it wherever it makes sense and wherever readability improves with the operator rather than without it.
I feel that’s going in circles though, and maybe this is just a sign that I should let this thread run and keep my confusion to myself. If I still don’t get it after a couple more messages, I’ll probably drop out since I’m presumably not helping.
I get that ideas like BitString("101") in 21 are “neat”.[1] But where is the problem domain where this new feature would be useful?
Let me put a different example feature on the table, which maybe will make the gap in my understanding clearer.
I “propose” a new keyword and pair of dunders, has and __has__/__rhas__. This allows us to write expressions we didn’t used to be able to write, of the form x has y.
These dunders are undefined for all built-in types.
has can be made to act like in if you want, and it’s free from all historical behavioral baggage.
Why rcontains? Why not has?
Not meant pejoratively, but also intentionally distinct from stronger descriptors like “powerful” or “expressive”. ↩︎
We don’t necessarily have to figure it out right now. Keep the possibilities open and the use cases will come.
Who would’ve thought that / may be used for path concatenation?
Who would’ve thought that @ may denote a tag that annotates a type?
Who would’ve thought that <<, >> and - may be used to draw connectors in a diagram?
Who would’ve thought that | may be used to pipe a function result as input to the next function in a pipeline?
And yet they all make sense in their respective domains and make code more visually comprehensible, even though these new meanings were never anticipated when the operators were made overloadable.
has would not be free from all historical baggage because of the presence of in, since x has y is supposed to be semantically equivalent to y in x. You’d have a problem of consistency if x.__has__(y) behaves differently from x.__contains__(y).
IMO not one of those is justification for making the operators overloadable. They are all what happens AFTER you make something that’s worthwhile for other reasons. Overloading division lets you implement other numeric types, like fractions.Fraction. Overloading matrix multiplication… well, that one’s an extreme example, since there is nothing in the core that uses matmul, and it was specifically added, after something like a decade of asking, for the benefit of numpy. Overloading bit shift and subtraction, like division, lets you create your own numeric types. And overloading bitwise Or allows you to create flag-like classes.
All of those use-cases are FAR stronger than the “cute” ones, which wouldn’t be enough to justify the infrastructure. They’re nice when we get them “for free”, but that’s all.
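That Fraction point can be seen concretely: the reflected dunder is what lets a numeric type sit on the right-hand side of / with a plain int on the left. A quick illustration:

```python
from fractions import Fraction

# int.__truediv__ doesn't know about Fraction, so it returns
# NotImplemented, and Python falls back to Fraction.__rtruediv__.
assert (2).__truediv__(Fraction(3)) is NotImplemented

result = 2 / Fraction(3)
assert result == Fraction(2, 3) and isinstance(result, Fraction)
```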
I don’t get why you feel the need to dismiss perfectly sensible reinterpretations of operators as “cute” when they very positively help convey object relations much more clearly than function/method calls can.
If you feel that only first-party usage counts as justification, Serhiy already gave one: mock.ANY in collection where collection isn’t a list. Also, besides the regex example that you found agreeable earlier, Joren also suggested a first-party use in path globbing, such as PurePath('a/b.py') in '*.py'.
Again, keep the protocol open and more justifying use cases will come.
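For what it's worth, the mock.ANY case can be sketched. With a plain list, the iteration-based containment compares elements with ==, which ANY intercepts; a container with its own __contains__ never consults ANY at all (the Evens class below is a made-up example of such a container):

```python
from unittest import mock

# For a list, `in` iterates and compares with ==; ANY.__eq__
# always returns True, so the membership test passes.
assert mock.ANY in [1, 2, 3]

# A hypothetical container whose __contains__ never invokes the
# left operand's __eq__ defeats ANY entirely:
class Evens:
    def __contains__(self, item):
        return isinstance(item, int) and item % 2 == 0

assert 4 in Evens()
assert mock.ANY not in Evens()  # ANY gets no say without __rcontains__
```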
And overloading bitwise Or allows you to create flag-like classes.
Are sets a flag-like class? If not, is using | for set union a “cute” use case?
All of those use-cases are FAR stronger than the “cute” ones,
Are they? The use cases you consider FAR stronger are all numerical in nature. Python is a general programming language, not a language designed to make Sympy or Numpy possible and intuitive. If matrix multiplication and divmod are important enough to add support for in the core language, why not derivatives and integrals?
To take just the / example, I would argue that using it for path concatenation has bigger impact in the overall language ecosystem than being able to overload it for division of some obscure/exotic number types such as Fractions or Quaternions. Dismissing such use cases as just a “cute” byproduct doesn’t do them justice.
Nope; the “cute” one is using bitwise Or as a “pipe” operator. Using it for set union is exactly what it’s designed for; using it to indicate shell-like operations isn’t. The shell-like syntax wouldn’t be enough on its own to justify the operator - but since we have it, all those cute options become available too. I said “for free” because all of the underlying infrastructure is already there (in order to allow sets and set-like objects), and then it’s there to be used for piping data from one function into another.
Yes, they are numerical. Operator overloading is spectacular for numerical types; in fact, it is so extensively used that it’s easy to forget that this is actually overloading at work:
a = 1 + 2
b = 1.5 + 2.25
Yeah, that’s operator overloading, right there. And it’s so completely normal to everyone that it’s easy to forget how valuable it is.
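The dispatch is visible if you call the dunders directly (a quick illustration of the existing protocol):

```python
# Mixed-type addition goes through the same protocol user types use:
# int.__add__ declines, then float's reflected add takes over.
assert (1).__add__(2.25) is NotImplemented  # int doesn't handle float
assert (2.25).__radd__(1) == 3.25           # float.__radd__ answers
assert 1 + 2.25 == 3.25
```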
The division operator exists primarily between numbers, including custom numeric types. Being able to use it between paths is a great feature but it would be unlikely to get added just for that. If you disagree, point to any operator that was added first for one of the usages that I described as “cute”.
Note that “cute” does not mean “valueless”. They are absolutely of value! I think it’s awesome that we can have Path objects that overload the division operator. And since __contains__ exists, we can use it for anything we need. However, in order to add something new, there needs to be a compelling use-case. Oh and, I said earlier “something like a decade of asking”; I just checked and it was actually first proposed in 1995 before finally being implemented in 2014, so that’s nearly TWO decades. That’s how hard it is to get something implemented without a use-case in the core data types or standard library.
Maybe; but I would say that there is FAR bigger impact from the fact that you can simply add integers or floats without needing to differentiate.
So, what’s the “killer use-case” that makes __rcontains__ worth adding? Not “if we had it, we could use it for this”, but something that’s powerful and compelling? It’s all very well to say “people will find uses for it”, but no change is free, and the proposals so far aren’t all that strong. The globbing case, for example, could be just as well done with a Glob class that defines containment; then it can recognize "a/b.py" in Glob("*.py") without requiring that it be a Path object.
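A sketch of that Glob idea, using fnmatch (the class name and behavior are hypothetical, not an actual stdlib API):

```python
from fnmatch import fnmatch

class Glob:
    """A pattern object whose containment test matches paths
    against a shell-style glob (illustrative sketch only)."""
    def __init__(self, pattern):
        self.pattern = pattern
    def __contains__(self, path):
        # fnmatch's '*' also matches path separators
        return fnmatch(str(path), self.pattern)

assert "a/b.py" in Glob("*.py")
assert "a/b.txt" not in Glob("*.py")
```

No __rcontains__ needed: the pattern object is simply the right-hand operand.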
The math symbol for set union is ∪. Overriding the bitwise operator for it (especially when there is already a set.union method) is as “cute” or normal as using it as a pipe operator (or less so, not sure if I have seen it being used for sets outside Python).
These are builtin types; the language could certainly support operations between integers and floats without introducing __r<op>__. These methods are intended for 3rd party types.
I’m not claiming there is necessarily one (but I would say the same for __rdivmod__ and many other features that made it to the language). For me it’s more about the inconsistency/asymmetry with every other operation than a “killer use case”. It feels like an alphabet consisting of all lowercase letters from a to z and all uppercase letters from A to Y, omitting Z (because “there’s no killer use case for Z”, “you can just do "z".upper() if you need it”, etc).
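For comparison, __rdivmod__ already lets a user-defined type sit on the right of divmod today. A toy example (the Hours class is made up for illustration):

```python
class Hours:
    """Toy type: divmod(minutes, Hours()) splits minutes into
    whole hours and a remainder (hypothetical example)."""
    def __rdivmod__(self, other):
        return divmod(other, 60)

# int.__divmod__ returns NotImplemented for an unknown type,
# so Python falls back to Hours.__rdivmod__.
assert divmod(150, Hours()) == (2, 30)
```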
Sure, and @decorator is just syntax sugar for f = decorator(f). Most things can be reduced to simpler constructs, all the way down to a handful of primitives.
If one reads the docs, one can find out a lot, for example by checking out 6.10.2. Membership test operations in the language reference, which starts with the following sentence introducing the semantics.
The operators in and not in test for membership.
How is this realized? The doc states
For user-defined classes which define the __contains__() method, x in y returns True if y.__contains__(x) returns a true value, and False otherwise.
Ok, we have discussed that. But what happens if there is no __contains__() method? Obviously there is no __rcontains__ that could be called. But this does not stop the interpreter:
For user-defined classes which do not define __contains__() but do define __iter__(), x in y is True if some value z, for which the expression x is z or x == z is true, is produced while iterating over y. If an exception is raised during the iteration, it is as if in raised that exception.
Lastly, the old-style iteration protocol is tried: if a class defines __getitem__(), x in y is True if and only if there is a non-negative integer index i such that x is y[i] or x == y[i], and no lower integer index raises the IndexError exception. (If any other exception is raised, it is as if in raised that exception).
Interesting. So, there is a fallback. How does this interact with the proposal? The proposal does not say.
I checked the whole thread. Nobody mentioned that the in operator is different from the arithmetic operators, since it also has fallbacks.
What does the fallback do? It iterates over all elements and checks whether the needle can be found in the haystack. Isn’t that exactly what I said the semantics of __contains__ should be?
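The two documented fallbacks are easy to demonstrate (Bag and Seq are throwaway example classes):

```python
class Bag:
    """Defines __iter__ only; `in` falls back to iteration."""
    def __init__(self, *items):
        self.items = items
    def __iter__(self):
        return iter(self.items)

class Seq:
    """Defines __getitem__ only; the old-style indexing protocol applies."""
    def __getitem__(self, i):
        if i < 3:
            return i * 10
        raise IndexError(i)

assert 2 in Bag(1, 2, 3)   # __iter__ fallback
assert 20 in Seq()         # __getitem__ fallback (items 0, 10, 20)
assert 99 not in Seq()     # IndexError at i == 3 terminates the search
```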
in is one of several operators (along with dot lookup, evaluation in boolean context, etc.) that do more than just invoke a dunder method. Membership (in) and truthiness (boolean context) are concepts that are useful in the majority of cases, hence the default fallbacks that try to implement sensible semantics; but they ultimately depend on the desired semantics, hence the dunders are called first, before any of the fallbacks are tried.
What the change means is that any class (sequence or otherwise iterable) that defines __iter__ or __getitem__ and relies on them to provide a in b functionality (instead of explicitly defining __contains__) will now need to entertain the possibility of an intervening __rcontains__. How many classes and use cases would be affected I am not sure, and it will be hard to measure, because these sequence or iterable classes are probably generic. Examples include custom list-like implementations, for which the existing built-in __iter__/__getitem__ O(n) membership fallback is otherwise perfectly adequate.
The existing fallback logic can be thought of as the last part of any __contains__, so it doesn’t really complicate the semantics of __rcontains__, which would become a fallback after all of the existing behaviors.
So the new behavior of the in operator should, with the existing behaviors in the try block, roughly look like:
def in_operator(left, right):
    try:
        if (contains := getattr(right, '__contains__', None)) is not None:
            return contains(left)
        if hasattr(right, '__iter__'):
            return any(left is i or left == i for i in right)
        # (the old-style __getitem__ fallback is elided here for brevity)
        raise TypeError
    except TypeError:
        if ((rcontains := getattr(left, '__rcontains__', None)) is not None
                and (result := rcontains(right)) is not NotImplemented):
            return result
        raise
gsakkis wrote several times that they do not have a compelling use case for __rcontains__ and propose it only for consistency.
Your proposed code is not consistent with the other __r<op>__ methods, since it differs in two ways:
__rcontains__ would not be the second, but the fourth member to be considered;
__rcontains__ would be called if __contains__ or __iter__ (or __getitem__) throws TypeError, whereas for all other operators, __r<op>__ is called if __<op>__ returns NotImplemented.
It could be made consistent by making the current fallback the default object.__rcontains__:
import itertools

def in_(left, right):
    """Implements `left in right` logic."""
    if hasattr(right, "__contains__"):
        contains = right.__contains__(left)
        if contains is not NotImplemented:
            return bool(contains)
    if getattr(left, "__rcontains__", None):
        is_contained = left.__rcontains__(right)
        if is_contained is not NotImplemented:
            return bool(is_contained)
    raise TypeError(f"argument of type '{type(right).__name__}' is not iterable")

class object:
    def __rcontains__(self, container):
        if hasattr(container, "__iter__"):
            return any(self is item or self == item for item in container)
        if hasattr(container, "__getitem__"):
            for i in itertools.count():
                try:
                    item = container[i]
                except IndexError:
                    return False
                if self is item or self == item:
                    return True
        return NotImplemented
By making a user-provided __rcontains__ the second member to be considered, you would preserve this behavior for some containers (those that define __contains__), while allowing it to break for other containers (those that currently fall back to __iter__ or __getitem__).
The consistency of reflected operators is really about making a binary operator try an operation based on methods of one operand before falling back to methods of the other operand. So the consistency we’re aiming for here is for x in y to consult y first before falling back to consulting x. Whether the fallback is triggered by returning NotImplemented or by raising TypeError is really an implementation detail. Since TypeError is currently raised by an unsupported x in y, I’m suggesting capturing TypeError instead to maintain backwards compatibility.
I should say that it’s going to be a documented detail like the NotImplemented sentinel. It can be documented that the reflected operation for the in operator is triggered by TypeError instead for historic reasons.
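Concretely, this is the current failure mode that would serve as the trigger:

```python
# Today, a membership test against a non-container raises TypeError;
# the suggestion is to treat that (rather than NotImplemented) as the
# signal to try the left operand's __rcontains__.
try:
    "x" in 42
except TypeError as exc:
    assert "is not iterable" in str(exc)
```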
By definition the latter are not Containers. Whether a historic fallback for iterable/indexable non-containers should have more precedence than new types that explicitly opt in to provide alternative fallback via __rcontains__ is debatable.