More of a “thinking aloud” idea than an actual proposal: would it make sense to have an __rcontains__ magic method so that x in y falls back to x.__rcontains__(y) if y does not define __contains__ or returns NotImplemented? Potential uses cases could include:
container-like classes that (intentionaly or by omission) don’t define __contains__
containers that can determine membership only under specific conditions (e.g. for specific member types)
performance considerations (__contains__ returns a bool if it can compute it efficiently, otherwise returns NotImplemented).
How would an object determine if it’s contained in some arbitrary other object? Would this be materially different from simply iterating over the container and comparing self to each value?
I don’t think the OP intends __rcontains__ to test membership against any arbitrary object, but only one of specific types, but even if it does, it can simply return False or NotImplemented if the other object is of an unsupported type.
This may be useful if one wishes to give a custom container meaning to an established type that isn’t technically a container. One very contrived example I can think of is a Bit class that represents a bit position such that Bit(2) in 0b110 is true when an integer is conceptually thought of as a container of bits.
The OP would have a more compelling case if some real-world use cases are given though.
>>> Twice("r") in "racecar"
True
>>> Twice("r") in "pirate"
False
(this already outperforms LLM’s)
I can also imagine some similar funky API’s for regexes, path globs, etc. And I’m sure that dataclass libraries like pandas and polars could also make good use of this, and maybe SymPy too.
It doesn’t have to handle (and probably wouldn’t) handle arbitrary other objects, only selected ones. Here’s a geometric example:
class Circle:
def __init__(self, origin: Point, radius: float):
self.origin = origin
self.radius = radius
def __contains__(self, item):
if isinstance(item, Point):
# Check if point is inside circle
return (item.x - self.origin.x) ** 2 + (item.y - self.origin.y) ** 2 < self.radius ** 2
# I don't know how to handle this type
return NotImplemented
class Triangle:
def __rcontains__(self, container):
if isinstance(container, Circle):
# Check if this triangle is inside the circle
...
if isinstance(container, Rectangle):
# Check if this triangle is inside the rectangle
...
return NotImplemented
Fun example. str.__contains__'s current implementation raises TypeError when the left operand isn’t a str though, so it’ll have to be changed to return NotImplemented instead.
Your example doesn’t really have a “container” and “element”, so I would expect it’s just as easy to ask if a circle’s in a triangle as to ask if a triangle’s in a circle. Which suggests to me that there’s no real distinction between writing the code in __rcontains__ and writing it in __contains__, since either way, you have to have that function know about both types. It really looks to me like this would be much better handled at a higher level Shape class with some sort of registration system.
To be quite frank, this looks like it’s lacking in any concrete use-case, and without that, it’s not very practical to discuss as an idea.
I think the OP’s example is meant to showcase that if one writes a library of Circle and Point, another developer can add a new shape class Triangle that performs the container test without modifying the library.
Maybe?? It seems highly unlikely that it would all make sense like that, though, without actual cooperation from the Circle class. Hence the need for a form of registration - that is, the developer of Triangle can register it so the single __contains__ method can figure out which parts need to care about each other.
Okay, sure, but then the Triangle class needs to know about every single thing that it could intersect with. Seems pretty messy and arbitrary.
But hey. If someone’s ACTUALLY doing this, then I want to hear from them. It’s kinda useless to argue about what would make sense in terms of different developers building shape classes.
Let’s take Joren’s suggested use in regex as an example. I imagine that some cool regex library would allow a regex test with regex(r'\bHello\b') in 'Hello Python', which is arguably more readable than re.search(r'\bHello\b', 'Hello Python') because it conceptually makes sense for a string to contain a pattern. And yet a string is a built-in type, so __rcontains__ is what can make this happen.
class SomeContainer:
def __init__(self):
# accepts only hashable items
self._members = set()
def add(self, item):
self._members.add(item)
def __contains__(self, item):
try:
return item in self._members
except TypeError:
return NotImplemented
class Unhashable:
def __init__(self, x):
self._x = x
__hash__ = None # explicitly unhashable
def __rcontains__(self, container):
if isinstance(container, SomeContainer):
return self._x in container
return NotImplemented
if __name__ == '__main__':
c = SomeContainer()
c.add(1)
u = Unhashable(1)
print(1 in c) # True
print(u in c) # True via __rcontains__
To reiterate, I don’t have a specific problem I need to solve in mind; this is more of a “what if” suggestion for extending naturally a concept (reflected operations) that already exists for quite a few operations to an operation that doesn’t support it yet (membership check). I realize that adding a feature for the sake of completeness is not a strong argument on its own so it’s nice to brainstorm about potential concrete use cases.
Since existing __contains__ implementations do not return NotImplemented, and there is no good way to return it early in, say, list.__contains__, __rcontains__ should be called first, and __contains__ should only be called if it returns NotImplemented. This is opposite to the order of other binary operators, so it is better to chose differtent name, e.g. __in__.
I have one obscure use case in mind: mock.ANY in collection is equivalent to len(collection) > 0 for list, but it does not work for set and dict. Although you can need this very rarely, and you can write tests in other ways.
There is technical problem, For performance, we need to add a slot in the type object. It cannot be added in tp_as_sequence or tp_as_mapping groups, it should be added at the top level, taking space in every type object. It will be incompatible with non-heap types.
Is this not really just about wanting to use the operator “in” rather than simply using a normal function (or method)?
It seems like a lot of work for the debatable convenience of using an operator for one particular task, when everything else will likely still have to be handled with functions (for example, for shapes, there’s not just “in”, there’s also “overlaps”).
I suspect all real world use cases are currently just using functions.
Probably. And so far, we’ve had one suggestion (regex in str) that is indeed being done with functions. I don’t think it’s all that compelling on its own, but could be convinced otherwise if there are other strong examples.
And that’s the big problem with toy examples. They inevitably have other, usually better, explanations.
I worry about the mismatch of an object deciding it is contained by another object, but iterating that other object does not return the thing it supposedly contains. While uses can be thought of, I think they are outweighed by the confusion and frustration when ‘something in some_iterable’ is true but something is not found when iterating some_iterable.
Whilst introducing more __r<method>__ dunders to Python is definitely something I’m in favor for, I have yet to see any use cases, where a __rcontains__ would make that much sense. Sure, the example with reg eyes is quite nice, but that also cen easily be expressed in a nicer way.
I like the idea, but I don’t really see that much need for it, at least not yet. I’d love to see this thread develop though, which is why I’m somewhat ±0.
I’d love to see some more examples, perhaps then I’ll change my mind, but i think a lot of people currently have the same opinion, where they just have not (yet) seen enough convincing examples.
Out of curiosity, was there similar scrutiny and skepticism when other __r<method>__ were proposed? What are some compelling examples of __rpow__, __rdivmod__ or __rxor__?