Reflected __contains__

Ok there’s a missing type(left) is not type(right) and ... before the getattr. This wasn’t intended to be a PEP-ready reference implementation but thanks for the pedantic correction.

Thanks for clarifying. My comment was not pedantic, but essential! But, ok, not everyone sees that immediately. With your changes, you would break the following, currently working code:

>>> class MyStack:
...     def __init__(self):
...         self._data = []
...     def append(self, v):
...         self._data.append(v)
...     def __getitem__(self, i):
...         return self._data[i]
...         
>>> 
>>> m = MyStack()
>>> m.append(1)
>>> 1 in m # Would still be valid
True
>>> m in m # If calling __getitem__ was implemented as a part of object.__rcontains__, this would not work
False

You claim that calling __iter__ and __getitem__ for the in operator was a historic fallback. That is untrue. The docs clearly state that calling __getitem__ was old-style. The docs do not state this for __iter__.
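For reference, the lookup order being argued about can be observed in current Python. This is a minimal sketch with three made-up classes (the names are illustrative, not from this thread): one defines only __contains__, one only __iter__, and one only __getitem__.

```python
class WithContains:
    def __contains__(self, item):
        # Tried first: the container decides directly.
        return item == "c"


class WithIter:
    def __iter__(self):
        # Tried second: `in` iterates and compares each element.
        return iter(["i"])


class WithGetitem:
    def __getitem__(self, index):
        # Tried last: indexing from 0 until IndexError is raised.
        return ["g"][index]


results = {
    "contains": "c" in WithContains(),
    "iter": "i" in WithIter(),
    "getitem": "g" in WithGetitem(),
    "getitem_miss": "x" in WithGetitem(),
}
print(results)
```

All three classes support `in` today, even though only the first one defines __contains__.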

I see, interesting. Well the specific example (m in m) is pathological, I’d go as far as suggesting it should be special-cased in the in implementation to always return False without calling __contains__: if left is right or left == right: return False. If a custom type evaluates m in m as True, it deserves to be broken; I’d consider the change as a bugfix :wink:

A more sane/realistic example would be something like [2] in [[1], [2], [3]] (=True) but for custom iterables without __contains__; these would indeed break without falling back to object.__rcontains__. How about this change?

     if getattr(left, "__rcontains__", None):
-        is_contained = left.__rcontains__(right)
+        if type(left) is type(right):
+            is_contained = object.__rcontains__(left, right)
+        else:
+            is_contained = left.__rcontains__(right)
         if is_contained is not NotImplemented:
             return bool(is_contained)

left in right doesn’t call type(left).__rcontains__ when type(left) is type(right) (even if overridden); instead it calls object.__rcontains__ which is the current fallback behavior.
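To make the proposed dispatch easier to experiment with, here is a rough pure-Python emulation. Everything in it is an assumption for illustration: __rcontains__ does not exist in Python, default_rcontains is a stand-in for the hypothetical object.__rcontains__, contains() stands in for the `in` operator, and Box and Point are made-up types. Treating a NotImplemented result from __contains__ as "try the next step" is also an assumption of this sketch.

```python
def default_rcontains(left, right):
    """Hypothetical stand-in for object.__rcontains__: today's iteration fallback."""
    try:
        iterator = iter(right)  # handles __iter__ and old-style __getitem__
    except TypeError:
        return NotImplemented
    return any(item is left or item == left for item in iterator)


def contains(left, right):
    """Emulate the proposed dispatch for `left in right` (a sketch, not CPython)."""
    # Step 1: the container's own __contains__ is asked first.
    direct = getattr(type(right), "__contains__", None)
    if direct is not None:
        is_contained = direct(right, left)
        if is_contained is not NotImplemented:
            return bool(is_contained)
    # Step 2: the reflected hook; same-type operands skip any override.
    if type(left) is type(right):
        is_contained = default_rcontains(left, right)
    else:
        reflected = getattr(type(left), "__rcontains__", default_rcontains)
        is_contained = reflected(left, right)
    if is_contained is not NotImplemented:
        return bool(is_contained)
    raise TypeError(f"{type(right).__name__!r} cannot be tested for containment")


class MyStack:  # the class from earlier in the thread
    def __init__(self):
        self._data = []

    def append(self, v):
        self._data.append(v)

    def __getitem__(self, i):
        return self._data[i]


class Box:
    """Hypothetical type whose author never thought of it as a container."""

    def __init__(self, low, high):
        self.low, self.high = low, high


class Point:
    """Hypothetical type that knows how to test itself against a Box."""

    def __init__(self, x):
        self.x = x

    def __rcontains__(self, container):
        if isinstance(container, Box):
            return container.low <= self.x <= container.high
        # Mirrors calling super().__rcontains__(container) in the proposal.
        return default_rcontains(self, container)


m = MyStack()
m.append(1)
print(contains(1, m))                      # True, via the __getitem__ fallback
print(contains(m, m))                      # False, same as `m in m` today
print(contains([2], [[1], [2], [3]]))      # True, via list.__contains__
print(contains(Point(1), Box(0, 2)))       # True, via Point.__rcontains__
```

Under these assumptions the MyStack example keeps its current behavior, and only operands of different types ever reach a user-defined __rcontains__.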

>>> a = []
>>> a.append(a)
>>> a in a
True

It’s pathological, agreed, but x in x is a legitimate way of testing for pathological data.

>>> x = []
>>> x.append(x)
>>> x in x
True

If you want to detect structures with loops like this, x in x is a legitimate way of doing so - it may not be the best way, but it is currently valid, working code.

Python’s backward compatibility policy states that breaking changes must follow the established deprecation process. It doesn’t prohibit breaking changes, but it requires that “incompatibilities should have a large benefit to breakage ratio, and the incompatibility should be easy to resolve in affected code”. So you need to demonstrate that the benefits of your proposal significantly outweigh the amount of code that will be broken, and show how affected code can be easily fixed. And you need to clearly understand the timescale and process involved in the deprecation policy, and incorporate it into your proposal.

Comments like “If a custom type evaluates m in m as True, it deserves to be broken” show a fundamental misunderstanding of Python’s backward compatibility policy.


Of course one could do that. But this would again special case in compared to other operators.

But this would send us back to square 1:

TIL Python has an answer to Russell’s paradox :grinning_face:

Regardless, this was just a side note / flippant remark; the proposed change would handle the pathological case too.

It would be no more (or less) special than it is today.

Wasn’t making in consistent with all the other operators your main motivation?

It would be consistent to the extent allowed by backwards compatibility. Also the inconsistency is inconsequential:

  • it’s only about the <container_of_type_t> in <container_of_same_type_t> edge case
  • it makes in more powerful than other operators (that raise TypeError in this case), not less.

Summary table:

                             left + right               left in right
type(left) != type(right)    left.__add__(right)        right.__contains__(left)
                             right.__radd__(left)       left.__rcontains__(right)
                             TypeError                  TypeError
type(left) == type(right)    left.__add__(right)        right.__contains__(left)
                                                        object.__rcontains__(left, right)
                             TypeError                  TypeError

Sorry that I have to say this, there are even more edge cases. If type(b) is a subclass of type(a), then for the evaluation of a+b, the function type(b).__radd__ is considered first. type(a).__add__ is only a fallback.
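The subclass rule can be checked in current Python. A small sketch, with Base and Derived as made-up names:

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"


class Derived(Base):
    def __radd__(self, other):
        return "Derived.__radd__"


# The right operand's type is a subclass that provides its own __radd__,
# so the reflected method is tried before Base.__add__.
subclass_on_right = Base() + Derived()

# With the subclass on the left, the inherited __add__ is used as usual.
subclass_on_left = Derived() + Base()

print(subclass_on_right)  # Derived.__radd__
print(subclass_on_left)   # Base.__add__
```

Any reflected containment hook that aims to match the other operators would presumably have to decide whether to copy this priority rule as well.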

(Edit: Since you special case in in anyway, there is no advantage in attempting to hide the other fallbacks in object.__rcontains__. Quite the opposite: it could slow down those code paths.)

Oh lawd more things to think about :weary_face:

Not sure what you mean here. object.__rcontains__ may be called explicitly (if type(left) is type(right)) or implicitly (if type(left) is not type(right) and type(left) does not override __rcontains__). It does not hide the other fallbacks; just the opposite, it exposes them so that a type may override __rcontains__ to call super().__rcontains__.

@gsakkis, are you pursuing this topic because you believe that if you can resolve all of the details in a coherent way, this will be likely to be implemented and added to the language? Or just for fun as an intellectual exercise?

My expectation is that you can’t find a path here which is self consistent and clear, but doesn’t have some degree of backwards compatibility problems. I don’t think such a resolution exists, because contains is specially defined to have fallbacks that other comparators do not have.

Absent use cases which are your target, I don’t know how you can purely theoretically explore the various trade-offs. The trade-off decision making process for “we have to make some kind of breaking change or do something special/weird” is typically driven by the use cases.


There’s no evidence of any insurmountable backwards compatibility issue so far. The main issue is the lack of compelling real-world use cases; we agree on that.

Although I super strongly disagree – there is an inherent tension between trying to make containment look like other operators and the fact that it has a fallback protocol – that doesn’t answer my question.

I was basically tapped out here, but I saw your recent comment, “Oh lawd more things to think about” and it made me think… If this topic is causing you any distress, even mild, I advise you to just drop the idea and spend your time on something more fun and productive.

The primary precondition for actually making a language change (having a real reason to make that change) remains unmet. If you are under the impression that you’re somehow on course towards getting this change into the language, but you’re struggling with it or stressed, I think it would be unfair and unkind to you not to speak up.


I agree, and I’d go a step further here. It seems to me that there’s essentially no chance of this proposal getting accepted into the Python language. As long as people are enjoying the discussion, by all means continue (language design debates can be both fun and educational!) But anyone who’s not enjoying the discussion for its own sake should feel free to drop out, because at this point, there’s no other purpose to this thread.


This is the part I have most trouble coming to terms with. Why can a type say “I don’t know how to add Foo to myself, let Foo figure it out” but every container must say “I have to know if Foo is contained in me, otherwise a TypeError will be raised”? Implementation details aside, is there a deep inherent reason for this asymmetry?
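The asymmetry as it stands in current Python, in a short sketch. Meters, Feet and Circle are made-up classes for illustration only:

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # "I don't know how to add this to myself, let the other side try."
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


class Circle:
    """Defines none of __contains__, __iter__, __getitem__."""


# Addition: Meters declines, so Python asks Feet.__radd__.
total = Meters(1) + Feet(10)

# Containment: there is no hook on the left operand to ask.
try:
    Feet(1) in Circle()
    membership_raised = False
except TypeError:
    membership_raised = True

print(total.value, membership_raised)
```

Here the left operand of + can be rescued by the right operand's reflected method, while the left operand of `in` has no equivalent way to take part.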

Thanks for the concern but rest assured there’s no distress whatsoever. I realize written communication can be tricky and tongue in cheek can be misinterpreted but it’s all in good fun.

I tend to agree, and even though I’m secretly hoping that someone will prove me/us wrong and come up with real-world use cases, I’d consider a convincing argument for why containment is inherently different from every other operation a successful outcome. Until then, “absence of evidence is not evidence of absence”, as the saying goes.

It’s a container’s job to contain things.

Not adding anything to the discussion that hasn’t been brought up already, just reiterating some questions/points:

  • What is a container? Even today it’s not limited to types that define __contains__; for better or for worse, all iterables and indexables are implicitly containers too. There is no inherent reason we can’t take this even further and let the caller decide what to consider as a container, including types whose author didn’t think of them as containers.
  • What are the “things” that a Circle contains? Points? Line segments? Other circles? All of the above? None of the above? Who is to say?

In terms of this discussion, a container is anything that contains things. If something can be in your object, it is a container. Since range(10) contains the integer 5, it can be said to be a container; and it’s the range object’s job to determine that, not the integer’s.

The Circle gets to say what it contains. If you’re creating that class, you get to decide. Maybe, for your purposes, it should contain more than one kind of thing. That’s completely up to you, as the creator of the container class.
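As a sketch of what "the Circle gets to say" could look like in practice: the class below is hypothetical, and the choice that it contains (x, y) tuples and other circles is just one possible design, not something prescribed in this thread.

```python
import math


class Circle:
    def __init__(self, cx, cy, radius):
        self.cx, self.cy, self.radius = cx, cy, radius

    def __contains__(self, item):
        # Design choice of this sketch: a circle contains points and circles.
        if isinstance(item, Circle):
            distance = math.hypot(item.cx - self.cx, item.cy - self.cy)
            return distance + item.radius <= self.radius
        if isinstance(item, tuple) and len(item) == 2:
            x, y = item
            return math.hypot(x - self.cx, y - self.cy) <= self.radius
        return False


unit = Circle(0, 0, 1)
print((0.5, 0.5) in unit)          # True
print((2, 0) in unit)              # False
print(Circle(0, 0, 0.5) in unit)   # True
print(5 in range(10))              # True, decided by range, not by int
```

A different author could just as well decide that a Circle contains line segments, or nothing at all; the point is that the decision lives in the container class.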