Multiple element contain check for list, tuple

current scenario -
  1. 1 in [1, 2, 3] # or, 1 in (1, 2, 3)

True

but, if one wants to check whether more than one element is in a list, then this technique does not work

  2. for a set, there is the issubset method, that is,
({1, 2}).issubset({1, 2, 3})

True
  3. one way to get a multiple-element containment check is through the all() built-in function, that is,
all(i in [1, 2, 3] for i in [1, 2])

True

expected scenario -

  1. there is a method for list, tuple, that works like,
([1, 2]).elements_in([1, 2, 3]), ((1, 2)).elements_in([1, 2, 3])

True, True
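
The requested behaviour can be sketched as a plain helper function built on the all() approach shown earlier. The name elements_in is borrowed from the proposal; it is hypothetical and not an existing list or tuple method.

```python
def elements_in(needles, haystack):
    """Return True if every item of needles is found in haystack.

    Hypothetical helper: lists and tuples have no such method.
    """
    return all(item in haystack for item in needles)


print(elements_in([1, 2], [1, 2, 3]))   # True
print(elements_in((1, 2), [1, 2, 3]))   # True
print(elements_in([1, 4], [1, 2, 3]))   # False
```

Because it only relies on iteration and the in operator, the same function accepts any iterable of needles and any container as the haystack.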

By via Discussions on Python.org at 17Jul2022 07:49:

current scenario -

  1. 1 in [1, 2, 3] # or, 1 in (1, 2, 3)

Aye, nice and simple.

  1. for a set, there is issubset method, that is, ({1, 2}).issubset({1, 2, 3})
    […]
  2. one way to get multiple element containment check is through all
    keyword, that is, all(i in [1, 2, 3] for i in [1, 2])
    […]
    expected scenario -
  3. there is a method for list, tuple, that works like,
([1, 2]).elements_in([1, 2, 3]), ((1, 2)).elements_in([1, 2, 3])

If you want distinct results, why not just?

1 in [1, 2, 3], 2 in [1, 2, 3]

You would want to write loops or comprehensions for longer forms of this.

Note that sets are easy to make. If you only want the result of:

1 in [1, 2, 3] and 2 in [1, 2, 3]

you can also write that as:

{1, 2} in {1, 2, 3}

or if you have a list/tuple:

to_check = 1, 2  # a tuple
the_list = [1, 2, 3]

if set(to_check).issubset(set(the_list)): ...

or of course:

if set(to_check) < set(the_list): ...

I think I’m basically saying: if you’re doing set membership tests, use
sets. There’s little benefit to adorning list/tuple with a heap of extra
special-purpose methods. If you take the convert-to-sets approach you
can work with other sequences or iterables too (versus just lists/tuples
because they would have these special methods).

Cheers,
Cameron Simpson cs@cskk.id.au

That doesn’t do what you think it does: it looks for a single set element equal to the set {1, 2}.

I think you want the subset comparison:

{1, 2} <= {1, 2, 3}
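
A short sketch of the difference between membership and subset tests on sets; the variable names are only for illustration.

```python
small = {1, 2}
big = {1, 2, 3}

# Membership: is the set {1, 2} itself one of the elements of big?
print(small in big)             # False

# Subset: is every element of small also in big?
print(small <= big)             # True
print(small.issubset(big))      # True

# "<" is the proper-subset test, so equal sets fail it.
print(big < big)                # False
print(big <= big)               # True

# A set that really does contain {1, 2} as an element (as a frozenset).
nested = {frozenset({1, 2}), 3}
print(small in nested)          # True
```

Note that < and <= only agree when the two sets differ, so <= (or issubset) is the closer match for "all of these are in there".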

Why do you expect this? Is there something about Python that leads you to expect that the language includes every imaginable, trivial one-line function as a built-in method?

Just use sets. Or write all(a in [1, 2, 3] for a in [1, 2]).

Any feature that you think of that can already be solved in one line of code is very unlikely to be turned into a builtin method or function.


I think the confusion comes from this abnormal (but otherwise useful) behaviour of the in operator:

>>> 'bc' in 'abcd'
True

Is this the only inconsistency of the in operator?

Is this a side effect of strings being characters in a sequence object?
What’s happening under the hood to make 'bc' in 'abcd' -> True?
Is it likely to change in a future Python version?

As a footnote, this evaluates to True:

'a' and 'b' in {'a','b','c','d'}

By Leland Parker via Discussions on Python.org at 19Jul2022 04:37:

Is this a side effect of strings being characters in a sequence object?
What’s happening under the hood to make 'bc' in 'abcd' -> True?

That is because with strings, in means “substring present in”.
Remember that operations in Python are largely driven by the class of
the objects being operated on. So 1+1 does arithmetic addition, but
'ab'+'def' does string concatenation. The in operator calls the
class’ __contains__ method. For a str that tests for the presence of
a substring.
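
As a rough illustration of that dispatch, here is a made-up class (EvenNumbers is purely hypothetical) whose __contains__ decides what "in" means for it:

```python
class EvenNumbers:
    """Hypothetical container: 'contains' every even integer."""

    def __contains__(self, item):
        # Called by the "in" operator with the left-hand operand as item.
        return isinstance(item, int) and item % 2 == 0


evens = EvenNumbers()
print(4 in evens)       # True
print(7 in evens)       # False

# The operator and the explicit method call agree for built-in types too.
print('bc' in 'abcd')               # True
print('abcd'.__contains__('bc'))    # True
```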

Is it likely to change in a future Python version?

No.

As a footnote, this evaluates to True:

'a' and 'b' in {'a','b','c','d'}

Yes, but not for useful reasons:

  • 'a' is true because it is a nonempty string
  • 'b' in {'a','b','c','d'} is true because 'b' is in the set of
    strings (there is an element e in the set where e=='b'; again,
    this is what set.__contains__ does)
  • and therefore the conjunction of the 2 conditions is also true
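
To make the three points above concrete, a small sketch (the string 'zzz' is just a stand-in for any value that is not in the set):

```python
letters = {'a', 'b', 'c', 'd'}

# "in" binds tighter than "and", so this is 'zzz' and ('b' in letters).
print('zzz' and 'b' in letters)             # True, although 'zzz' is not in the set

# Test each element explicitly ...
print('zzz' in letters and 'b' in letters)  # False
print('a' in letters and 'b' in letters)    # True

# ... or ask whether a set of candidates is a subset.
print({'a', 'b'} <= letters)                # True
print({'zzz', 'b'} <= letters)              # False
```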

Cheers,
Cameron Simpson cs@cskk.id.au

Containment tests (the in operator) for strings are defined to check for the presence of a substring. See the footnote (1) in the docs for sequences.

No, it won’t change in the future.

Under the hood, what happens is that the in operator calls its right-hand operand’s __contains__ method, which for strings and bytes implements substring matching rather than element matching.

Your code snippet:

'a' and 'b' in {'a','b','c','d'}

is not testing whether ‘a’ is in the set. Try this:

'Not in the set' and 'b' in {'a', 'b', 'c', 'd'}

which will also return True. Python often reads very much like English, but this example is a false friend.

Rather than testing if ‘a’ is in the set and ‘b’ is also in the set, like English grammar would suggest, it is testing whether ‘a’ is a truthy value and ‘b’ is in the set. Since both expressions are true, the second is returned – which happens to be the True constant.

Try reversing the test and see what you get:

'b' in {'a','b','c','d'} and 'a'

The and and or operators are short-circuit operators. They work something like this:


def and_(a, b):
    if not a:
        return a
    else:
        return b

def or_(a, b):
    if a:
        return a
    else:
        return b

except that the second operand (b) isn’t evaluated unless it is needed.
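
A minimal sketch of that difference, using a hypothetical noisy() helper that records each call so you can see which operands were actually evaluated:

```python
def and_(a, b):
    # Function-call version: both arguments are evaluated before the call.
    return b if a else a


calls = []

def noisy(value):
    """Record that it was called, then return value unchanged."""
    calls.append(value)
    return value


# The real operator skips the second operand once the first is falsy.
result = noisy(0) and noisy('never evaluated')
print(result, calls)        # 0 [0]

calls.clear()
# The function version evaluates both arguments regardless.
result = and_(noisy(0), noisy('evaluated anyway'))
print(result, calls)        # 0 [0, 'evaluated anyway']
```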

Well, at first I thought it was because I glibly split up the strings in Václav’s example and hit [Submit]. I typically use explicit parentheses both for readability and to control the order of operations.

However, parentheses did nothing to change the behavior. The Common Sequence Operations table says nothing about parentheses. Is CPython simply ignoring them?

('a' and 'b') in {'a','b','c','d'} -> True
('a' and 'b') in ['b','c','d']     -> True   # 'a' not in list
('a' and 'b') in ['a','c','d']     -> False  # 'b' not in list

I ran out of dots to follow in the Abstract Base Classes (ABC) when trying to trace the class inheritance of strings. How can I trace the class hierarchy for string back to the fundamental ABC?
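
One possible way to poke at this from the interpreter, offered as a sketch rather than a full answer: str inherits only from object, and its link to the ABCs in collections.abc is a "virtual" one that shows up through issubclass()/isinstance() rather than in the MRO.

```python
import collections.abc as abc

# The concrete inheritance chain of str is very short.
print(str.__mro__)      # (<class 'str'>, <class 'object'>)

# str is nevertheless recognised as a virtual subclass of several ABCs.
print(issubclass(str, abc.Sequence))        # True
print(issubclass(str, abc.Container))       # True
print(isinstance('abcd', abc.Sequence))     # True

# The ABC's own chain shows where Container (and __contains__) fits in.
for cls in abc.Sequence.__mro__:
    print(cls.__name__)
```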

Thanks, Stephen. I did find that note during my search:

While the in and not in operations are used only for simple containment testing in the general case, some specialised sequences (such as str, bytes and bytearray) also use them for subsequence testing
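
A quick side-by-side of what that note describes, as a sketch with throwaway values:

```python
# Subsequence testing: str, bytes and bytearray.
print('bc' in 'abcd')                   # True
print(b'bc' in b'abcd')                 # True
print(b'bc' in bytearray(b'abcd'))      # True

# Element testing: list and tuple look for one item equal to the operand.
print(['b', 'c'] in ['a', 'b', 'c', 'd'])       # False
print(['b', 'c'] in ['a', ['b', 'c'], 'd'])     # True
print(('b', 'c') in ('a', 'b', 'c', 'd'))       # False

# bytes also accepts a single integer in range(256) as an element.
print(98 in b'abcd')                    # True (98 == ord('b'))
```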

Quiz (or perhaps trick question), regarding containment, primarily for entertainment purposes :wink: :

What is the most abundant of all substrings?

For the purposes of context, we can arbitrarily consider the locations of occurrences of this substring to be any strings that now reside in memory in Python programs currently executing anywhere in the world.

Any string that can pass a valid test for its containment within any string, including itself, will be considered a viable candidate.

Please give supportive reasoning for your nominations.


When you are not sure what an expression does, I suggest you try evaluating it step by step, as Python would:

('a' and 'b') in {'a','b','c','d'}  # First evaluate the innermost parentheses.
'b' in {'a','b','c','d'}            # Now it is clear.
True

Maybe the and operator is confusing. It works as Steven has shown in his alternative implementation, and_(). If both operands give True when converted to bool, it gives the second operand. Otherwise it gives the first operand that gives False when converted to bool.

>>> 'b' and 'a'
'a'
>>> 0 and 'a'
0
>>> 'b' and ()
()

:smiley: I think it cannot be anything different than:

''  # adding an arbitrary comment to not reveal much when blurred...

ditto. :wink:

invisible, yet ubiquitous

The parentheses (round brackets for non-Americans reading) are not ignored. They are used to change the precedence of operations. It just so happens that in the examples you tried, it makes no difference which way you evaluate them!

Analogy:

(1 + 2) - 3 = 0

1 + (2 - 3) = 0

The order of evaluation makes no difference here.

If we evaluate the expression like this:

('a' and 'b') in {'a', 'b', 'c'}

=> 'b' in {'a', 'b', 'c'}

=> True

but if we evaluate it like this instead, we get the same result:

'a' and ('b' in {'a', 'b', 'c'})

=> 'a' and True

=> True

The same applies with these:

('a' and 'x') in {'a', 'b', 'c'}

=> 'x' in {'a', 'b', 'c'}

=> False



'a' and ('x' in {'a', 'b', 'c'})

=> 'a' and False

=> False
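
The two groupings do not always agree, though. A small sketch, with values chosen purely for illustration, where the parentheses do change the result:

```python
items = [0, 1, 2]

# Parentheses first: (0 and 'a') is 0, and 0 is in the list.
grouped = (0 and 'a') in items
print(grouped)          # True

# Default precedence: 0 and ('a' in items) stops at the falsy 0.
default = 0 and 'a' in items
print(default)          # 0
```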

SPOILER SPACE

The empty string. It is a substring of every string, including itself.


yes and yes. ‘eggs’ has 5 of them, as would be expected.

>>> 'eggs'.count('')
5

But, evidently, there’s no such analogous entity concerning a list. Some might suggest that there’s the empty list, however, that doesn’t quite cut it. We don’t have 5 of those here …

>>> ['e', 'g', 'g', 's'].count([])
0

The above, of course, behaves as it should, since ['e', 'g', 'g', 's'] does not contain 5 of [].
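
A few more checks along the same lines, as a sketch:

```python
word = 'eggs'

# One empty substring before each character, plus one at the end.
print(word.count(''))                   # 5
print(word.count('') == len(word) + 1)  # True

# The empty string is a substring of every string, including itself.
print('' in word)                       # True
print('' in '')                         # True
print(''.count(''))                     # 1

# The empty list is not an element of this list, and neither is ''.
print([] in list(word))                 # False
print('' in list(word))                 # False
```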

There’s probably no potential utility for an emptyish list object that corresponds to the nonexistent spaces between and around the items within a list, but it’s an interesting concept anyway, perhaps practical only as the object of a koan.


That’s pretty quirky. I realized that it had something to do with the non-empty sequence evaluating to True but did not try the 'a' and 'b' statement on the command line to see what was returned. Thanks, Václav.

So the return value appears to be the last value that was evaluated in the expression. When the interpreter reaches a condition that produces a conclusive result for the expression, it returns the current object in the evaluation queue.

CONFIRMED: docs.python Boolean Operations says…

[and and or] return the last evaluated argument.

It also has a tricky example of applying a default value if the first value in an or expression is false. If that’s an intentionally designed-in behavior, then I guess it counts as “Pythonic”. It’s certainly non-obvious, though.
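
For reference, the default-value idiom looks roughly like this. The greet() function and the names in it are made up for illustration and are not taken from the docs:

```python
def greet(name=None):
    # "or" returns its first truthy operand, else the last operand.
    display_name = name or 'stranger'
    return 'Hello, ' + display_name


print(greet('Ada'))     # Hello, Ada
print(greet())          # Hello, stranger
print(greet(''))        # Hello, stranger  (empty string is falsy too)

# Caveat: a legitimate falsy value is replaced as well.
retries = 0
print(retries or 3)     # 3, even though 0 was supplied on purpose
```

When 0, '' or an empty container are valid inputs, an explicit "is None" test is the safer way to apply a default.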

Short-circuit evaluation exists in many mainstream languages, including C, JavaScript and the POSIX shell.

JavaScript also returns the last evaluated value; C yields only the integer 0 or 1, and the shell yields the exit status of the last command it ran. …so it becomes obvious when you get used to it :slight_smile: Certainly it is not obvious for non-programmers.

The iterable arguments of the all() and any() functions are consumed in a short-circuit fashion too (maybe better to say short-circuit iterated):

>>> def true_monitored():
...     print('true_monitored')
...     return 42  # True when converted to bool
...
>>> def false_monitored():
...     print('false_monitored')
...     return {}  # False when converted to bool
...
>>> def lazy_iterable(iterable):
...     for item in iterable:
...         yield item()
...
>>> all(lazy_iterable((true_monitored, false_monitored, true_monitored)))
true_monitored
false_monitored
False
>>> any(lazy_iterable((true_monitored, false_monitored, true_monitored)))
true_monitored
True
>>>

Though these functions return only True/False, not the last value.


I meant that the non-boolean return value was nonobvious. I didn’t even think to check for a non-boolean with the command line.

Short-circuiting is common, as you said. (But it is an execution shortcut, so it doesn’t affect the result the way returning the evaluated operand does.)

Some additional documentation that describes short-circuit evaluation regarding and and or can be found in the 5.7. More on Conditions section of the 5. Data Structures page within The Python Tutorial. It states:

The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined.

… the return value of a short-circuit operator is the last evaluated argument.
