TL;DR: use a tuple for two values and a set for three or more, and you will not be far from the optimum.
This is an interesting micro-optimization, and since membership tests like this often sit inside a tight loop, it is worth considering.
The answer will depend on what values you are matching, and what version of which Python interpreter you are using.
If your values are not hashable, then you can’t use a set. In that case, a tuple is surely going to be faster and more memory efficient than a list, regardless of which Python interpreter you use.
(Some imaginary Python interpreter, let’s call it StoogePython, may be exceptionally bad, and have tuples bigger and slower than lists. But who would use it? Any reasonable interpreter will have tuples at least as efficient as lists.)
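To make the hashability point concrete, here is a minimal sketch. The values are made up for illustration, with lists standing in for any unhashable type.

```python
# Lists are unhashable, so they stand in for any unhashable value.
needle = [1, 2]

# Tuple membership only needs ==, so this works.
found = needle in ([1, 2], [3, 4])

# Building the set needs hash() on every element, so this fails.
try:
    candidates = {[1, 2], [3, 4]}
    error = None
except TypeError as exc:
    error = exc

print(found)        # True
print(repr(error))  # a TypeError complaining about the unhashable list
```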
Recent versions of CPython will automatically convert a list-display into a tuple:
>>> import dis
>>> dis.dis('x in [1, 2, 3]')
  1           0 LOAD_NAME                0 (x)
              2 LOAD_CONST               0 ((1, 2, 3))
              4 CONTAINS_OP              0
              6 RETURN_VALUE
which is nice, but not all interpreters will do that, so it is better to use a tuple explicitly.
Likewise, recent versions of CPython will convert a set display used in a membership test into a constant frozenset:
>>> dis.dis('x in {1, 2, 3}')
  1           0 LOAD_NAME                0 (x)
              2 LOAD_CONST               0 (frozenset({1, 2, 3}))
              4 CONTAINS_OP              0
              6 RETURN_VALUE
but the same comment about other interpreters and versions applies.
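If you would rather not read disassembly, a rough way to check what your own interpreter does is to look at the constants the compiler stored for the expression. This is only a sketch: the helper name `constant_types` is mine, and it assumes your interpreter exposes `co_consts` on code objects the way CPython does.

```python
def constant_types(expr):
    """Return the types of the constants the compiler stored for `expr`."""
    code = compile(expr, "<membership-check>", "eval")
    return {type(const) for const in code.co_consts}

# If the display was folded, a tuple/frozenset constant shows up;
# otherwise only the individual ints do.
list_folded = tuple in constant_types("x in [1, 2, 3]")
set_folded = frozenset in constant_types("x in {1, 2, 3}")

print("list display folded to tuple:   ", list_folded)
print("set display folded to frozenset:", set_folded)
```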
Which is faster, a (frozen)set or a list/tuple?
- Both successful and unsuccessful searches of a set take O(1) comparisons on average;
- unless there are lots of hash collisions;
- successful searches of a list or tuple take about N/2 comparisons on average if every item is equally likely, which is O(N);
- unsuccessful searches of a list or tuple take N comparisons;
- but the constant overhead of each data structure may be different.
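The comparison counts in the list above can be observed directly. The following is a sketch of my own, not a benchmark: `Counted` is a hypothetical wrapper class that counts calls to `==`, and the exact counts for the set depend on how your interpreter implements hashing and probing.

```python
class Counted:
    """Wraps a value and counts how many times == is called."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.calls += 1
        return self.value == getattr(other, "value", other)

    def __hash__(self):
        return hash(self.value)


def comparisons(container, needle):
    """Number of == calls made while evaluating `needle in container`."""
    Counted.calls = 0
    _ = needle in container
    return Counted.calls


as_tuple = tuple(Counted(i) for i in range(8))
as_set = set(as_tuple)

# Fresh Counted objects are used as needles so that the identity
# shortcut (`is`) cannot skip the == call.
print(comparisons(as_tuple, Counted(0)))   # 1: found at the front
print(comparisons(as_tuple, Counted(7)))   # 8: found at the back
print(comparisons(as_tuple, Counted(99)))  # 8: not found, every item checked
print(comparisons(as_set, Counted(7)))     # hit: does not grow with N
print(comparisons(as_set, Counted(99)))    # miss: does not grow with N
```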
In my experiments, I have found that for two items in the set/tuple:
- successful searches take about the same time, on average, for a tuple as for a set;
- but if one item is very common, and the other is rare, and you put the common item first in the tuple, you can beat the set by about 20%;
- sets are always faster for unsuccessful searches.
For three or more items, sets are almost always faster except in the case where the first item in the tuple is very common.
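If you want to repeat this kind of experiment on your own interpreter, something along these lines should do. It is a sketch, not the script I used: the `bench` helper, the labels, and the repeat counts are arbitrary choices, and the numbers will vary by machine and Python version.

```python
import timeit


def bench(expr, x, number=200_000):
    """Best-of-5 time in seconds for evaluating `expr` with x bound to `x`."""
    return min(timeit.repeat(expr, globals={"x": x}, number=number, repeat=5))


cases = {
    "tuple(2), hit first": ("x in (1, 2)", 1),
    "tuple(2), hit last": ("x in (1, 2)", 2),
    "tuple(2), miss": ("x in (1, 2)", 9),
    "set(2), hit": ("x in {1, 2}", 2),
    "set(2), miss": ("x in {1, 2}", 9),
    "tuple(3), hit last": ("x in (1, 2, 3)", 3),
    "set(3), hit": ("x in {1, 2, 3}", 3),
}

results = {label: bench(expr, x) for label, (expr, x) in cases.items()}
for label, seconds in results.items():
    print(f"{label:22s} {seconds:.4f}s")
```

Small ints are about the cheapest values to compare and hash, so objects with expensive `__eq__` or `__hash__` methods may shift the balance.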
Note that all of these tradeoffs may differ in interpreters which lack CPython’s peephole optimizer, which turns set displays such as {1, 2, 3} into a constant frozenset. But even there, you won’t be far from the optimal solution: everything is fast for small enough N, and for large N, the overhead of constructing the set is probably going to be relatively small compared to the overhead of searching a large list.
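If you are worried about an interpreter rebuilding the set on every test, one option is to build it once yourself and reuse it. This is a minimal sketch with made-up names and values (`VALID_CODES` and `is_ok` are hypothetical), and the global name lookup it adds has a small cost of its own.

```python
# Hypothetical values, built once when the module is imported.
VALID_CODES = frozenset({200, 201, 204})


def is_ok(status):
    """Membership test against the prebuilt frozenset."""
    return status in VALID_CODES


print(is_ok(200))  # True
print(is_ok(404))  # False
```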