Discrepancy between set.__or__ and set.union

ZeeD · November 30, 2021, 5:00pm

From a “high-level” point of view, the | operator and the .union method are very similar in the set context.

{1,2,3} | {4,5,6} == ({1,2,3}).union({4,5,6})

The .union method, however, is “more lenient”, as it accepts any iterable.
The | operator instead raises a TypeError if I try to join a set with any “non-set” iterable.
Is there a reason why | is strict instead?
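
A minimal sketch of the difference described above, using a range as the non-set iterable (the variable names are only illustrative):

```python
s = {1, 2, 3}

# .union accepts any iterable, here a range.
merged = s.union(range(4, 7))
print(merged == {1, 2, 3, 4, 5, 6})  # True

# The | operator only accepts another set (or frozenset).
raised = False
try:
    s | range(4, 7)
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")
```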

I discovered this behavior because I wanted to create a set of integers from a list of ranges, and hoped to write something like

idxs = set()
for r in ranges:
    idxs |= r

cameron · November 30, 2021, 8:55pm

From a “high-level” point of view, the | operator and the .union method are very
similar in the set context.
{1,2,3} | {4,5,6} == ({1,2,3}).union({4,5,6})
The .union method, however, is “more lenient”, as it accepts any iterable.
The | operator instead raises a TypeError if I try to join a set with any “non-set” iterable.
Is there a reason why | is strict instead?

Probably a type safety intent. The .union etc methods are for adding (or
removing, whatever) various elements from the set - those elements might
be in various forms so accepting any iterable is both feasible and
convenient to the user.

However, an expression with sets:

set1 | set2

has more predictable behaviour if you’re sure all the operands are sets.
For one thing, if they’re both sets then the above is commutative:

set1 | set2 == set2 | set1

If set.__or__ accepted non-sets then that wouldn’t hold, as you’d be
using set2.__or__ in the second expression. If that’s a different type
then you’ll get a different operation entirely. With a pure iterable
(maybe a range()) you’ll get a loud TypeError showing the issue,
but supposing it were a list (well, some collection accepting |)? It
might quietly produce another list, not the outcome you might hope.
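
To make the concern concrete, here is a rough sketch. Both classes are hypothetical and exist only for illustration: LenientSet imagines a set whose | accepts any iterable, and Bag stands in for "some collection accepting |". Swapping the operands then dispatches to a different __or__ and gives a different kind of result.

```python
class LenientSet(set):
    """Hypothetical set whose | accepts any iterable, like .union does."""

    def __or__(self, other):
        return LenientSet(self.union(other))


class Bag(list):
    """Hypothetical list-like collection that defines | as concatenation."""

    def __or__(self, other):
        return Bag(list(self) + list(other))


left = LenientSet({1, 2}) | Bag([2, 3])   # dispatches to LenientSet.__or__
right = Bag([2, 3]) | LenientSet({1, 2})  # dispatches to Bag.__or__

print(type(left).__name__, sorted(left))    # LenientSet [1, 2, 3]
print(type(right).__name__, sorted(right))  # Bag [1, 2, 2, 3]
print(left == right)                        # False
```

With the real set type, the second expression cannot quietly go wrong in this way when the other operand has no | of its own, such as a range: it raises TypeError instead.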

Better to enforce tighter constraints, and leave operations which do
“conversions” (such as this iterable->set of elements situation) to
named methods whose behaviour is more overt.

I discovered this behavior because I wanted to create a set of integers from a list of ranges, and hoped to write something like
idxs = set()
for r in ranges:
    idxs |= r

This is better written as:

idxs.update(r)

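Note here that .update modifies idxs itself. With a set on the right-hand
side, |= also updates in place, but it rejects non-set iterables just as |
does. It is .union() (or idxs = idxs | r) that makes a new set. Slower! More
memory!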

>>> s=set()
>>> s0 = s
>>> s.update((1,2,3))
>>> s
{1, 2, 3}
>>> s is s0
True
>>> s2=s.union((4,5,6))
>>> s2
{1, 2, 3, 4, 5, 6}
>>> s2 is s0
False
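
Applied to the original question, a short sketch might look like the following. The contents of ranges are made up for illustration. It also checks, with a set operand, whether |= rebinds the name or updates the existing set.

```python
# Hypothetical input standing in for the `ranges` list from the question.
ranges = [range(0, 3), range(5, 8), range(2, 6)]

# In place: update the same set object once per range.
idxs = set()
for r in ranges:
    idxs.update(r)

# .update and .union both accept several iterables in one call.
idxs_one_call = set()
idxs_one_call.update(*ranges)
idxs_union = set().union(*ranges)

print(sorted(idxs))                       # [0, 1, 2, 3, 4, 5, 6, 7]
print(idxs == idxs_one_call == idxs_union)  # True

# With a set on the right-hand side, |= keeps the same object.
s = {1, 2}
s0 = s
s |= {3}
print(s is s0)  # True
```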

Cheers,
Cameron Simpson cs@cskk.id.au
