Make `set.symmetric_difference()` take any number of arguments

HarryL · July 17, 2024, 5:35pm

The following methods of set can accept a variable number of arguments:

union(*others)
update(*others)
intersection(*others)
intersection_update(*others)
difference(*others)
difference_update(*others)

However, the symmetric_difference() and symmetric_difference_update() methods accept only a single argument:

symmetric_difference(other)
symmetric_difference_update(other)

I don’t think there will be a problem if we make symmetric_difference() and symmetric_difference_update() similar to other methods, i.e.

symmetric_difference(*others)
symmetric_difference_update(*others)
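To make the current asymmetry concrete, here is a small sketch. The sets `a`, `b`, and `c` are made-up sample data, and the behavior shown is what I would expect from current CPython releases, where `symmetric_difference()` takes exactly one argument:

```python
# Sample data, chosen only for illustration.
a, b, c = {1, 2}, {2, 3}, {3, 4}

# union() already accepts any number of iterables.
merged = a.union(b, c)

# symmetric_difference() accepts exactly one argument today,
# so passing two is expected to raise TypeError.
error = None
try:
    a.symmetric_difference(b, c)
except TypeError as exc:
    error = exc

print(merged)
print(type(error).__name__, error)
```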

Rosuav · July 17, 2024, 5:38pm

What’s the symmetric difference between one set and two others?

HarryL · July 17, 2024, 5:41pm

For example, a.symmetric_difference(b, c) would be equivalent to a ^ b ^ c. Given that symmetric difference is commutative, the order of the arguments does not matter.

HarryL · July 17, 2024, 6:11pm

Sorry for the mistake. The symmetric difference operation is actually commutative and associative.
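A minimal sketch of what the proposed behavior could look like, written as a standalone function. The name `symmetric_difference_all` is hypothetical and not part of the `set` API; it simply folds the existing one-argument method over its inputs:

```python
from functools import reduce


def symmetric_difference_all(first, *others):
    """Hypothetical helper: apply symmetric_difference to each argument in turn."""
    # Start from a copy so the caller's set is never mutated.
    return reduce(
        lambda acc, other: acc.symmetric_difference(other),
        others,
        set(first),
    )


# Sample data, chosen only for illustration.
a, b, c = {1, 2}, {2, 3}, {3, 4}

result = symmetric_difference_all(a, b, c)

# Same as chaining the operator, and independent of argument order.
assert result == a ^ b ^ c
assert result == symmetric_difference_all(c, a, b)
```

Because the operation is both commutative and associative, any ordering or grouping of the arguments gives the same set.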

JamesParrott · July 17, 2024, 6:24pm

Mathematically you’re spot on. That’s a perfectly rigorous, well-defined argument.

But I wonder if the current restriction forces the writing of clearer code, and clearer expression of intent?

How many Python users in 2024 intuitively expect the intersection of n sets to be included in the symmetric difference of the same n sets only when n is odd?
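The parity effect can be checked with a short sketch. The helper `fold_xor` and the sample sets are illustrative assumptions: element `0` sits in every set (so it is in the intersection), and each set also has one private element.

```python
from functools import reduce
from operator import xor


def fold_xor(sets):
    """Illustrative helper: chain ^ across an iterable of sets."""
    return reduce(xor, sets, set())


# Maps n to whether the shared element 0 survives the n-way symmetric difference.
shared_survives = {}
for n in range(1, 6):
    sets = [{0, i} for i in range(1, n + 1)]  # 0 is common to all n sets
    shared_survives[n] = 0 in fold_xor(sets)

print(shared_survives)
```

The shared element survives for n = 1, 3, 5 and is cancelled for n = 2, 4.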

jamestwebber · July 17, 2024, 6:28pm

How many users would use set.symmetric_difference(*others) in the first place? I don’t know that the somewhat unintuitive behavior of this operation is that strong a case against it.

People who are using symmetric_difference should know what they want (and the documentation should be clear about what will happen), and that’s the best we can hope for.

I was pretty sure this had been discussed before, but I can’t remember the outcome of that discussion. Edit: I was thinking of a different discussion, about adding symmetric_difference to dictionaries, which is a much more confusing operation.

For whatever reason I definitely didn’t think ^ was associative and commutative…given that it is, it seems reasonable to allow multiple arguments in the methods. But there might be a reason I’m missing.

KubaSO · October 2, 2024, 6:02pm

To merge unique elements from multiple sets, so far I have been using reduce(set.symmetric_difference, list_of_sets).

It would indeed be nice if that could be written set.symmetric_difference(*list_of_sets).

I think that symmetric_difference not supporting this is a simple omission.

bschubert · October 2, 2024, 7:38pm

Uh oh, that’s a bug

The symmetric difference of n sets doesn’t find the unique elements, it finds the elements that have an odd number of appearances. That could be 1 (unique), but it also could be 3, 5, 7, and so on. Consider:

>>> {1} ^ {1} ^ {1}
{1}

This kind of demonstrates the point made above about n-ary symmetric differences being unintuitive and perhaps not all that useful. “Which elements have an odd number of appearances” is a pretty specialized set operation. Giving it an even more convenient syntax could increase the risk of users confusing it for a different set operation.
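For comparison, here is a sketch that separates the two operations. The sample `list_of_sets` is made up, and using `collections.Counter` to find elements that appear in exactly one set is just one possible approach, not something proposed in this thread:

```python
from collections import Counter
from functools import reduce
from operator import xor

# Sample data: 1 appears in all three sets, the other elements in one set each.
list_of_sets = [{1, 2}, {1, 3}, {1, 4}]

# Chained symmetric difference: keeps elements with an ODD number of appearances.
odd_count = reduce(xor, list_of_sets, set())

# Count how many sets contain each element, then keep those seen exactly once.
counts = Counter(x for s in list_of_sets for x in s)
exactly_once = {x for x, n in counts.items() if n == 1}

# 1 appears three times, so it is in odd_count but not in exactly_once.
assert 1 in odd_count
assert 1 not in exactly_once
```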

KubaSO · October 2, 2024, 7:56pm

Yep. I’m glad I got rid of that code. It was too snafu-prone. Good catch - thank you!

jamestwebber · October 2, 2024, 8:11pm

I dunno, it seems pretty unlikely that this change would confuse more people. I’m not sure why a person would arrive at the symmetric_difference documentation and draw different conclusions based on whether it accepts one set or multiple. Either way, the documentation itself should be clear about what it is doing.