Add symmetric difference to collections.Counter

I recently noticed that collections.Counter objects support asymmetric difference, union, and intersection, but they don’t support symmetric difference.

So, given these Counter objects:

>>> from collections import Counter
>>> c = Counter("ababcabcd")
>>> d = Counter("bananabab")
>>> c
Counter({'a': 3, 'b': 3, 'c': 2, 'd': 1})
>>> d
Counter({'a': 4, 'b': 3, 'n': 2})

This works:

>>> c | d
Counter({'a': 4, 'b': 3, 'c': 2, 'n': 2, 'd': 1})
>>> c & d
Counter({'a': 3, 'b': 3})
>>> (c | d) - (c & d)
Counter({'c': 2, 'n': 2, 'a': 1, 'd': 1})

But this does not:

>>> c ^ d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for ^: 'Counter' and 'Counter'

I would expect c ^ d to be the same as (c | d) - (c & d), so implementing it should be fairly simple.
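As a rough illustration of that expectation, here is a minimal sketch of a standalone helper. The name `symmetric_difference` is hypothetical and not part of `collections`; it only wraps the expression above, and it assumes both counters hold positive counts.

```python
from collections import Counter


def symmetric_difference(p, q):
    """Hypothetical helper (not a Counter method): (p | q) - (p & q)."""
    # Union keeps the maximum count per element, intersection the minimum;
    # subtraction drops any element whose resulting count is not positive.
    return (p | q) - (p & q)


c = Counter("ababcabcd")
d = Counter("bananabab")
result = symmetric_difference(c, d)
print(result)  # 'b' is absent because both counters have the same count for it
```

For counters with positive counts, `(c - d) + (d - c)` gives the same result, which may be easier to read for some people.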

I ran into this while solving a programming exercise (finding the difference between the letters in two strings, ignoring order). I do imagine that this could be a useful operation for Counter to support, though.

cc @rhettinger

In principle this seems reasonable to me, though Raymond is the domain expert here.

Is this a real world need? ISTM that programming exercises by design pick tasks that people don’t normally do and for which solutions don’t already exist. Presumably, that is why lists don’t have a method for Longest Increasing Subsequence or other common toy problems.

The issue with symmetric_difference for multisets is that it is hard to interpret the result. When you see p ^ q during a code review, do you think, “elementwise difference between the maximum and minimum”? Does that task arise often enough to warrant inclusion in the standard library? Do we want people to have to take the time to learn a method they will likely never use and which is difficult to interpret? My opinion is that this is best left as a programming exercise.
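To make the "elementwise difference between the maximum and minimum" reading concrete, here is a small sketch. The function name `elementwise_abs_difference` is made up for illustration, and the comparison assumes both counters contain only positive counts (Counter's `|`, `&`, and `-` drop non-positive results, so the equivalence need not hold otherwise).

```python
from collections import Counter


def elementwise_abs_difference(p, q):
    """Hypothetical helper: max(p[x], q[x]) - min(p[x], q[x]) for each x."""
    out = Counter()
    for key in p.keys() | q.keys():
        # Counter returns 0 for missing keys without inserting them.
        diff = max(p[key], q[key]) - min(p[key], q[key])
        if diff:  # keep only elements whose counts actually differ
            out[key] = diff
    return out


c = Counter("ababcabcd")
d = Counter("bananabab")
by_hand = elementwise_abs_difference(c, d)
by_operators = (c | d) - (c & d)
print(by_hand == by_operators)
```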

A Stack Overflow question with information from Tim and Raymond:

Why is there no symmetric difference for collections.Counter?

From a readability point of view, I’m inclined to agree with @Wombat: I would mentally parse (c | d) - (c & d) more easily than the direct c ^ d.

I don’t actually object to adding it, though: it’s a well-defined operation that’s valid for both sets and multisets, and we already support it for the former.

Yes :wink:. People familiar with the multiset concept, and the concept of ^ for regular sets, don’t have trouble making “the most obvious” generalization.

That’s the rub: no. At least not that I’ve ever seen, and I’m so old I’ve seen everything :wink:.
