Why do a set and a frozenset compare `True` with `==` even though their types are different?

Comparing a list with a tuple, set or frozenset using == gives False because their types are different, as shown below:

print([10] == (10,))           # list & tuple
print([10] == {10})            # list & set
print([10] == frozenset([10])) # list & frozenset
# False

But comparing a set with a frozenset using == gives True even though their types are different, as shown below:

print({10} == frozenset({10})) # set & frozenset
# True

So why do a set and a frozenset compare `True` with `==` even though their types are different?

Usually, two containers can compare equal if all of their contained elements are equal and the containers share the properties that matter for equality.

From that point of view, frozenset and set are exactly the same. The only difference is that frozenset is frozen and set isn’t, which doesn’t matter for equality of containers.
They both have unique, hashable elements and are unordered.
You can also use other operations like | (union), - (difference), & (intersection) between frozenset and set as well, even with dict.keys and dict.items, because they are set-like.
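A quick sketch of those mixed operations. The values and the names `mutable`, `frozen` and `inventory` are made up for illustration:

```python
# Set operators work across set, frozenset and dict views.
mutable = {1, 2, 3}
frozen = frozenset({2, 3, 4})
inventory = {"a": 1, "b": 2}  # hypothetical example dict

# For set/frozenset mixes, the result takes the type of the left operand.
union = mutable | frozen          # a set
intersection = frozen & mutable   # a frozenset

# Dict views are set-like; operations on them produce plain sets.
key_overlap = inventory.keys() & {"a", "z"}
item_diff = inventory.items() - {("a", 1)}

print(type(union).__name__, union)
print(type(intersection).__name__, intersection)
print(key_overlap)
print(item_diff)
print(inventory.keys() == {"a", "b"})  # views also compare equal to sets
```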

tuple and list are not comparable to one another because their use cases are usually very different.

tuples’ usual purpose is to represent a heterogeneous, usually finite, ordered container.
lists’ usual purpose is to represent a homogeneous, unspecified in length, ordered container.

That’s the reason why…

  • … for tuples you can specify each index’s type in the type system explicitly (tuple[int, str]), and for variable-length tuples special syntax exists (tuple[int, ...]), but lists, just like other containers, can at best be specified as list[int | str].
  • … namedtuple exists, but namedlist wouldn’t make sense.

Why not?

Sets have very distinct order relations wrt other containers (subsets/supersets). So, while we have the following in the docs (strictly speaking, it’s a lie):

Objects of different types, except different numeric types, never compare equal.

There are also documented comparisons for sets:

Both set and frozenset support set to set comparisons. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). A set is less than another set if and only if the first set is a proper subset of the second set (is a subset, but is not equal). A set is greater than another set if and only if the first set is a proper superset of the second set (is a superset, but is not equal).

Instances of set are compared to instances of frozenset based on their members.
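A short illustration of the documented behaviour, with made-up values:

```python
a = {1, 2, 3}
b = frozenset([3, 2, 1])

# Equality depends only on membership, not on the concrete type.
print(a == b, b == a)

# The ordering operators test subset/superset relations.
print({1, 2} < b)   # proper subset
print(a <= b)       # subset (here: equal)
print(a < b)        # equal sets are not proper subsets of each other
print(b > {1})      # proper superset
```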

Maybe that can be reworded as “seldom compare equal”, which gives the same broad expectation but isn’t false?

This seems vague. Maybe something like this: “Objects of different types, unless documented otherwise, never compare equal.” Exception for numeric types is already documented, like for sets:

A comparison between numbers of different types behaves as though the exact values of those numbers were being compared.
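A few examples of what "exact values" means in practice. This is a sketch using only the standard library, and the variable names are mine:

```python
from decimal import Decimal
from fractions import Fraction

int_vs_float = 1 == 1.0
half_fraction = Fraction(1, 2) == 0.5   # 0.5 is exactly representable as a float
half_decimal = Decimal("0.5") == 0.5
tenth = Fraction(1, 10) == 0.1          # the float 0.1 is not exactly 1/10
big = 2**53 + 1
big_vs_float = big == float(big)        # the float conversion rounds, the comparison does not

print(int_vs_float, half_fraction, half_decimal)
print(tenth, big_vs_float)
```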

It is, deliberately so. This is just giving very broad information about how the operators themselves work, and isn’t trying to give you everything you need to know about all types of objects. It’s still GENERALLY true that objects only compare equal to others of the same type (a datetime isn’t equal to a file, for example, even if it happens to be the file’s last-modified date).

This seems like the right approach. Or even “Objects of different types will never compare equal unless that functionality has been specifically added.” This covers user-defined code as well.

Although this doesn’t need to fit in one sentence, there’s a whole section there.

You can define your own type whose instances compare equal to anything:

class EqualClass:
    def __eq__(self, other):
        return True
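For what it's worth, here is how such a class behaves, including when it sits on the right-hand side or inside a container. This is a sketch that repeats the class so it runs on its own:

```python
class EqualClass:
    """Hypothetical type whose instances claim equality with everything."""

    def __eq__(self, other):
        return True


obj = EqualClass()

left = obj == 1
# None's own comparison returns NotImplemented here, so Python
# falls back to the reflected obj.__eq__(None).
reflected = None == obj
# Lists compare element by element, so the element's __eq__ decides.
in_list = [obj] == [42]

print(left, reflected, in_list)
```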

I agree with the OP that it seems inconsistent that a `frozenset` and a `set` compare equal whereas a `tuple` and a `list` do not.

Is there a place where the decision of frozenset equality was discussed so that I can understand the reasoning behind it?

It seems inconsistent if you think about a tuple as a “frozen list”, but that’s not its primary purpose. A frozenset’s entire purpose in life is to be a set, but frozen. A tuple’s main goal is to hold a collection of well-defined things; for example, if you have a function that returns more than one value, what it’s actually doing is returning a tuple containing those values, which you can then unpack:

def wait_for_item():
    ...
    return item, time_taken

item, delay = wait_for_item()

There’s a two-element tuple here. Unlike a list, which generally contains a collection of similar items, a tuple often contains very dissimilar items. It also seldom makes sense to remove one item from a tuple and close up the gap, whereas that’s a very common thing in a list.
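A runnable variant of the same idea, using a hypothetical `min_and_max` helper in place of the elided function body above:

```python
def min_and_max(values):
    """Hypothetical helper: returns two values, which Python packs into a tuple."""
    return min(values), max(values)


result = min_and_max([3, 1, 2])
print(type(result).__name__, result)

# Unpacking assigns each position to its own name.
low, high = result
print(low, high)
```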

Tuples, since they do very different things from lists, shouldn’t be thought of primarily as “frozen lists”; they are certainly capable of being that, but they’re so much more.

set and frozenset both represent the same underlying mathematical object. The distinction between the two types is “merely” operational. list and tuple, on the other hand, both represent very different families of types in type theory. (Roughly speaking, tuple represents arbitrary finite type products, and list represents different least fixed points of a particular functor. As long as you recognize that these two descriptions sound very different, we don’t need to spend any time explaining what either one means.) There’s little reason to expect a tuple and a list to be comparable in this way mathematically.

This is a nice and neat rationale but Python doesn’t adhere to it in practice. Historically there’s some ambiguity in how they are used (and the syntax is even more ambiguous, e.g. [a, b] = ...)

So I can understand why they could compare as equal. There’s one obvious way that would work. But that’s not how the language is and I don’t think there is a good enough reason to change it.

Indeed I’ve sometimes used tuples as “frozenlist”, basically type hinting

x: tuple | list

in situations where I was mostly working with lists but needed frozenness for safety in some situations (e.g. as a ClassVar, as a default value of an argument, or just for consistency when building a data structure consisting of nested frozen=True dataclasses).

I don’t think () != [] has ever bitten me.

The only time it trips me up is in exploratory analysis where I go down a path like

foo = [...]
bar = [...]
foobar = list(zip(foo, bar))

# ... do stuff with foobar for a while ...

# wait, is the ordering what I think it is?
foo2, bar2 = zip(*foobar)
assert foo == foo2  # not equal because foo2 is a tuple
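A concrete version of that round trip, with placeholder data of my own, plus two ways to make the check pass:

```python
foo = [1, 2, 3]          # placeholder data
bar = ["a", "b", "c"]    # placeholder data
foobar = list(zip(foo, bar))

# Unzipping with zip(*...) yields tuples, not lists.
foo2, bar2 = zip(*foobar)
print(type(foo2).__name__)

print(foo == foo2)          # list vs tuple: never equal
print(foo == list(foo2))    # convert one side first
print(tuple(foo) == foo2)   # or convert the other side
```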

And a monad is a monoid in the category of explanations that sound extremely mathematical and complicated :)

For historical reasons. Search the Python-Dev mailing list archive (Mailman 3) for "PEP 218" to find the discussions.

And this looks like a remnant from Python 0.9.8-1.1.1 when it was true. The support of customization for user classes was added in Python 1.2, and coercion of 8-bit and Unicode strings was added in Python 1.6. Please open an issue.

Probably worth documenting that bytes-like objects can compare equal to each other even if they are of different types too:

>>> bytearray(b'') == memoryview(b'') == b''
True

That’s why I prefer the vaguer wording here. This is a basic explanation of the comparison operators, not an exhaustive list of every data type and how they behave. Saying that objects of different types “seldom” compare equal gives the right impression without being false (as is the current wording), but also without being overly verbose. I don’t think that either sets or bytestrings need to be explicitly called out here.

Yeah, but the wording “seldom” makes it unclear when exceptions are made. I would document the guiding principle instead, something like “objects of different types should not compare equal unless the types are conceptually compatible”.

Such verbiage would help cover collections.UserDict() == {} and other user types too.
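For reference, a quick check of how the `collections` wrapper types compare with the built-ins they mimic (a sketch with made-up values):

```python
from collections import UserDict, UserList

empty_eq = UserDict() == {}
empty_eq_reflected = {} == UserDict()
filled_eq = UserDict(a=1) == {"a": 1}
list_eq = UserList([1, 2]) == [1, 2]

print(empty_eq, empty_eq_reflected, filled_eq, list_eq)

# The types really are different: UserDict is not a dict subclass.
print(isinstance(UserDict(), dict))
```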