Add operator support for mathematical equivalence

Statement of Purpose

This is a suggestion to support a new dunder method that tests for equivalence, but not necessarily equality, between two objects. The corresponding infix operator would be ~.

Example Use Case

Suppose I am defining a class to represent metric units, which users will initialize with a string representation of a given unit (e.g., 'meter', 'J / s', etc.). Further suppose that I want two instances to be equal if and only if their strings represent the same unit, but I want two unequal instances to be equivalent if their strings represent equivalent units. Following this logic, 'N' and 'newton' would be equal, while 'N' and 'kg * m / s^2' would be equivalent but not equal.

Here is a minimal implementation of the class:

class Unit:

    def __init__(self, arg: str) -> None:
        self.arg = arg

    def __eq__(self, other) -> bool:
        if isinstance(other, Unit):
            return self.arg == other.arg
        # ... more complex logic for cases like 'm' == 'meter'
        try:
            return self.arg == str(other)
        except TypeError:
            return False

    def __equiv__(self, other) -> bool:
        if isinstance(other, Unit):
            ...  # implementation of metric equivalence
        if isinstance(other, str):
            ...  # possibly alternate implementation of metric equivalence
        return False

Here are some usage examples:

>>> Unit('newton') == Unit('newton')
True
>>> Unit('newton') == Unit('N')
True
>>> Unit('newton') == Unit('kg * m / s^2')
False
>>> Unit('newton') ~ Unit('kg * m / s^2')
True

Possible Workarounds

It would certainly be possible to define a function that checks equivalence, as defined by the user, between two objects. It would also be possible to define an instance method or a class method to restrict the scope according to user needs. However, the proposed dunder method and corresponding infix operator would simplify and standardize these use cases.
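
A minimal sketch of the instance-method workaround might look like the following. The lookup table `_CANONICAL` and the method name `equiv` are hypothetical and only cover a handful of strings; a real implementation would need to parse unit expressions.

```python
# Hypothetical table mapping a few unit strings to a canonical base-unit form.
_CANONICAL = {
    'N': 'kg * m / s^2',
    'newton': 'kg * m / s^2',
    'kg * m / s^2': 'kg * m / s^2',
    'J': 'kg * m^2 / s^2',
}


class Unit:
    def __init__(self, arg: str) -> None:
        self.arg = arg

    def equiv(self, other) -> bool:
        """Return True if both units reduce to the same canonical form."""
        other_arg = other.arg if isinstance(other, Unit) else str(other)
        mine = _CANONICAL.get(self.arg)
        # Unknown units are never considered equivalent to anything.
        return mine is not None and mine == _CANONICAL.get(other_arg)


print(Unit('newton').equiv(Unit('kg * m / s^2')))  # True
print(Unit('newton').equiv('J'))                   # False
```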

Relationship to Notational Conventions

The symbol ~, when used between two operands, can have one of multiple meanings. It may be used, formally or informally, to indicate that

  1. an equivalence relation exists between two sets
  2. two functions are asymptotically equivalent
  3. two triangles are similar
  4. a stochastic variable has a given probability distribution or that two stochastic variables have the same probability distribution
  5. two graph nodes are adjacent to each other (i.e., are connected by an edge)
  6. the numerical value of a physical quantity has a given order of magnitude or two physical quantities have the same order of magnitude
  7. two numbers are approximately equal (e.g., a \sim b to mean a \approx b)
  8. a quantity is proportional to another quantity (e.g., y \sim x^2 to mean y \propto x^2)

(cf. Wikipedia’s Glossary of Mathematical Symbols, Wikipedia/Tilde, and MathWorld/Tilde)

Of these uses

  • 1, 2, 3, and 4 represent an explicit or implicit notion of rigorous equivalence between two objects, and thus support the proposed use of ~;
  • 5 represents a formal notion of connection independent of equivalence (or equality) that could lead to potential confusion;
  • 6 represents an informal notion of approximation that is weaker than “approximately equal to”;
  • 7 and 8 are abuses of notation.

Similarity to Existing Operators

The token ~ is already in use as a unary operator, which may be implemented on objects by defining __invert__. However, this should not cause confusion, since + and - are each used for a unary (corresponding to __pos__ and __neg__) and a binary operator (corresponding to __add__ and __sub__).

Similarity to Previous Proposals

I could not find any similar previous Python Ideas proposals by searching for “__equiv__” or “equivalence”, and internet searches for “python equivalence operator” (and variations) only turned up discussions about equality versus identity. Of course, I would love to know if I missed something!

I like this!

The only thing I’m not sure about is the __equiv__ name, as it’s a bit too close to __eq__ for my taste. Perhaps it would be worth considering an alternative, such as __sim__, inspired by \sim (∼) from TeX.

Will it be commutative like ==, or are you also proposing “reflected” __requiv__/__rsim__ and “augmented” __iequiv__/__isim__ methods?

Also, will there be any restrictions on the return type, like for example __contains__ (which must return bool), or will it be more like __gt__ (which can return anything, e.g. np.bool in numpy)?
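
To make the distinction concrete, here is a small sketch of how the two existing behaviours differ; the `Demo` class is purely illustrative. The `in` operator converts whatever `__contains__` returns to a bool, whereas `>` hands back the result of `__gt__` unchanged:

```python
class Demo:
    def __contains__(self, item):
        # Returns a non-bool; the `in` operator converts it with bool().
        return 'yes'

    def __gt__(self, other):
        # Returned unchanged by the `>` operator.
        return 'yes'


d = Demo()
print(1 in d)  # True
print(d > 1)   # yes
```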

This seems to me like it’s too niche to warrant a dedicated operator. Especially one that won’t be used in the stdlib, but is only for 3rd party use.

For a similar proposal, look at PEP 465 which added the @ (matrix multiplication) operator. It’s a very similar idea, but unlike this proposal, matrix multiplication is an extremely common operation in scientific programming, and it had an immediate use in a very major library (numpy). Even with that, it took many years to get accepted.

If you want to proceed with this proposal, I strongly suggest you research the history of PEP 465, and in particular understand the hurdles it had to overcome, and how it managed to do so.

I don’t think it will be useful, and it can even be an issue if you want multiple equivalence relations. More generally, x \sim y can also be regarded as a shorthand for x \equiv y \pmod{\mathcal{R}}, that is, x and y are equal under the relation \mathcal{R}. Since there is only one __sim__, I can only define it for a single \mathcal{R}, so I think it would be better to have a separate function instead of a dunder method. The same could be said for __lt__ & co, but in general, the objects are ordered according to a chosen order (and on-the-fly ordering is usually addressed via sorted(..., key=...)). OTOH, if I want to represent integers modulo n, I’d need to have separate classes for each of them in order to specialize int.__sim__.
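
A separate function makes the choice of relation explicit at the call site. A minimal sketch for the integers-modulo-n case (the name `congruent` is just illustrative):

```python
def congruent(a: int, b: int, n: int) -> bool:
    """Return True if a and b are congruent modulo n."""
    return (a - b) % n == 0


# The same pair of integers is equivalent under one relation but not another.
print(congruent(7, 2, 5))  # True
print(congruent(7, 2, 3))  # False
```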

Will it be commutative like ==, or are you also proposing “reflected” __requiv__/__rsim__ and “augmented” __iequiv__/__isim__ methods?

Equivalence relations are, by definition, reflexive, symmetric and transitive (equality is an equivalence relation). That is, a \sim a, a \sim b \iff b \sim a and a \sim b \wedge b \sim c \Rightarrow a \sim c. In addition, I don’t see what augmented would mean in this case, as an equivalence relation is simply a relation, so there is no arithmetic involved at the end (it’s the same as < for instance). The return type could be anything but it should be interpreted in a boolean context.


two numbers are approximately equal (e.g., a ∼ b to mean a ≈ b)

FTR, this relation is not transitive, and thus is not an equivalence relation. Indeed, consider a sequence (a_i)_{i\geq0} with a_i = i\varepsilon for a small constant \varepsilon>0. Then, each term is approximately equal to its neighbour, but a_0 is very different from a_n with n \geq 1/\varepsilon.
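
A quick numerical sketch of that counterexample, using math.isclose with an absolute tolerance (the values of eps and the tolerance are arbitrary choices for illustration):

```python
import math

eps = 0.01
a = [i * eps for i in range(201)]  # a_i = i * eps, up to a_200 = 2.0

# Each term is within tolerance of its neighbour...
neighbours_close = all(
    math.isclose(a[i], a[i + 1], rel_tol=0.0, abs_tol=1.5 * eps)
    for i in range(200)
)
# ...but the first and last terms are not within tolerance of each other.
ends_close = math.isclose(a[0], a[-1], rel_tol=0.0, abs_tol=1.5 * eps)

print(neighbours_close)  # True
print(ends_close)        # False
```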

The ~ is already used for something else.

The confusion might come from the expectation that a~b returns something of the same type as ~b, or as a~(~b). That expectation is suggested by a-b, -b and a-(-b) returning the same type in many cases.

This is a very good point that I had not considered.

Yup. An abuse of notation all the way.

That’s a fair point. The similar method names would be potentially confusing. The correspondence between method name and TeX is worsened by the fact that \equiv produces ≡. I wouldn’t mind the idea of __sim__.

I defer to @picnixz’s reply on this.

Thanks for the suggestion! I agree that my proposed operator is most similar in spirit to @ and __matmul__ and siblings.

I’m not so sure. The need for a different equivalence relation recurs, e.g., in the context of computer algebra.

Mathematica has two such operators: == and ===. In SymPy/Diofant, the equality operator (==) tests structural equivalence (same expression trees), while the equals() method tests mathematical equivalence:

>>> from sympy import symbols
>>> x = symbols('x')
>>> e1 = (x + 1)**2
>>> e2 = x**2 + 2*x + 1
>>> e1 == e2
False
>>> e1.equals(e2)
True

Maybe it would be less surprising (this is mentioned, e.g., in the Gotchas section of the Diofant tutorial, ditto for SymPy) to use === (BTW, @myoung-space-science, this is my proposal for the infix operator symbol) in the first case and == in the latter. This is an example outside of the stdlib.

Same story, but in the stdlib: how do we check whether two floating-point numbers are the same in the sense of having the same internal representation? We have:

>>> from math import nan
>>> nan != nan
True
>>> 0.0 == -0.0
True

The current workaround is to 1) make separate checks for NaNs and 2) use math.copysign() to check the signs of zeros.
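
A rough sketch of that workaround, wrapped in a hypothetical helper named `same_float`. Note that it treats all NaNs as the same and ignores NaN payload bits:

```python
import math


def same_float(a: float, b: float) -> bool:
    """Treat NaNs as equal to each other and distinguish 0.0 from -0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    # copysign exposes the sign bit, which == ignores for zeros.
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


print(same_float(math.nan, math.nan))  # True
print(same_float(0.0, -0.0))           # False
print(same_float(1.5, 1.5))            # True
```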

By actually comparing their internal representations when that’s what you care about:

>>> import math
>>> (0.0).hex() == (-0.0).hex()
False
>>> math.nan.hex() == math.nan.hex()
True

With a bit more care when your floats come from external sources.

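import struct

# Two single-precision NaNs with different payload bits
# (byte strings assume a little-endian machine)
x = struct.unpack('f', b'\x00\x00\xc0\x7f')[0]
y = struct.unpack('f', b'\x01\x00\xc0\x7f')[0]

# hex() gives the same string for both...
assert x.hex() == y.hex()

# ...but packing them again shows that the bytes differ
assert struct.pack('f', x) != struct.pack('f', y)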

Do you mean using === where I’m proposing to use ~? I could certainly see the value of === because it is not already in use (whereas ~ exists as a unary operator). My concern is that === seems like it should imply a stronger constraint than ==.

Equivalent equations, like those in the Diofant docs, are another example of what I’m hoping to address with an equivalence operator.

I agree - especially when JavaScript already uses === for exactly this (a stronger equality operator).

Personally, I don’t think that a ~ b is unclear at all; surely anyone who knows that the bitwise inversion operator even exists should realise it’s a unary operator and isn’t relevant here.

Yes.

BTW, another use case for === in the stdlib: taking types into account for equality (like JS). E.g. 1.0 == 1 is True, but 1.0 === 1 would be False.

I think that this case is worth a dedicated operator, while the need for “just yet another equivalence operator” seems less obvious. Is there any language that has something like this?

Please, no. There’s nothing wrong with type(a) == type(b) and a == b. We really don’t want to end up with the mess that JavaScript developers have to deal with.


Except it doesn’t work for nans or for numpy arrays.
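
A small sketch of the NaN case, with the check wrapped in a hypothetical helper named `typed_eq`. (For NumPy arrays, a == b is elementwise, so the result is an array rather than a single bool.)

```python
import math


def typed_eq(a, b):
    # The check suggested above: same type and equal by ==.
    return type(a) == type(b) and a == b


print(typed_eq(1.0, 1))                # False: float vs int
print(typed_eq(1.0, 1.0))              # True
print(typed_eq(math.nan, math.nan))    # False: NaN never compares equal
```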

It would be very useful to have a version of equality checking that checks for precisely equal representations and is not overloaded to have any other meaning. Too late to change now but ideally that would be the version of equality that is used for hashing, dicts, etc.

My thought, too, though I admit that I hadn’t considered @franklinvp’s point about a user expecting a ~ b to return something of the same type as ~a based on the fact that a - b returns something of the same type as -a.

I really don’t want to start having to explain to people when to use == vs ===. It’s bad enough explaining == vs is.

I’m not seeing anything in here to make me think that there’s anything wrong with classes that have some other potential meaning(s – yes plural) of equivalence having their own method for it (and/or, in some cases, defining a normalisation method whose output can be used as dict keys). I can’t imagine any meaningful duck typing/interoperability, so it wouldn’t even matter if equivalency methods were all given mismatching names.
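
As a sketch of the normalisation approach, here is a hypothetical `Ratio` class (not from any library; it assumes a non-zero denominator) whose == stays structural while `normalised()` gives a canonical key shared by equivalent values:

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class Ratio:
    num: int
    den: int  # assumed non-zero

    def normalised(self) -> tuple:
        """Return a canonical (num, den) pair; equivalent ratios share it."""
        g = gcd(self.num, self.den)
        sign = -1 if self.den < 0 else 1
        return (sign * self.num // g, sign * self.den // g)


a, b = Ratio(2, 4), Ratio(1, 2)
print(a == b)                            # False: structural equality
print(a.normalised() == b.normalised())  # True: same canonical form

# The canonical form works as a dict key for grouping equivalent values.
counts = {}
for r in (a, b, Ratio(3, 6)):
    key = r.normalised()
    counts[key] = counts.get(key, 0) + 1
print(counts)  # {(1, 2): 3}
```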

How about using ~= instead? Reads intuitively like ≅ : APPROXIMATELY EQUAL TO (U+2245)

Unlike a plain ~, the equal sign in the operator helps make it clear that a Boolean value is expected from the operation.

The proposal is about an operator that, like @, is meant for domain-specific use. People in the specific domains are going to intuitively understand what it means because of the similarity to the notation they’re already using for the operation outside of programming. It isn’t going to be like === and is, which are meant for general cross-type comparisons.

I like the idea of ~=, but I would still want it to be read as “equivalent to” or even “similar to”, rather than as “approximately equal to.”

I can definitely see that. It also leaves room for still using ~ as an infix operator down the road.

What constitutes “precisely equal representations” though? Consider:

import sys

x, y = [], []
x.extend(range(4))
y.extend(range(5)); y.pop()
assert x == y
assert sys.getsizeof(x) != sys.getsizeof(y)

In all meaningful ways, these two lists have the same contents. Yet they take up different amounts of space, which means they can’t possibly have precisely equal representations of those same contents.

I don’t think there’s a lot of point having an operator that looks at in-memory representations. That’s the job of ctypes or something, not normal code.
