Minmax function alongside min and max

@steven.daprano: Ok, let me show you an example:

>>> max((float("nan"), 0))
nan
>>> max((0, float("nan")))
0

It’s the same set of items; the only difference is their order. This is a bug. Period.
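For context, this order dependence is what a plain "keep the running best" loop produces. A minimal sketch, assuming max() only replaces its current candidate when a later item compares strictly greater; naive_max is a hypothetical name used for illustration, not CPython's actual code:

```python
def naive_max(iterable):
    """Return the largest item, replacing the candidate only on `item > best`."""
    it = iter(iterable)
    try:
        best = next(it)  # the first item becomes the initial candidate
    except StopIteration:
        raise ValueError("naive_max() arg is an empty iterable") from None
    for item in it:
        # Every comparison involving NaN is False, so a NaN candidate is
        # never replaced and a later NaN never replaces the candidate.
        if item > best:
            best = item
    return best


nan = float("nan")
first = naive_max((nan, 0))   # NaN is the initial candidate and stays
second = naive_max((0, nan))  # 0 is the initial candidate and stays
```

Under that assumption the result depends only on which item comes first, which matches the two REPL outputs above.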

@ruud Hmm, I don’t know if your change will always work. Essentially you’re testing equality, since vmax can’t be less than itself. But there are objects that do not support __eq__ in the usual way, like numpy.ndarray, whose == returns an array rather than a single boolean.

The only reliable test, IMHO, is to check that > and <= both return False.
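A minimal sketch of that test, assuming both operands implement > and <= and return plain booleans; is_unordered is a hypothetical helper name, not part of any proposed patch:

```python
def is_unordered(a, b):
    """True when neither `a > b` nor `a <= b` holds, i.e. the pair has no order."""
    return not (a > b) and not (a <= b)


nan = float("nan")
nan_vs_zero = is_unordered(nan, 0)  # both comparisons are False for NaN
one_vs_two = is_unordered(1, 2)     # 1 <= 2 holds, so the pair is ordered
```

For ordinary numbers exactly one of the two comparisons is true, so the helper only fires for values like NaN.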

Not true. This is the precise definition of “undefined behaviour”.

It’s not poop… It’s “made by the dog” :smiley:

Seriously, you can’t have undefined behavior just because the position of the elements in the iterable changes.

You can have undefined behavior only if all the objects in the iterable have undefined behavior with the comparison operators.

Otherwise, I suppose IMHO that this is a… bug.

No, that’s not how it works. We can just say that the output of any operation with NaN is undefined and get away with it. CPython is the reference implementation and we make the rules. You could argue that it’s a stupid rule, but that’s it.

This is not the case here: if NaN is not the first object, the operation is not undefined; it gives you the correct answer. If the operation gave undefined behavior with NaN, or any other object like it, at any position, I would agree with you. But it does not.

The point is: what does the coder expect? As far as I know, Python emphasizes simplicity and readability, and code that does things “like a human expects”. This is not the case here.

Furthermore, if Python publicly admitted in its documentation that max() and min() have “undefined behavior” only when the first element of the iterable lacks well-defined comparison operators, I think developers all over the world would laugh :smiley:

Marco: please could you keep the discussion respectful?

About sorted() and list.sort() and any other function like them, I think the current behavior is correct. Indeed, the result changes unpredictably wherever the NaN sits; it’s not only a problem when it’s at the start.

Maybe Python could implement total ordering by default for numbers? The problem is that total ordering requires NaNs to carry a payload. I suppose that Python does not store this information, so it could be a real problem to implement, and I don’t know how useful it would be in the real world.

IMHO min() and max() should ignore unorderable elements, and sorted() and list.sort() should raise a warning.

For sort and sorted, there’s some relevant recent discussion on the tracker (see issue 36095). The root cause of these issues really has nothing to do with sort, sorted, min or max; rather, it stems from the way comparisons on NaNs work. Changing those comparisons to implement a total order (or at least a total preorder), or to raise on comparisons involving NaNs, seems reasonable on the face of it. But that would involve significant breakage of existing code. It’s yet another of those situations where the answer to the question “Should we have done this differently?” is “Possibly, yes.”. But that’s not helpful, because the question actually facing us is “Should we change it now?”, and that’s a very different proposition.

@mdickinson I don’t think this change will break anything:

  1. raising a warning lets the code run. The coder will simply be informed of the problem. “Errors should never pass silently.”
  2. ignoring leading NaNs in max and min could fix some code instead :smiley: As I already said, coders who encountered the problem have probably removed the NaNs from the iterable, or used numpy. For those who never fixed it, it’s quite improbable that the code after max and min really expects NaN. So Python would probably fix that code silently.

Anyway, there’s an alternative.

Python could implement total ordering only partially:

  1. +NaN > number == True
  2. number > -NaN == True
  3. +NaN >= +NaN == False
  4. +NaN <= +NaN == False
  5. -NaN >= -NaN == False
  6. -NaN <= -NaN == False

Pro: sorting will no longer be undefined if NaN is present
Con: max will return NaN if present, at any position. min could return -NaN, but those are very rare.

In these cases, I think max and min could raise a warning.

Anyway, this would be a real change in Python and should be handled by a PEP, which I have no time to write :smiley:

@mdickinson: anyway, if you want my opinion, for what it’s worth, my check is simpler and works for any object that does not support ordering, not only for NaNs.

And IMHO it should be the default for max, min, sorted, list.sort, heapq, bisect, and all the others because, as I said, it’s quite improbable that this change will break something; on the contrary, it will silently fix a lot of old code.

If someone wants total ordering, a math.total_ordering can be implemented and passed as key for sorting.
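As a rough illustration of such a key, here is a sketch that orders -NaN below every number and +NaN above every number. The name total_order_key is hypothetical, and this is an assumption-laden simplification: it ignores NaN payloads and does not separate -0.0 from +0.0, so it is not the full IEEE 754 totalOrder predicate.

```python
import math


def total_order_key(x):
    """Hypothetical sort key: -NaN < every number < +NaN."""
    if math.isnan(x):
        # copysign reads the sign bit of the NaN: -1.0 or +1.0.
        return (math.copysign(1.0, x), 0.0)
    # Ordinary numbers share rank 0.0 and are ordered by their own value.
    return (0.0, x)


pos_nan = math.copysign(math.nan, 1.0)   # NaN with the sign bit cleared
neg_nan = math.copysign(math.nan, -1.0)  # NaN with the sign bit set
data = [pos_nan, 1.0, neg_nan, 0.0, -2.5]
ordered = sorted(data, key=total_order_key)
```

Because the key never compares a NaN directly, the sorted result no longer depends on where the NaNs started.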

My 2 cents.

@Marco_Sulla not only would this break the IEEE-754 NaN specification, this entire topic has nothing to do with adding minmax to Python. I respectfully ask you to open another thread if you want to continue the discussion on how NaNs should be handled in Python.

I forked the discussion here: https://discuss.python.org/t/2868

@ruud Anyway, if you also want to support default, I think that, before you return it, you should convert it to a tuple and check that its length is 2. Otherwise, an error should be raised.
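A minimal sketch of that check, under the assumption that minmax takes a keyword-only default used only for an empty iterable. This is a hypothetical signature for illustration, not ruud's actual implementation:

```python
_MISSING = object()  # sentinel so that None remains a usable default value


def minmax(iterable, *, default=_MISSING):
    """Return (smallest, largest) in one pass; `default` covers the empty case."""
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        if default is _MISSING:
            raise ValueError("minmax() arg is an empty iterable") from None
        # Validate the default only at the point where it is returned.
        result = tuple(default)
        if len(result) != 2:
            raise ValueError("default must contain exactly two items") from None
        return result
    for item in it:
        if item < lo:
            lo = item
        elif item > hi:
            hi = item
    return lo, hi
```

For example, minmax([3, 1, 2]) gives (1, 3), and minmax([], default=[0, 0]) gives (0, 0), while a one-item default raises ValueError.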

Obviously not as evident as you presume, from the reactions to your example.

Given the “two different placements of NaN” example, I would think it’s the fault of trying to find the minimum of results containing NaNs. I would be at fault for not accounting for NaNs. Ignoring them, or any other action, could depend on circumstance: who’s to say that an arbitrary resolution is the one needed?

It might be better still to raise an exception unless an option is given to state how NaNs should be treated; at least they won’t silently be ignored.

@Paddy3118 Please discuss this here:
https://discuss.python.org/t/2868

Marco, point us at official Python documentation that documents what min
and max are supposed to do when passed values that don’t provide a total
order, and if that documented behaviour is different from what the
functions actually do, then we will concede that it is a bug.

There are at least three different things that min(0, NAN) could do:

  • propagate the NAN (return a NAN)
  • ignore the NAN (return 0)
  • raise an exception

and no consensus on what it should do. At least half our users will
consider your version to be “buggy” since it doesn’t match their
expectations. You don’t get to unilaterally decide what is correct and
what is buggy.
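The three options above can be sketched as one function with an explicit policy argument. This is only an illustration: nan_aware_min and its nan_policy parameter are hypothetical names, not an existing or proposed API.

```python
import math


def _is_nan(value):
    """True only for float NaNs; other types are treated as ordinary values."""
    return isinstance(value, float) and math.isnan(value)


def nan_aware_min(values, nan_policy="propagate"):
    """Minimum with explicit NaN handling: 'propagate', 'ignore' or 'raise'."""
    if nan_policy not in ("propagate", "ignore", "raise"):
        raise ValueError(f"unknown nan_policy: {nan_policy!r}")
    values = list(values)
    if any(_is_nan(v) for v in values):
        if nan_policy == "propagate":
            return math.nan
        if nan_policy == "raise":
            raise ValueError("NaN encountered in input")
        # 'ignore': drop the NaNs and take the minimum of what is left.
        values = [v for v in values if not _is_nan(v)]
    return min(values)  # raises ValueError if nothing is left
```

Each policy gives a different answer for the same input, which is why picking one as the built-in behaviour would surprise the users who expected another.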

If you want to change this, you will need to write a PEP. You should
consider the two different versions of min/max that the IEEE-754
standard specifies, plus any other versions that others may desire
(such as a version which raises an exception).

It isn’t enough to specify the behaviour with NANs, there are an
infinite number of ways a data set can fail to provide a total order.
For instance, dominance hierarchies are often not a total order, for
example Rock Paper Scissors:

Rock > Scissors
Scissors > Paper
Paper > Rock

max(Rock, Paper, Scissors) should do what?
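To make the question concrete, here is a small sketch with a hypothetical Hand class whose > follows the three rules above. What max returns then depends on argument order; the results noted in the comments assume max keeps its candidate unless a later item compares greater.

```python
class Hand:
    """Hypothetical value type with a cyclic, non-transitive `>`."""

    # Each hand is "greater than" exactly the hand it beats.
    _beats = {"Rock": "Scissors", "Scissors": "Paper", "Paper": "Rock"}

    def __init__(self, name):
        self.name = name

    def __gt__(self, other):
        return self._beats[self.name] == other.name

    def __repr__(self):
        return self.name


rock, paper, scissors = Hand("Rock"), Hand("Paper"), Hand("Scissors")

a = max(rock, paper, scissors)  # Paper > Rock, then Scissors > Paper
b = max(paper, scissors, rock)  # Scissors > Paper, then Rock > Scissors
```

Here a is Scissors and b is Rock, even though both calls receive the same three hands, so no single answer can be called the maximum.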

Unless you are prepared to write a PEP, there’s no point in you
continuing this argument. There is no agreement on what the “correct”
behaviour is, so any version we provide will surprise some people. The
best we can do is explicitly document that the behaviour for data sets
that don’t define a total order will be implementation-defined and
therefore we make no promise about what will happen.

(The fact that min and max currently ignore NANs that aren’t in the
first position is an accident, not an intentional behaviour.)

Paul: “undefined behaviour” has special meaning to C programmers which
it doesn’t have to other language programmers. Given the special place C
has in the programming ecosystem, I prefer to honour their definition,
and use “implementation-defined behaviour” for what we are talking about
here.

@steven.daprano *sigh* please discuss it here… and please read first the posts:
https://discuss.python.org/t/2868

Thanks, that’s a good point. I do like the connotations of C’s “undefined behaviour” (in general, but in this discussion in particular) but if it’s not a familiar idea to people in general, it’s probably just confusing the discussion. I’ll keep this in mind.