I posted recently about adding a map_binary function which could be used in combination with operator.eq.
It seemed that reactions were mixed: most people didn’t see the point of it, and felt it might be better to just provide an implementation of all_equal directly.
That does make a lot of sense, and I have done some initial work on an implementation. Initial testing shows that a C implementation of all_equal runs in a bit over half the time of the fastest Python implementation I know of, which is based on a short-circuiting use of itertools.groupby.
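For reference, a short-circuiting groupby-based version can be sketched roughly as follows. This is a minimal illustration of the technique, not necessarily the exact code I benchmarked against, and the function body is my own wording of the idea:

```python
from itertools import groupby


def all_equal(iterable):
    """Return True if every element equals the others (True for empty input)."""
    groups = groupby(iterable)
    # Skip the first run of equal elements, if there is one.
    next(groups, None)
    # A second run exists only if some element differs from the first.
    # groupby stops reading the input as soon as that element is found.
    return next(groups, None) is None


print(all_equal([1, 1, 1]))
print(all_equal([1, 1, 2]))
print(all_equal([]))
```

Because groupby is lazy, the second call to next() returns as soon as a differing element appears, so the rest of the input is never read.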
There is a Python implementation of an all_equal function in more-itertools. The fact that this exists already indicates that there are developers out there making use of such a function.
The problem with it is that it is slow compared to a C implementation. This is expected, because the existing Python implementation in more-itertools chains together several Python function calls. The current implementation also does not make use of short-circuit return logic, which means it would be expected to perform poorly on large inputs. There is an alternative short-circuiting version out there too, but I don’t think that version is being used by the more-itertools module. (Please do correct me if my research into this is wrong.)
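To make the short-circuiting point concrete, here is a small sketch that counts how many elements each style of implementation reads from its input. Both functions and the counting helper are hypothetical illustrations written for this post; neither is claimed to be the code in more-itertools:

```python
from itertools import groupby


def all_equal_full_scan(iterable):
    # Materialises every group first, so the whole input is always read.
    return len(list(groupby(iterable))) <= 1


def all_equal_short_circuit(iterable):
    # Stops as soon as a second group (a differing element) is found.
    groups = groupby(iterable)
    next(groups, None)
    return next(groups, None) is None


def counting(iterable, counter):
    # Hypothetical helper: yields items while recording how many were read.
    for item in iterable:
        counter[0] += 1
        yield item


# The mismatch is at the very start, followed by a long tail.
data = [0, 1] + [1] * 100_000

full_reads = [0]
short_reads = [0]
full_result = all_equal_full_scan(counting(data, full_reads))
short_result = all_equal_short_circuit(counting(data, short_reads))

print(full_result, full_reads[0])
print(short_result, short_reads[0])
```

Both versions return False here, but the full-scan version reads every element while the short-circuiting one stops after the first few.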
If you do an internet search for “python test if all elements are equal” or “python all equal”, you will find numerous forum threads where people are asking how to implement an all_equal function, because Python doesn’t currently provide one as part of the core Python install.
There is a very broad range of possible implementations which show up, many of which only work under certain conditions. (For example, leveraging numpy.) It is also not obvious which of these solutions perform well across a wide range of inputs (which is important to know). Some solutions are more readable but perform surprisingly poorly. Others are cryptic, and it is not at all obvious from a first inspection how they work.
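As an illustration of the kind of variety I mean, here are three approaches of the sort that turn up in those threads, with the conditions each one depends on noted in the comments. The function names are my own, chosen for this example:

```python
def all_equal_set(items):
    # Requires hashable elements, and always reads the whole input.
    return len(set(items)) <= 1


def all_equal_count(seq):
    # Requires a sequence supporting indexing, .count() and len();
    # the empty case has to be handled separately.
    return not seq or seq.count(seq[0]) == len(seq)


def all_equal_slices(seq):
    # Requires slicing, and copies nearly the whole sequence twice.
    # Not obvious at first glance: it compares each element with its neighbour.
    return seq[1:] == seq[:-1]


nested = [[1], [1], [1]]
print(all_equal_count(nested))
print(all_equal_slices(nested))
try:
    all_equal_set(nested)
except TypeError as exc:
    # Lists are unhashable, so the set-based version fails on this input.
    print("set-based version failed:", exc)
```

All three agree on simple lists of integers, but they differ in which inputs they accept and in how much work they do.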
None of the above is good, because we are asking developers to not only spend time searching for a solution, but to inspect many possible implementations and somehow pick a “good” one. This is not easy to do.
In my opinion we should really just provide the right tool for the job from the start, so that future users will not spend large amounts of their time trying to work out how to get this tool into their codebase.
It would be good to get some further feedback on this.
- If you agree, please just comment “I agree” or otherwise indicate approval with an upvote
- If you do not agree, please just comment “I do not agree”
- or better yet, explain your objections (presumably, if you do not agree, you have some reason to object to the proposal)
I don’t expect this to be a particularly interesting proposal to most people. If you have worked on a project where you would have found it useful to have an all_equal function, then you probably agree it would be good to add one.
If you haven’t worked on such a project, I would assume you are indifferent.
I can’t think of any disadvantages to adding it.