Terry already linked to listsort.txt
, which was intended to be my last word on the topic
.
As it says, the disadvantage of merge sorts is that they do much more data movement than members of the quicksort family (of which pdqsort is one). When comparisons are cheap (as for native machine ints or floats on modern boxes), saving moving a memory word may be as valuable as saving a comparison. Quicksort variants shine then - quicksort is stellar at reducing memory movement.
But, in Python, and especially at the time the current sort was written, comparisons are enormously more expensive than moving an object pointer in memory. That’s where merge sorts shine: they get very much closer to doing a minimal number of compares than quicksorts get. See listsort.txt
for brief quantification of that. But they do a lot more data movement.
There is no reason in general that stable vs non-stable is important to speed. That’s not really the issue. But it can be, in some cases. For example, consider this list:
xs = [2] + [1] * 1_000_000_000
That is, 2
followed by a billion 1
s. A sufficiently clever non-stable sort could just swap the endpoints to get the list in sorted order.
But that violates stability (the 1
at the end moves to the start). A stable sort cannot do better than moving all billion plus one elements, rotating the entire list one place to the left.
I still pay attention to developments in sorting, and I’ve seen nothing yet that would plausibly be an improvement over Python’s general-purpose sort for CPython’s needs.
For specific data types, sure. There’s no end of ways to exploit regularities. But the only thing Python’s sort requires of objects is that they implement __lt__()
. It makes no assumptions of any kind about how the objects are stored, are represented, or what they “mean”.
If you’re always sorting lists of floats, by all means, use numpy
’s sort instead.
Or if you’re always sorting lists of strings, there are faster ways to sort those too (although, offhand, I don’t know of an extension module geared to that specific case).
The builtin sort works for any types that implement a sane __lt__()
, and strives to invoke that method as seldom as possible for common, real-life cases.
I have no doubt that pdqsort can be faster for some specific data types, but also no doubt that it wouldn’t be a suitable replacement for CPython’s current sort (because it’s not stable, and because quicksort variants do more comparisons, which are still significantly slower than pointer copies).