.sort() is only stable-sort if key is specified

shantanuojha · October 7, 2023, 6:52pm

So I was solving a problem and I found out sort is not stable-sort if key is not provided it will sort according to the second element if key is not provided

but according to documentation its always stable sort:

The `sort()` method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

code:

events = [(2, -1), (2, 1), (3, -1), (3, 1), (3, -1)]
events.sort()
print(events)

data = [('red', 1), ('blue', 2), ('red', 2), ('blue', 1)]
data.sort()
print(data)

data = [('red', 1), ('blue', 2), ('red', 2), ('blue', 1)]
data.sort(key=lambda x: x[0])
print(data)

output:

[(2, -1), (2, 1), (3, -1), (3, -1), (3, 1)]
[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]
[('blue', 2), ('blue', 1), ('red', 1), ('red', 2)]

so it will be stable sort only if i provied a key i didn’t know that. it’s not even mentioned in the documentation. It’s a feature if everyone knows that but it a bug for now

My Python version is:
Python 3.11.4

Rosuav · October 7, 2023, 7:15pm

That’s not actually what “stable” means here. A stable sort means that the elements will remain in their relative positions if they compare equal to each other. As an extreme example:

>>> data = [('red', 1), ('blue', 2), ('red', 2), ('blue', 1)]
>>> data.sort(key=lambda x: 42)
>>> data
[('red', 1), ('blue', 2), ('red', 2), ('blue', 1)]

With all the sort keys identical, nothing is reordered. But if no sort key function is provided, the comparisons between the elements themselves are used:

>>> ('blue', 2) < ('blue', 1)
False
>>> ('red', 1) < ('red', 2)
True

This is not a bug, and in fact, in languages that DON’T behave this way, it’s extremely annoying (JavaScript, I’m looking at you!), as I have to define an “obvious” key function to make sorting behave sanely.

kyle · October 7, 2023, 7:21pm

That’s what I thought by introducing the key it makes it explicit like not to check the second or the nth item in the tuple

shantanuojha · October 7, 2023, 7:44pm

thanks for explaining but many websites say that it only compares the first element that’s where i got this doubt from.

Rosuav · October 7, 2023, 7:46pm

Yes, exactly. If you don’t provide a key, the entire tuple is used for comparison. If you provide a key, only what’s in the key is used.

Then those web sites are wrong Can you link to them? Maybe they can be edited.

storchaka · October 7, 2023, 7:52pm

>>> data = [('red', 1), ('blue', 1.0), ('red', 1.0), ('blue', 1)]
>>> data.sort()
>>> data
[('blue', 1.0), ('blue', 1), ('red', 1), ('red', 1.0)]

The order of elements was changed (all blues are now before all reds), but the relative order of equal items was preserved.

tjreedy · October 7, 2023, 9:24pm

Stable sort means that items that compare equal in the sort maintain their same relative order. To verify, one must check item ids before and after the sort. (The id could either be id(item) or an identifying item attribute not used in the sort.) In the following,

>>> data = [['red', 1], ['blue', 1], ['red', 1], ['blue', 1]]
>>> [id(d) for d in data]
[2350059152064, 2350059152192, 2350016025280, 2350059152384]
>>> data.sort()
>>> data
[['blue', 1], ['blue', 1], ['red', 1], ['red', 1]]
>>> [id(d) for d in data]
[2350059152192, 2350059152384, 2350059152064, 2350016025280]

the two blue items and the two red items are in the same order before and after the sort. For a non-stable sort, the last line could have been [2350059152384, 2350059152192, 2350059152064, 2350016025280] with the two blue items reversed.

kyle · October 8, 2023, 6:30am

nice… makes much more sense on a deeper level

.sort() is only stable-sort if key is specified

The sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

code:

output:

The `sort()` method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).