Allow `sorted` to take list for `key` argument

So wrap the two parts:

perm = generate_permutation(keys)
sorted_list = apply_permutation(perm, lst)

I’m not sure how much performance you can squeeze out of writing {generate,apply}_permutation in C[1], but my point is that I’d rather have the clarity of explicitly stating the two operations over any performance benefit of merging them. But I concede that I don’t have a need for really high-performance sorting - maybe there’s a use case I’m not seeing.


  1. or by clever tricks in Python ↩︎
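For reference, here is a pure-Python sketch of the two helpers (the names come from the post above; a real implementation might well live in C):

```python
def generate_permutation(keys):
    # Indices that would sort `keys` (an "argsort"); stable, O(n log n).
    return sorted(range(len(keys)), key=keys.__getitem__)

def apply_permutation(perm, lst):
    # Reorder `lst` by the index list `perm`; O(n).
    return [lst[i] for i in perm]

keys = [3, 1, 2]
lst = ['a', 'b', 'c']
perm = generate_permutation(keys)           # [1, 2, 0]
sorted_list = apply_permutation(perm, lst)  # ['b', 'c', 'a']
```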

Very insightful :slight_smile:

This is the neatest solution for coupled-sequences sorting I have seen.

It could be

X.sort(keys=Y)

And it also provides a neat way right now for the argsort

sorted(range(3), key=partial(next, iter([1,0,2])))  # [1, 0, 2]

Which could be

sorted(range(3), keys=[1,0,2])

To push things further, there could be a special case:

idxs = sorted(range, values)  # Special case: automatically infer range length from values

then sorting multiple lists would be as simple as

values2.sort(keys=idxs)
values3.sort(keys=idxs)

This fills all the cases: inplace sorting, argsort, coupled zipsort.

If someone needs these often then this can be easily done. Even now.
But it is out of scope of this specific proposal.

This proposal just exposes a bit of additional API which cuts out unnecessary operations that slow things down or feel hacky.
It is convenient for some cases as it is, and it also provides a more performant path for implementing more complex ones (or the above).

The API should just be tailored to do the above neatly while properly covering the intent.

If we were starting from scratch, then the above could be considered as the primary design, but things are quite settled now around the list.sort API.
And if tailoring 2 functions for the 2 steps is desirable, then I think it should either be a different proposal or done by the user.
In this concrete situation it is one level higher.

With this one can implement:

def generate_permutation(keys):
    return sorted(range(len(keys)), keylist=keys)

and be sure of top performance.

The 2nd part has variations depending on what sort of result one wants to get back.
And depending on that there are different solutions - I don’t even know which is the “one best way” for many of them.

Also, it matters how many times apply_permutation is going to be used.
If more than once, then making a wrapper might be sub-optimal.
E.g. for tuple result:

reorderer = operator.itemgetter(*indices)

This will most likely be the optimal solution.
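For example, one itemgetter can be reused across any number of parallel lists (note that it returns a tuple, and a bare scalar when given a single index):

```python
from operator import itemgetter

keys = [30, 10, 20]
indices = sorted(range(len(keys)), key=keys.__getitem__)  # [1, 2, 0]

# One reorderer, reused across several parallel lists.
reorderer = itemgetter(*indices)
sorted_keys = list(reorderer(keys))             # [10, 20, 30]
sorted_vals = list(reorderer(['a', 'b', 'c']))  # ['b', 'c', 'a']
```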


Also, when there are only 2 sequences - 1 which is sorted and another which is reordered accordingly - doing it in 1 step:

values.sort(keylist=keys)

is able to enjoy performance which is not possible to achieve via:

perm = generate_permutation(keys)
keys = apply_permutation(perm, keys)
values = apply_permutation(perm, values)

So again, this is a lower level building block:

  1. If you need top performance - it is exposed.
  2. If you need convenient solution tailored to your case - you can make wrappers that will also enjoy performance gain from this.

It relies on the key function being called for each element of the list to be sorted, in order. But that isn’t guaranteed (although it’s hard to imagine an implementation that wouldn’t work that way) - all that’s guaranteed is that the key for each item will be calculated once.
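For what it's worth, the left-to-right, once-per-item behavior is easy to observe in current CPython (an implementation detail, not a language guarantee):

```python
calls = []

def spy(x):
    calls.append(x)  # record the order in which keys are requested
    return x

result = sorted([30, 10, 20], key=spy)
# In current CPython, calls == [30, 10, 20]: one call per item, left to right.
# Other implementations are only required to call the key once per item.
```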

4 Likes

I agree, argsort is very useful in numpy. I found 5 uses in one of my codebases, and the sorted index was used for up to 3 additional arrays.
For example, here I need to get both the frequency and amplitude of at most n_max higher peaks in a signal spectrum:

index_sorted_by_ampl = np.argsort(np.abs(peaks_ampl))
if len(index_sorted_by_ampl) > n_max:
    index_sorted_by_ampl = index_sorted_by_ampl[:n_max]
peaks_freq = peaks_freq[index_sorted_by_ampl]
peaks_ampl = peaks_db[index_sorted_by_ampl]

It helps that numpy has a syntax for indexing with an array that Python lists don’t have, so that I can write peaks_freq[index_sorted_by_ampl] instead of [peaks_freq[i] for i in index_sorted_by_ampl].

2 Likes

Having worked in a language that has the exact feature of “sort all of these into this order”, I can say that it’s not really something that comes up very often.

That said, though, I think one plausible answer here could be to maintain the mutate/copy distinction by having sorted(stuff, keys=otherstuff) copy both stuff and otherstuff before sorting.

Yeah that’s the main reason why I called it a hack. (The other being the abuse of next’s default parameter.)

1 Like

(Side note: the len check can be skipped because numpy does not raise index errors on slicing)

This is a standard basis for sorting operations (on my side at least).
Here, a permutation is created and then calling __getitem__ with the permutation actually applies it, which Python lists do not allow natively.

This is the concept captured by this proposal

which allows maximum flexibility…
But the following is also true regarding performance

because applying the permutation will be a get operation on every index…

→ The idea which is clear from numpy usage is that permutations are the core of flexibility of the operations.
→ On the pure python side it is clear that applying permutations sequentially is not the optimal way.

Things would need to be pushed further if you actually want the full argsort power akin to numpy in pure Python… But thinking further, a permutation has some properties that could make it more optimizable than a list of integers (though less than a range, and even less than a slice), which is where we land on

which is OT here, and might be an apply_permutation builtin function, or, bolder, a new builtin permutation, usable like a slice, fully optimized in creation, memory, and application.
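Purely to illustrate the idea - no such type exists, and everything below (the class name, its methods) is a hypothetical sketch of what a reusable permutation object might look like:

```python
class Permutation:
    """Hypothetical reusable permutation: compute the order once, apply in O(n)."""

    def __init__(self, indices):
        self.indices = list(indices)

    @classmethod
    def argsort(cls, keys):
        # Build the permutation that would sort `keys`.
        return cls(sorted(range(len(keys)), key=keys.__getitem__))

    def __call__(self, seq):
        # Apply the permutation to any sequence of the same length.
        return [seq[i] for i in self.indices]

    def inverse(self):
        # inv[self.indices[pos]] == pos, so inverse() undoes __call__.
        inv = [0] * len(self.indices)
        for pos, i in enumerate(self.indices):
            inv[i] = pos
        return Permutation(inv)

p = Permutation.argsort([3, 1, 2])  # indices [1, 2, 0]
p(['a', 'b', 'c'])                  # ['b', 'c', 'a']
p([10, 20, 30])                     # [20, 30, 10]
```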


If I wish to remain On-Topic, all I have left to say is that

keys, values = sorted(keys), sorted(values, keys=keys)

or

idxs = sorted(range(len(keys)), keys=keys)
values1.sort(keys=idxs)
values2.sort(keys=idxs)
keys.sort()

(while suboptimal), will absolutely make sorting syntax more comfortable.


1 Like

About sorting “more than one” list the same way, there’s a nearly obvious use case that, for what will turn out to be obvious reasons, rarely applies.

In Python we typically represent 2D data as a list of rows (where each row is a list of column entries). And it’s common to want to sort the rows by a specific column. That’s easily done now via

key=operator.itemgetter(column_index)

But what if your data is actually a list of columns (where each column is a list of row entries)? Then it’s “a problem”. It’s trivial then to sort the column of interest, but a puzzle how to permute all the other columns the same way.

That’s why zip shows up so frequently in “hard cases”: for a 2D list of lists representation, zip(*the_list) transposes the data, switching between “list of rows” and “list of columns” formats.

But we very rarely work with “list of columns” representations to begin with.
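Concretely, sorting a "list of columns" by one column can be done today by transposing, sorting the rows, and transposing back:

```python
from operator import itemgetter

# Data as a list of columns.
cols = [[3, 1, 2],              # column 0: sort everything by this one
        ['a', 'b', 'c'],
        [True, False, True]]

rows = list(zip(*cols))                   # transpose to a list of rows
rows.sort(key=itemgetter(0))              # sort rows by column 0
cols = [list(col) for col in zip(*rows)]  # transpose back to columns
# cols == [[1, 2, 3], ['b', 'c', 'a'], [False, True, True]]
```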

2 Likes

Yes. “Practicality beats purity”, and it’s also of value that .sort() and sorted() have the same signature. The AI was right to point out the possibility for confusion on this point, but it’s up to us to judge the tradeoffs :smile:.

It’s “hyper-generalization” to me, YAGNI complication to no particularly good end, a tail threatening to strangle the dog :wink:.

100%!

OK, I did. But reluctantly. In fact I find the Pike approach is more elegant and natural. It’s context here that matters, because Python took a different direction decades ago, and can’t be said to be in any sense “broken”. The API focused from the start on sorting a single list. That passing a fixed keylist would sort “two for one” is a happy consequence of the obvious implementation, but not the primary point of the change.

Thank you :slight_smile: I like getting a bit of stats. (even if that might not be the primary indicator)

Ok, let’s look at the current situation…

Setup
N = 100_000

import random
from operator import itemgetter
keys = random.sample(range(N), N)
columns = [list(range(N)) for _ in range(100)]

def sort_zip(ncols):
    if ncols == 1:
        col = columns[0]
        pairs = sorted(zip(keys, col), key=itemgetter(0))
        new_keys, new_col = map(list, zip(*pairs))
    else:
        keys_rows = zip(keys, *columns)
        new_keys_rows = sorted(keys_rows, key=itemgetter(0))
        new_keys, *new_cols = map(list, zip(*new_keys_rows))

def sort_new_or_zip(ncols):
    new_keys = list(keys)
    if ncols == 1:
        new_col = sorted(columns[0], key=new_keys)
    else:
        new_rows = sorted(zip(*columns), key=new_keys)
        new_cols = list(map(list, zip(*new_rows)))


def sort_itemgetter(ncols):
    idxs = sorted(range(N), key=keys.__getitem__)
    reorderer = itemgetter(*idxs)
    new_keys = list(reorderer(keys))
    new_cols = list(map(list, map(reorderer, columns[:ncols])))


def sort_itemgetter_new(ncols):
    new_keys = list(keys)
    idxs = sorted(range(N), key=new_keys)
    reorderer = itemgetter(*idxs)
    new_cols = list(map(list, map(reorderer, columns[:ncols])))
./python.exe -m timeit -s "$S" "sort_zip(30)"
./python.exe -m timeit -s "$S" "sort_new_or_zip(30)"
./python.exe -m timeit -s "$S" "sort_itemgetter(30)"
./python.exe -m timeit -s "$S" "sort_itemgetter_new(30)"

#         ncols       |  1 |  2  |   3 |   30 |
# ------------------- |----|-----|-----|------|
# sort_zip            | 70 | 150 | 150 | 3600 |
# sort_new_or_zip     | 20 | 130 | 150 | 3600 |
# sort_itemgetter     | 50 |  65 |  75 |  435 |
# sort_itemgetter_new | 35 |  50 |  60 |  420 |

So:
1a. If the key list itself does not need to be sorted, this provides a performance benefit regardless of whether the keylist gets mutated or not. (Clearly visible in the benchmarks of the main thread)
1b. Mutating the keylist provides a very fast path to sort 1 key list and reorder 1 value list (e.g. dict sorting).
2. If there is more than 1 value list, then operator.itemgetter is the way to go. This still provides a performance benefit, which decreases as the number of value lists grows (as more and more time is spent on reordering while the sorting step stays constant).



What would the code look like?

Scenario 1. keylist IS mutated

(1a) sort values, but do NOT sort keys

sorted(values, keylist=list(keys))

(1b) sort both values and keys

sorted(values, keylist=keys)

(2a) sort many values, but DO NOT sort keys

idxs = sorted(range(N), keylist=list(keys))
# reorder values with itemgetter

(2b) sort many values and keys as well

idxs = sorted(range(N), keylist=keys)
# reorder values with itemgetter

Scenario 2. keylist is NOT mutated

(1a) sort values, but DO NOT sort keys

sorted(values, keylist=keys)

(1b) sort both values and keys

# if `keylist` is NOT mutated, this is not available. Use slower (2b)

(2a) sort many values, but DO NOT sort keys

idxs = sorted(range(N), keylist=keys)
# reorder values with itemgetter

(2b) sort many values and keys as well

idxs = sorted(range(N), keylist=keys)   # if `keylist` is NOT mutated
# reorder keys with itemgetter
# reorder values with itemgetter

Tradeoffs:

  1. Scenario 1 mutates the keylist - without good documentation some might trip over this too often.
  2. Scenario 1 has 1 extra path for maximum performance, in the case where both 1 key list and 1 value list need to be sorted based on the keys.
  3. Aggregate verbosity is approximately the same for both scenarios (scenario 2 does not have (1b), but its (2b) is more verbose).

Path Forward

If the non-mutating keylist is preferred, then the problem is solved - there is not much to discuss.

However, I think dict sorting (the 1b case) is quite fundamental, and a 2.5x faster result (compared to 1.4x in the non-mutating case) is quite compelling. Also, mutation provides extra performance for (2b), which can be felt for a small number of value lists.

Thus, if performance benefits justify keylist mutation, then I see several paths:

  1. Just do the mutating keylist in list.sort(keylist=keys) / sorted(values, keylist=keys) and focus on documentation and education.
  2. Mutate for values.sort(keylist=keys), but do NOT mutate for sorted(values, keylist=keys). In a certain way it could be less confusing to remember that list.sort mutates everything and sorted mutates nothing. However, this brings inconsistency in another dimension - different treatment of the same argument by 2 different functions - which still confuses people, just in a different way.
  3. Let a bit of Pike into Python and do keys.sort(parallel=values) and sorted(keys, parallel=values). This way, it is perfectly intuitive that the list which is being sorted in parallel is being mutated. In terms of performance, there is no loss here.
1 Like

This is confused by the fact that sorted(parallel=) doesn’t even name two lists - and it’s muddy enough to me that I’m not sure what you intended to type instead.

The word “parallel” on its own doesn’t sit all that well with me either. I know exactly at first sight what an option name containing the substring “key” probably intends to mean. What could “parallel” possibly mean? “Concurrent processing” of some kind is my first guess. “Oh, it’s intended to be the keys for sorting list” “OK, then why not call it something containing the substring ‘key’?” :wink:

I don’t really care whether sorted() does or doesn’t mutate its keylist argument, but my initial use case for that would run faster if it does mutate. I have in mind getting a list of a dict’s keys sorted by the dict’s values.

sorted(d, keylist=list(d.values()))

does the trick either way, but is faster if sorted() doesn’t have to go on to make a (redundant!) copy of the keylist just to avoid mutating the input.

Either way it’s bound to be faster than the current:

sorted(d, key=d.__getitem__)

because no dict lookups would be needed anymore.

You could say that my preferred spelling is a case for making keylist accept an iterable argument, but I wouldn’t :wink:. I have no problem with making an explicit list myself, and am drawing a blank when trying to contrive a plausible case where it would matter if the keylist got mutated.

2 Likes

Added keys and values in their appropriate places - hope it is clearer now.

parallel is “list to reorder in parallel”

The problem gets flipped a bit:

# From:
sorted(d, keylist=list(d.values()))

# To:
dict_keys_to_sort = list(d)
sorted(d.values(), parallel=dict_keys_to_sort)

As of now, I am in 2 minds about this.

One thing I am sure about is that I am strongly leaning towards doing mutating variant.
Otherwise it feels like unfinished business.

1 Like

While I’m solidly of one mind as of now. sorted(A) returns a permuted version of A, but sorted(A, parallel=B) returns a permuted version of B?! That’s … well, there’s no nice way to say it :wink:

sorted(A, keylist=B) returns a permuted version of A, and so does sorted(A). An optional argument shouldn’t fundamentally change what a function returns, just a variant of what it returns.

One thing I am sure about is that I am strongly leaning towards doing mutating variant.

If someone presented a plausible use case where mutating the key list would hurt, I missed it. Certainly one can be contrived, hence why I said “plausible” instead.

As in my dict example, the plausible sorted() use cases I have pass a “throwaway” key list, one that isn’t even reachable after the call to sorted() returns.

I am with you by now.
parallel is too far from how things currently are.

There are some, but they are not common.
One case that might seem to occur often is:

sorted(values1, keylist=keys)
sorted(values2, keylist=keys)
sorted(values3, keylist=keys)
...

But it is easily refuted as: “Why would one call 3 x O(nlogn), when O(nlogn) + 2 x O(n) is on the table?”

I don’t think there is more to it than:

keys = [3, 2, 1]
values = ['a', 'b', 'c']
sorted(values, keylist=keys)
# For some reason I need original keys now
# Maybe debugging
# Maybe keys is part of `json`, which I am modifying and will save in the end, but keys should remain unmodified

Personally, I think remembering mutation is manageable.
I think at least a big majority of people will read the docs at least once, and if it is properly emphasised, it should not cause issues.

However, if there are some real-life examples where this would cause issues, we could think about how to address them.
Personally, I am not a big fan of either of these (but I like neat APIs, so that's not surprising):

  1. sorted(values, keylist=None, mutate_keylist=False) - big bloat
  2. sorted(values, mutable_keylist=None) - smaller bloat, but still an odd one out given the current API.

Might seem to. But I don’t think it does.

2 Likes

This is better than I thought.

import random as rand
from functools import partial
from operator import itemgetter

N = 100_000
keys = rand.sample(range(N), N)
values = rand.sample(range(N), N)

def sort1():
    return [c[1] for c in sorted(zip(keys, values), key=itemgetter(0))]

def sort2():
    return [values[i] for i in sorted(range(N), key=keys.__getitem__)]

def sort3():
    return sorted(values, key=partial(next, iter(keys)))

def sort4():
    return sorted(values, key=list(keys))
./python.exe -m timeit -s "$S" "sort1()"    # 57 ms
./python.exe -m timeit -s "$S" "sort2()"    # 43 ms
./python.exe -m timeit -s "$S" "sort3()"    # 29 ms
./python.exe -m timeit -s "$S" "sort4()"    # 22 ms
2 Likes

Bingo. If Python’s sort API had been designed from the start to take one or more positional arguments, then it would be natural to sort only the first directly and permute the rest in lockstep. But it’s too late for that now.

In numpy that’s naturally approached via argsort(). It’s more attractive there too because numpy arrays also support “gather” indexing: an array can be indexed by a list (or array) of indices, and numpy will “gather” the array elements from those “scattered” indices. Python lists don’t support that.

It may not be well known, though, that Python’s itemgetter does support that, if you squint a bit. Saves the need for an explicit listcomp (as in my earlier example).

>>> import operator
>>> ixs = [2, 1, 0] * 3
>>> operator.itemgetter(*ixs)("abc")
('c', 'b', 'a', 'c', 'b', 'a', 'c', 'b', 'a')

Would probably help to improve the docs for all this.

I don’t understand how the second would work.

If consensus is that a keylist must not be mutated by default, then I’d rather see optional argument names that don’t mention any variation of the word “mutation”. Too elitist :wink:. I would rather appeal to operational intuitions, by using variations of “copy”. Like copy_keylist=True for a default. That all but screams that the keylist won’t be altered in any way, while “else don’t copy” all but screams that anything may happen to the keylist.

Which brings us to a nasty part: the obvious implementation is prone to segfaulting all over the place. It’s extremely dangerous to allow any list involved get mutated “from outside” while the sort is running.

We fought that for years, as ever sicker tricks were contrived by friendly adversaries to trick the sort into crashing their box.

We currently fight that by playing an even sicker trick :wink:: early in sorting, the list object’s C pointer to data is set to NULL, and its size is set to 0. As far as any outside code is concerned (whether it be via Python-coded comparison functions, or other threads), the original list is empty for the duration. They can mutate it all they like, but it has no effect on the data sorting sees. At the end of sorting, whatever the then-current state of the original list object may be is thrown away, and an exception is raised if it was ever made non-NULL. And, of course, the C pointer to the original C array is restored.
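The effect is observable from Python today (CPython behavior): from inside a key function the list being sorted starts out looking empty, mutations made during the sort are discarded, and the sort raises afterward.

```python
observed = []
L = [3, 1, 2]

def evil(x):
    if not observed:
        observed.append(len(L))  # 0: the list looks empty while sorting runs
    L.append(99)                 # outside mutation; the sort throws it away
    return x

try:
    L.sort(key=evil)
    raised = False
except ValueError:
    raised = True  # "list modified during sort"

# Afterward, L == [1, 2, 3]: sorted, with the appended 99s gone.
```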

Now none of that is done for data created by key=, because the list that creates is visible only to list.sort() and so can’t be mutated by anything else.

If keylist=keys does make a copy of keys, that remains true, and the implementation remains just as simple in this respect.

But if we use a list passed in as-is, then heroic efforts have to be made to guard against that being mutated “from outside” too.

1 Like

Therefore you will copy it anyway, and in a mutable form, so any ‘sortable’ is acceptable, and thus you can name it simply keys=. Am I right?


One possible way out of this… I introduced a special case before; I'll push it a bit more:

# special case: sorted-range, autosized, returns a 'permutation' type
idxs = sorted(range, keys=keys)  # O(N log N)

# using the permutation type for sorting without recomputing the order
values1.sort(keys=idxs)  # O(N)
values2.sort(keys=idxs)  # O(N)

This is mathematically consistent even with the implicitness of the permutation (castable as a list of integers, but skipping the sorting step when detected as keys). Note that computing the sorting order of a permutation returns its inverse permutation.
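As a quick check of the math: the sorting order (argsort) of a permutation is its inverse, so applying the permutation and then its argsort restores the original order:

```python
p = [2, 0, 1]                                   # a permutation of 0..2
inv = sorted(range(len(p)), key=p.__getitem__)  # argsort of p: [1, 2, 0]

# inv is the inverse of p: shuffling by p then by inv is the identity.
data = ['a', 'b', 'c']
shuffled = [data[i] for i in p]        # ['c', 'a', 'b']
restored = [shuffled[i] for i in inv]  # ['a', 'b', 'c']
```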

1 Like

That’s not how it works now. Nothing in the original list is copied. It’s all C-level trickery. The pointer to the list object’s C array is remembered internally, and that pointer in the list object is set to NULL for the duration. So, sure, in that sense 8 bytes are “copied” (size of a pointer on a 64-bit box). Sorting internals then work directly on the original C array, with no fear that anyone else can get at it for the duration.

Something similar would need to be done to make a mutable keylist work: make it impossible for outside code to mutate it for the duration, but mutate its original C array freely & directly inside the sorting code. When sorting ends, then the keylist object’s C data pointer is restored, and the permuted contents become visible again. Any mutations made to the keylist object “from outside” during sorting would be thrown away, and an exception raised if any outside mutations were made.

Note that this trick wouldn’t work for an object to which memoryview() can be applied, but lists don’t participate in the buffer protocol (and almost certainly never will). Because if memoryviews are in play, the underlying C array may be shared across multiple Python objects, and then it’s not possible for the sorting code to prevent outside mutations short of working on its own internal copy of the original C array. As is, going through a list object’s methods is the only Python-level way to get at its underlying C array, which is shared with no other Python-level objects.
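Indeed, lists reject the buffer protocol today, so no memoryview can alias a list's underlying C array:

```python
try:
    memoryview([1, 2, 3])
    buffer_supported = True
except TypeError:
    buffer_supported = False  # lists don't implement the buffer protocol
```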

In a different topic :wink:, that may be a nice way to make numpy’s argsort() more pleasant in Python, but is a segfault factory for the same reasons if the sorting code doesn’t make heroic efforts to ensure that outside mutations of the keys argument can’t crash the code. The easiest way of which - but also the slowest - is to make its own internal copy of the keys argument.

1 Like