Not really. It was a long time ago, and memory has faded.
Perhaps that when you ask “an expert” for advice, they may be so obsessed with novel approaches that they miss obvious things
.
I don’t believe they did. Just brain freeze on my part.
Merge sort can be made in-place simply enough but at the cost of another lg n factor. Given adjacent runs A and B, search for the largest non-negative int K such that the first K elements of B belong before the last K elements of A. Binary search can be used for this, although getting it exactly right is delicate, and exponential search would probably be better for general use (K is usually small for runs built from randomly ordered data).
Then you have two adjacent, equal-sized, blocks to swap. How to do that in place should be obvious.
Now you’re left with two merges to do, each of which follows the same process.
Easiest to follow with an example:
A: 2 4 6 B: 1 3 5
1 < 6, so K >= 1. Then 3 < 4, so K >= 2. But it’s not the case that 5 < 2, so K=2 is the result. Swap the last 2 elements of A with first 2 of B, leaving two smaller pairs of runs to merge:
A:2 B:1 3
A:4 6 B:5
K happens to be 1 for both pairs. For the first, 1 is swapped with 2, leaving the 2 new pairs to do:
A:empty B:1
A:2 B:3
and K=0 for both, so both finish at once. Similarly for the other original pair.
OK - the coding isn’t really “simple” after all
. But it’s the simplest in-place sub-quadratic-worst-case stable comparison-based sort I know of.
In a merge sort that already works, it just requires replacing the “merge two adjacent runs” part. Easiest to write it recursively, but with the usual quicksort trick to limit recursion depth:
# merge adjacent pairs A and B
while A and B: # if either is empty, there's nothing to do:
# as sketched, create new A1, B1 and A2, B2 pairs in place
# recurse on the pair with the smallest total length
# set A and B to the larger pair and loop
Of course if one of the pairs is a singleton, it would pay to search for its final position direclty instead. As is, e.g.,
A:2 3 4 5 6 7 8 9 B:1
acts like a slow insertion sort, moving 1 “one to the left” each time. Still linear-time, but slothful. Etc.