libs ()
sort_unstable
Add an unstable sort to libcore.
At the moment, the only sort function we have in libstd is slice::sort
. It is stable,
allocates additional memory, and is unavailable in #![no_std]
environments.
The sort function is stable, which is a good but conservative default. However, stability is rarely a required property in practice, and some other characteristics of sort algorithms like higher performance or lower memory overhead are often more desirable.
Having a performant, non-allocating unstable sort function in libcore would cover those needs. At the moment Rust is not offering this solution as a built-in (only crates), which is unusual for a systems programming language.
Q: What is stability?
A: A sort function is stable if it doesn't reorder equal elements. For example:
let mut orig = vec![(0, 5), (0, 4)];
let mut v = orig.clone();
// Stable sort preserves the original order of equal elements.
v.sort_by_key(|p| p.0);
assert!(orig == v); // OK!
/// Unstable sort may or may not preserve the original order.
v.sort_unstable_by_key(|p| p.0);
assert!(orig == v); // MAY FAIL!
Q: When is stability useful?
A: Not very often. A typical example is sorting columns in interactive GUI tables.
E.g. you want to have rows sorted by column X while breaking ties by column Y, so you
first click on column Y and then click on column X. This is a use case where stability
is important.
Q: Can stable sort be performed using unstable sort?
A: Yes. If we transform [T]
into [(T, usize)]
by pairing every element with it's
index, then perform unstable sort, and finally remove indices, the result will be
equivalent to stable sort.
Q: Why is slice::sort
stable?
A: Because stability is a good default. A programmer might call a sort function
without checking in the documentation whether it is stable or unstable. It is very
intuitive to assume stability, so having slice::sort
perform unstable sorting might
cause unpleasant surprises.
See this story
for an example.
Q: Why does slice::sort
allocate?
A: It is possible to implement a non-allocating stable sort, but it would be
considerably slower.
Q: Why is slice::sort
not compatible with #![no_std]
?
A: Because it allocates additional memory.
Q: How much faster can unstable sort be?
A: Sorting 10M 64-bit integers using pdqsort (an
unstable sort implementation) is 45% faster than using slice::sort
.
Detailed benchmarks are here.
Q: Can unstable sort benefit from allocation?
A: Generally, no. There is no fundamental property in computer science saying so,
but this has always been true in practice. Zero-allocation and instability go
hand in hand.
The API will consist of three functions that mirror the current sort in libstd:
core::slice::sort_unstable
core::slice::sort_unstable_by
core::slice::sort_unstable_by_key
By contrast, C++ has functions std::sort
and std::stable_sort
, where the
defaults are set up the other way around.
pub trait SliceExt {
type Item;
// ...
fn sort_unstable(&mut self)
where Self::Item: Ord;
fn sort_unstable_by<F>(&mut self, compare: F)
where F: FnMut(&Self::Item, &Self::Item) -> Ordering;
fn sort_unstable_by_key<B, F>(&mut self, mut f: F)
where F: FnMut(&Self::Item) -> B,
B: Ord;
}
let mut v = [-5i32, 4, 1, -3, 2];
v.sort_unstable();
assert!(v == [-5, -3, 1, 2, 4]);
v.sort_unstable_by(|a, b| b.cmp(a));
assert!(v == [4, 2, 1, -3, -5]);
v.sort_unstable_by_key(|k| k.abs());
assert!(v == [1, 2, -3, 4, -5]);
Proposed implementaton is available in the pdqsort crate.
Q: Why choose this particular sort algorithm?
A: First, let's analyse what unstable sort algorithms other languages use:
The most popular sort is definitely introsort. Introsort is an implementation
of quicksort that limits recursion depth. As soon as depth exceeds 2 * log(n)
,
it switches to heapsort in order to guarantee O(n log n)
worst-case. This
method combines the best of both worlds: great average performance of
quicksort with great worst-case performance of heapsort.
Java (talking about Arrays.sort
, not Collections.sort
) uses dual-pivot
quicksort. It is an improvement of quicksort that chooses two pivots for finer
grained partitioning, offering better performance in practice.
A recent improvement of introsort is pattern-defeating quicksort, which is substantially faster in common cases. One of the key tricks pdqsort uses is block partitioning described in the BlockQuicksort paper. This algorithm still hasn't been built into in any programming language's standard library, but there are plans to include it into some C++ implementations.
Among all these, pdqsort is the clear winner. Some benchmarks are available here.
Q: Is slice::sort
ever faster than pdqsort?
A: Yes, there are a few cases where it is faster. For example, if the slice
consists of several pre-sorted sequences concatenated one after another, then
slice::sort
will most probably be faster. Another case is when using costly
comparison functions, e.g. when sorting strings. slice::sort
optimizes the
number of comparisons very well, while pdqsort optimizes for fewer writes to
memory at expense of slightly larger number of comparisons. But other than
that, slice::sort
should be generally slower than pdqsort.
Q: What about radix sort?
A: Radix sort is usually blind to patterns in slices. It treats totally random
and partially sorted the same way. It is probably possible to improve it
by combining it with some other techniques, but it's not trivial. Moreover,
radix sort is incompatible with comparison-based sorting, which makes it
an awkward choice for a general-purpose API. On top of all this, it's
not even that much faster than pdqsort anyway.
Stability is a confusing and loaded term. Function slice::sort_unstable
might be
misunderstood as a function that has unstable API. That said, there is no
less confusing alternative to "unstable sorting". Documentation should
clearly state what "stable" and "unstable" mean.
slice::sort_unstable
will be mentioned in the documentation for slice::sort
as a faster non-allocating alternative. The documentation for
slice::sort_unstable
must also clearly state that it guarantees no allocation.
The amount of code for sort algorithms will grow, and there will be more code to review.
It might be surprising to discover cases where slice::sort
is faster than
slice::sort_unstable
. However, these peculiarities can be explained in
documentation.
Unstable sorting is indistinguishable from stable sorting when sorting
primitive integers. It's possible to specialize slice::sort
to fall back
to slice::sort_unstable
. This would improve performance for primitive integers in
most cases, but patching cases type by type with different algorithms makes
performance more inconsistent and less predictable.
Unstable sort guarantees no allocation. Instead of naming it slice::sort_unstable
,
it could also be named slice::sort_noalloc
or slice::sort_unstable_noalloc
.
This may slightly improve clarity, but feels much more awkward.
Unstable sort can also be provided as a standalone crate instead of within the standard library. However, every other systems programming language has a fast unstable sort in standard library, so why shouldn't Rust, too?
None.