This page presents an O(n²) sorting algorithm which is faster than insertion sort in some cases (specifically: when there is a lot of repetition in the data to be sorted).
(Strictly speaking this is not an O(n²) algorithm but an O(n·m) algorithm, where n is the number of items in the array to be sorted and m is the number of distinct items in the array. However, since the O-notation denotes the worst-case scenario (which happens when m=n), calling it O(n²) is correct for all practical purposes.)
Why develop an O(n²) sorting algorithm when O(n log n) sorting algorithms are so much faster, especially with large amounts of data?
The answer is precisely in the "with large amounts of data" part. O(n log n) sorting algorithms usually perform very poorly when the amount of data to be sorted is very small (e.g. 10-20 items). For example, with 10 items to sort, insertion sort can be several times faster than, for example, quicksort.
One could then ask why that matters. If there are so few items to sort, it takes almost no time at all, no matter what the sorting algorithm. However, it does matter when the number of small groups to sort is very large. For example, if you have a million groups of 10 items and each group needs to be sorted, it's much faster to use insertion sort on each group than e.g. quicksort.
This is exactly why the fastest existing sorting methods do not rely exclusively on an O(n log n) algorithm: they usually sort the data only up to a point with that algorithm (e.g. when partitioning in quicksort, they don't sort any partition smaller than 15 items) and then run insertion sort on the data, as sketched below. This trick of running quicksort only down to a certain partition size and then running insertion sort can speed up a pure quicksort by as much as 20-40%.
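As an illustration, here is a minimal sketch of such a hybrid sort (the function names, the cutoff value and the partitioning scheme are illustrative choices for this sketch, not necessarily the ones used in the measurement program below):

```cpp
#include <vector>
#include <utility> // std::swap

const int kCutoff = 15; // partitions smaller than this are left unsorted

// Simple Lomuto partitioning around the last item of the range.
int partition(std::vector<unsigned>& a, int lo, int hi)
{
    unsigned pivot = a[hi];
    int i = lo;
    for(int j = lo; j < hi; ++j)
        if(a[j] < pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    return i;
}

// Quicksort which stops recursing when a partition becomes small.
void roughQuicksort(std::vector<unsigned>& a, int lo, int hi)
{
    if(hi - lo + 1 < kCutoff) return; // small partitions are left as they are
    int p = partition(a, lo, hi);
    roughQuicksort(a, lo, p - 1);
    roughQuicksort(a, p + 1, hi);
}

// One insertion sort pass over the whole, now almost sorted, array.
void insertionSort(std::vector<unsigned>& a)
{
    for(size_t i = 1; i < a.size(); ++i)
    {
        unsigned value = a[i];
        size_t j = i;
        while(j > 0 && a[j-1] > value) { a[j] = a[j-1]; --j; }
        a[j] = value;
    }
}

void hybridSort(std::vector<unsigned>& a)
{
    if(a.empty()) return;
    roughQuicksort(a, 0, int(a.size()) - 1);
    insertionSort(a);
}
```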
1.2 The meaning of the name
Since there's seldom anything truly new under the Sun, it's quite probable that a similar, if not the exact same, algorithm has already been developed (and named) by someone somewhere. However, since I have never heard of this sorting algorithm or anything similar, I have named it "double-burst-selection sort", at least for the time being.
The reason for using "selection sort" in the name is that this algorithm is somewhat similar in idea.
The idea in selection sort is to find the smallest item in the array and swap the first array item with it. This is then repeated for the remainder of the array (i.e. everything but the first item) until all items have been placed in this way, as the reference implementation below shows. If there's more than one smallest item, only one of them will be placed at the beginning per pass.
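For reference, a straightforward textbook-style implementation of plain selection sort (not taken from the measurement program) looks like this:

```cpp
#include <vector>
#include <utility> // std::swap

// Plain selection sort: each pass places exactly one smallest item
// at the front, even if several equal smallest items exist.
void selectionSort(std::vector<unsigned>& a)
{
    for(size_t i = 0; i + 1 < a.size(); ++i)
    {
        size_t minPos = i;
        for(size_t j = i + 1; j < a.size(); ++j)
            if(a[j] < a[minPos]) minPos = j;
        std::swap(a[i], a[minPos]); // only one item placed per pass
    }
}
```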
It occurred to me that this is actually quite an inefficient way of doing it. Why should just one of the smallest items be placed at the beginning of the array when it's possible to do so for all of them in one pass?
This idea led me to develop what I call the "burst-selection sort": after each pass the smallest item, or more importantly, if there's more than one smallest item, all of them, will end up at the beginning of the array. (For example, if the smallest value in the array is '1' and there are 10 of them in the array, after one pass all of them will be located at the beginning of the array.) When this is repeated for the rest of the array until all the smallest items have been moved to the beginning, the array is sorted.
After that it occurred to me: why should I restrict the algorithm to placing all the smallest items at the beginning? Why not, in the same pass, also place all the largest items at the end of the array? I discovered that it is, in fact, possible to do so and that it speeds up the sorting by some amount. Hence I called this enhanced variant of the algorithm "double-burst-selection sort".
2 Efficiency comparison
Since insertion sort is one of the fastest (and also simplest) O(n²) sorting algorithms, and usually the algorithm of choice for speeding up other sorting algorithms like quicksort, it's logical to compare double-burst-selection sort with it.
The greatest advantage of this sorting algorithm is that it takes advantage of repetition in the data to be sorted. The more repetition there is in the data, the faster the algorithm is. As the table below shows, depending on the amount of repetition, double-burst-selection sort can be several orders of magnitude faster than insertion sort.
However, this algorithm starts losing to insertion sort when there's little or no repetition in the data. Another disadvantage of this algorithm is that it's much harder to understand and implement than insertion sort (a typical implementation of insertion sort takes about 7 lines of code while double-burst-selection sort takes more than 25).
The following table shows a speed comparison between the two algorithms. Both have been implemented as efficiently as possible (e.g. the insertion sort does not swap items as the basic algorithm description says, but takes out the item to be moved, moves bigger items one place up until the correct place for the current item is found, and then places the item there; this saves a lot of unnecessary reads and writes). The program used to make this speed comparison can be found later on this page.
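A sketch of that shifting optimization (illustrative; the implementation used for the measurements may differ in details):

```cpp
#include <vector>

// Insertion sort which shifts items instead of swapping them: the item
// to be moved is taken out, bigger items are moved one place up, and
// the item is then written once into its correct place.
void insertionSort(std::vector<unsigned>& a)
{
    for(size_t i = 1; i < a.size(); ++i)
    {
        unsigned value = a[i];        // take out the item to be moved
        size_t j = i;
        while(j > 0 && a[j-1] > value)
        {
            a[j] = a[j-1];            // move the bigger item one place up
            --j;
        }
        a[j] = value;                 // place the item in its correct place
    }
}
```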
The test program creates an array of random unsigned integer values with differing array sizes and maximum item values (the smaller the maximum item value, the more repetition there is) and measures how many times each sorting algorithm can sort the array within 10 seconds. Naturally, for fairness, the same array was given to both algorithms to sort. The number in each algorithm column is thus the number of times the algorithm was able to sort the array (that is, the bigger the number, the better).
Max value | Array size | Insertion | D-B-Selection |
---|---|---|---|
10 | 10 | 4 859 733 | 4 498 987 |
10 | 100 | 296 240 | 1 114 643 |
10 | 1 000 | 3 802 | 140 492 |
10 | 10 000 | 36 | 13 200 |
100 | 10 | 4 867 741 | 3 994 877 |
100 | 100 | 283 263 | 384 414 |
100 | 1 000 | 3 675 | 31 487 |
100 | 10 000 | 32 | 2 915 |
1 000 | 10 | 4 754 145 | 4 054 443 |
1 000 | 100 | 308 434 | 191 426 |
1 000 | 1 000 | 3 551 | 5 070 |
1 000 | 10 000 | 33 | 324 |
10 000 | 10 | 4 777 478 | 4 031 111 |
10 000 | 100 | 292 915 | 183 846 |
10 000 | 1 000 | 3 633 | 2 574 |
10 000 | 10 000 | 32 | 46 |
100 000 | 10 | 4 824 157 | 4 222 801 |
100 000 | 100 | 318 131 | 155 052 |
100 000 | 1 000 | 3 713 | 2 138 |
100 000 | 10 000 | 32 | 30 |
As we can see from the table, the larger the array, the more likely repetitions are to occur, which benefits double-burst-selection sort. For example, with an array of 10 000 items and item values ranging between 0 and 10, double-burst-selection sort is more than 350 times faster than insertion sort. On the other hand, with smaller arrays and fewer repetitions it can be considerably slower.
As already indicated in the introduction, insertion sort is useful for speeding up other sorting algorithms such as quicksort.
In order to compare double-burst-selection sort with insertion sort in a more practical setting, I made an efficient implementation of quicksort [1] and compared it with the same implementation enhanced with insertion sort, and with another enhanced with double-burst-selection sort.
The version of quicksort enhanced with insertion sort skips partitioning partitions with fewer than 15 items and, once done, runs insertion sort on the whole array. The version enhanced with double-burst-selection sort skips partitioning partitions with fewer than 30 items and runs double-burst-selection sort on each of these partitions. (The partition size limits were fine-tuned by hand to get the fastest average sorting speed in both cases.)
All three versions of the algorithm were used to sort an array of 100 000 items containing random unsigned integer values with differing maximum values. The value in each algorithm column tells how many times the algorithm was able to sort the array in 10 seconds (thus a bigger number is better):
Max value | Quicksort | QS + Insertion | QS + D-B-Selection |
---|---|---|---|
10 | 298 | 452 | 485 |
100 | 283 | 418 | 446 |
1 000 | 275 | 399 | 425 |
10 000 | 256 | 348 | 371 |
100 000 | 256 | 340 | 308 |
1 000 000 | 251 | 331 | 266 |
As we can see from the table, enhancing quicksort with a fast O(n²) algorithm (fast for very small arrays, that is) has a very considerable impact on performance. (As said earlier, this is because quicksort is not very efficient with small partitions.)
We can also see that double-burst-selection sort offers a slight improvement over insertion sort when there's a considerable amount of repetition. When the amount of repetition is minimal, insertion sort becomes considerably faster.
The (C++) source code of the program used to make these measurements can be found here.
[1] The quicksort implementation used in the program chooses the pivot item by calculating the median of the first, last and middle items of the partition, roughly as sketched below. This method of choosing the pivot gives a considerable average speed improvement compared to regular quicksort (which just takes the first value of the partition as the pivot) and even to randomized quicksort (which chooses a random item as the pivot).
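A sketch of that pivot selection (the function name is illustrative, and the actual program may arrange this differently, e.g. by integrating it into the partitioning code):

```cpp
#include <vector>
#include <utility> // std::swap

// Median-of-three pivot selection: returns the median of the first,
// middle and last items of the partition [lo, hi].
unsigned medianOfThreePivot(const std::vector<unsigned>& a, int lo, int hi)
{
    unsigned first = a[lo], middle = a[lo + (hi-lo)/2], last = a[hi];
    if(first > middle) std::swap(first, middle);
    if(middle > last) std::swap(middle, last);
    if(first > middle) std::swap(first, middle);
    return middle; // the median of the three values
}
```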
First I'll explain the burst-selection sort algorithm, which is the simpler version of the double-burst-selection sort.
As explained in the introduction, the idea of this sorting algorithm is to make a pass over all the items in the array and swap items in such a way that after the pass is done, the smallest item has been placed at the beginning of the array. If there's more than one smallest item, all of them will be placed there. The same algorithm is then repeated for the rest of the array (i.e. all the items in the array except the smallest ones at the beginning) until the entire array is sorted.
In pseudocode, the algorithm is the following:
1. Create a minIndex index variable which points to the beginning of the array (where the minimum values will be placed).
2. Create a min variable of the array item type, initialized with the value at minIndex. This variable will contain the minimum item value in the array (it may not contain it at first, but will eventually do so).
3. Create a newMinIndex index variable, initialized with minIndex+1. This variable contains the location where the found minimum items will be placed during the pass.
4. For each item from minIndex+1 to the end of the array, if the item is less than or equal to min, do:
  a. If the item is less than min, assign the item to min and assign minIndex to newMinIndex.
  b. Swap the item with the item at newMinIndex.
  c. Increase newMinIndex by one.
5. Assign newMinIndex to minIndex.
6. If minIndex is still inside the array, jump to step 2.
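A direct C++ translation of the pseudocode could look like this (a sketch written for this description; the implementation actually used for the measurements is in the test program):

```cpp
#include <vector>
#include <utility> // std::swap

void burstSelectionSort(std::vector<unsigned>& a)
{
    size_t minIndex = 0;                   // step 1
    while(minIndex < a.size())             // step 6
    {
        unsigned min = a[minIndex];        // step 2
        size_t newMinIndex = minIndex + 1; // step 3
        for(size_t i = minIndex + 1; i < a.size(); ++i) // step 4
        {
            if(a[i] <= min)
            {
                if(a[i] < min)             // step 4a: new minimum found
                {
                    min = a[i];
                    newMinIndex = minIndex;
                }
                std::swap(a[i], a[newMinIndex]); // step 4b
                ++newMinIndex;                   // step 4c
            }
        }
        minIndex = newMinIndex;            // step 5
    }
}
```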
If an item is not a smallest item, we can check whether it's a largest item, and if it is, move it to the end of the array. This way we achieve two goals at the same time and perform the job faster. This kind of addition to the burst-selection sort makes it somewhat faster (but not twice as fast, as one could hastily think, because the amount of work done does not actually drop to half; it does drop somewhat nevertheless).
This addition makes the algorithm a bit more complex, because swapping the current item with the item at the current end of the array may actually bring a smallest item to the current position, which means we can't move to the next item just yet (if we did, the algorithm would malfunction in some cases). Likewise, if a new minimum item is found (and is swapped to the start of the current array section), the value swapped into the current position may be a largest item, which also means that we can't move to the next item yet. Otherwise the addition can be done basically by replicating the burst-selection sort code so that it does the same job in reverse.
The following additions need to be made to the burst-selection sort algorithm in order to turn it into the double-burst-selection sort (a complete C++ sketch follows the list):
1. In addition to minIndex, create also a maxIndex index variable initialized to point to the last item in the array.
2. In addition to the min variable, initialize also a max variable with the item at maxIndex.
3. In addition to newMinIndex, create also a newMaxIndex index variable initialized to maxIndex (note: not to maxIndex-1).
4. Make the pass go from minIndex+1 to newMaxIndex (inclusive).
5. If the item was not less than or equal to min, check whether it is greater than or equal to max, and if it is, perform the same code as in the minimum case but using the max versions of the variables (note that newMaxIndex should naturally be decreased by one, not increased).
6. After one of the problematic swaps described above, the current item may again be less than or equal to min or greater than or equal to max, so the loop should stay at the same item in its next iteration (that is, it should not increase the loop index variable).
7. At the end of the pass, in addition to assigning newMinIndex to minIndex, also assign newMaxIndex to maxIndex.
8. Repeat the pass (from step 2) as long as minIndex is less than maxIndex.
The source code of the test program has an implementation of this algorithm in C++ at the beginning.
© Copyright 2004 Juha Nieminen