(Back to index)

Bubble sort misconceptions

Bubble sort is like a pestilence that just refuses to die. It's just incredible how resilient it is. It will always live as some kind of "good basic O(n2) sorting algorithm". If you google for web pages discussing sorting algorithms or participate in a forum discussion about them, bubble sort will always pop its ugly head, but not in a negative way, but as some kind of good-enough basic sorting algorithm.

All kinds of odd misconceptions about bubble sort are abundant. Even people who know about sorting algorithms, their technical details and things like asymptotical complexities, have all kinds of odd misconceptions about bubble sort.

Here are a few, debunked:

Bubble sort is the best first sorting algorithm to teach

No, it isn't. I have written an entire page about why bubble sort is a horrible sorting algorithm to teach as some kind of first basic sorting.

Bubble sort is the easiest sorting algorithm to understand

No, it isn't. Compared to both insertion sort (the best O(n2) sorting algorithm) and selection sort, bubble sort is quite complicated for a beginner programmer to understand. (More precisely, it's surprisingly difficult to understand why it works.)

Both insertion sort and selection sort are much closer to how people intuitively would sort a collection of things, which makes them much easier to understand. Bubble sort is very exotic and has nothing to do with how people would intuitively sort things.

Bubble sort is the easiest to implement

No, it isn't. Both insertion sort and selection sort are at least as easy to implement, and require approximately the same amount of code.

Bubble sort is efficient with small amounts of data

No, it isn't. Not compared with other O(n2) sorting algorithms, one of the best of them being insertion sort.

There exist no cases where bubble sort would be faster than insertion sort. Insertion sort always beats bubble sort (this is actually rather easy to prove mathematically). Since insertion sort is easier to understand and equally easy to implement, there just is no disadvantage in using it rather than bubble sort.

Bubble sort works well with data which is already almost sorted

No, it doesn't. This is actually one of the oddest misconceptions about bubble sort out there, and it's surprisingly common.

Bubble sort does not benefit at all from how the data is organized, when counting as the number of element comparisons it makes. Bubble sort always performs the exact same amount of comparisons for a certain amount of data regardless of how the data is organized.

Bubble sort always performs n-1 passes through the data (where n is the amount of elements), and it always performs (n-1)+(n-2)+...+1 comparisons regardless of how the data is organized to begin with.

In contrast, insertion sort does benefit if the data is already almost sorted. The best case scenario is when the data is already fully sorted, in which case insertion sort goes through the array once and performs n-1 comparisons, and that's it.

(Yes, you can enhance bubble sort a bit and make it stop if it doesn't perform any swaps during a pass, but this will help only in a very small amount of cases. Insertion sort benefits a lot more easily from almost-sorted input without having to fine-tune or enhance it.)

The best case of bubble sort is O(n)

Very similar to the above misconception, and also false. As already said, bubble sort always performs n-1 passes through the data, period. Its best case and worst case behaviors are completely identical.

The fact is: Even if you wanted, you can't create input data for bubble sort which would be ideal for it. (Unless you add the "did we swap anything in this pass?" enhancement, in which case a completely sorted array would be optimal.)

The worst case of this misconception I have seen is at this page. Not only does it list the wrong best case scenario for bubble sort, but it also writes:

I had to sort a large array of numbers. The values were almost always already in order, and even when they weren't in order there was typically only one number that was out of order. Only rarely were the values completely disorganized. I used a bubble sort because it was O(1) for my "average" data.

I would really like to see his "average O(1)" bubble sort implementation.

(If you don't understand what's horribly wrong with that assertion, it's because "O(1)" would mean that the algorithm doesn't need to go through the data even once. In other words, he is claiming that bubble sort is somehow magically able to sort his almost-sorted data without having to go through the entire array. This is simply a physical impossibility for any sorting algorithm. It's impossible to know where the out-of-place elements should go without going through the data. The best case scenario you could conceivably achieve is if you know exactly which elements are out-of-place, and then if you would somehow be able to search their proper place in O(log n) time and insert them there in constant or O(log n) time, in which case your sorting becomes O(log n). However, this would still be slower than O(1). Also this hypothetical situation would have absolutely nothing to do with bubble sort.)

Bubble sort is good to speed up quicksort

No, it isn't. Bubble sort is perhaps the worst possible O(n2) sorting algorithm you could choose to speed up quicksort.

Insertion sort is the best O(n2) algorithm to speed up quicksort. (For example, insertion sort benefits enormously speeding up quicksort when there's a lot of repetition in the data to be sorted. Bubble sort, as mentioned, does not benefit at all from this.)

Ok, the basic bubble sort is bad, but you can create enhanced variants which are good

You can try to enhance bubble sort to death, but you will never beat insertion sort as an O(n^2) algorithm (not in the average case nor in the vast majority of cases either).

The more you try to enhance bubble sort, the longer and more complicated the sorting algorithm becomes. This increases the constant factor of the algorithm, making it slower. Basically the longer your inner loop is, the slower the entire sorting process will be.

The beauty of insertion sort is that it's very short, with a very small inner loop, and thus very fast for small amounts of data (or data which is properly organized, like eg. partitioned with partitions of certain size). You just can't beat that by adding features to bubble sort.

I just need to sort 10 elements once per minute. It doesn't matter which sorting algorithm I use. I'll just use bubble sort.

By doing this (especially if you do it eg. in an open source program) you will be prolonging the life of this horrible algorithm and possibly teaching other people the bad habit.

This is especially bad when considering that there's absolutely no logical reason to do that. Insertion sort is beautiful to sort very short arrays, at least as easy to implement, much easier to understand, and much more efficient (even if that doesn't really matter in this case). There just is no reason to not to prefer insertion sort. There are tons of reasons to prefer it.

Other misconceptions

Some people defending bubble sort don't even know how bubble sort is implemented. For example, I have seen the following algorithm claimed to be bubble sort:

    for(int i = 1; i < n; ++i)
        for(int j = 0; j < i; ++j)
            if(array[i] < array[j])
                swap(array[i], array[j]);

This is not bubble sort.

Curiously, many people (even expert programmers) have hard time figuring out exactly which sorting algorithm this is.

In fact, it's insertion sort. A rather poorly implemented version of it. (It's an insertion sort which always performs O(n2) steps regardless of what the array contains, and does not benefit at all eg. from the array being almost sorted or having tons of repetition.)

(If you can't see why it's insertion sort, it's because it performs the same algorithm as regular insertion sort, but using the ith element as a placeholder for the swapping when moving all elements one step towards the end of the array).

Rather ironically, this is a version of insertion sort which is as inefficient as bubble sort. However, it's not bubble sort. Actual bubble sort looks like this:

    for(int i = n-1; i > 0; --i)
        for(int j = 0; j < i; ++j)
            if(array[j+1] < array[j])
                swap(array[j+1], array[j]);

The difference is subtle, but when you think how elements are moved around, rather drastic. It performs the same amount of steps as the previous algorithm, though (which might be why some people confuse the two).

Also read my other page about why teaching bubble sort as the first algorithm is a bad idea.

(Back to index)