Quicksort
Quicksort is one of the fastest known sorting algorithms for evenly distributed data and is not difficult to
implement. For very small amounts of data, however, quicksort can actually be slower than other, more naïve
algorithms, such as bubble sort.
The algorithm uses a recursive "divide and conquer" technique to sort the data. The elements to be sorted are
split into two partitions. These two partitions are then sorted individually using quicksort, and when this is
finished, the two partitions will make up the complete, sorted array.
We decide how to split the array into the two partitions by choosing a value called the "pivot". Then, elements
in the array which are smaller than the pivot are moved to the left-hand side of the array, and elements larger
than or equal to the pivot are moved to the right-hand side of the array. These two sets of elements (those < the
pivot, and those >= the pivot) are the two partitions.
Once the array is partitioned, we take both of these partitions and then apply the quicksort algorithm to both
of them, separately. As we run quicksort on these two partitions, they'll both in turn be split into partitions
of their own. We keep recursively splitting the array into partitions in this manner until we reach a lower limit --
where the partitions have only one element in them. Once we get to this stage, the partition is actually sorted,
so we can leave it be.
Selecting a pivot
Selecting a good pivot greatly improves the speed of the quicksort algorithm. Many people just use the first element in
the list as the pivot, however this causes the sort to perform very badly if the data is already sorted. A much better
method is the median-of-three. This takes the first, last and middle elements, and uses the median of these as the pivot.
Example 1: Selecting a pivot using median-of-three:
The first, middle and last elements are 2, 8 and 7 respectively. The middle of these is 7, so this is what we will use for
the pivot. Had we just taken the first element as the pivot, we would have had 1 item in the left hand group, and 7 items
in the right hand one! This is not very good because the sort works best when both groups are roughly equal in size.
The unsorted data (the 3 elements used to find the pivot are coloured blue)|
|
Creating the groups
Example 2: Dividing into groups
This simply shows how the 2 groups are created; one for data greater or equal to the pivot and the other
for data less than the pivot. The initial groups (containing 1-5 and 7-9 respectively) are broken into 2 more
groups each (containing 1-2 and 4-5 from the first group, and 7 and 9 from the second).
As you can see, just by continually breaking down the data into groups, we are sorting the data - the only
elements still out of order after the second iteration are 1 and 2. Sorting finishes when everything has been
grouped enough to have completely ordered the data. Please note that the pivots chosen in this example were
chosen to best illustrate the creation of groups -- they were not selected using the median-of-three method
mentioned above.
Grouping the data (pivot is highlighted)
|
|
|
|
Grouping the data again (pivots are highlighted)
|
|
|
|
| |
|
| |
|
|
|
Being able to quickly create groups of roughly equal size is the essence of an effecient quicksort routine (determining
which element to make the pivot is discussed later).
The most common method is to have 2 pointers - one that initially points to the start of the
array (call this pointer L for left) and another that initially points to the end (call this
pointer R for right).
-
Choose a pivot element (see Selecting a pivot)
-
Swap the pivot element with the last element
-
First, the program looks at pointer L and moves it right, checking each element in turn. As soon as
it finds an element whose value is greater or equal to the pivot, it stops. It also stops if it passes pointer R.
-
Now the program moves pointer R left until it finds an element whose value is less than or equal to the pivot, or it
has passed pointer L.
-
If pointers L and R have not met, swap the 2 elements they point to, and continue from step 3.
Otherwise, swap the last element (which is the pivot because of step 1) with the element that pointer L is at.
-
For both groups, check if the size is greater than 1. If so, perform a quicksort on the group. If neither group has more
than one element, you've finished sorting this area.
With thanks to Cameron McCormack for rewording the introduction.