« back

All Sorts of Things - Quick Sort

25 Oct 2012

You can never get enough..... Quicksort via folk dance.

Quick sort is a divide and conquer algorithm that on average, makes O(n log n) comparisons. Quick sort works in essence by taking a list and dividing in to two smaller lists over and over again. The two lists usually consist of one with smaller elements and one with larger elements. Then Quick sort uses the power of recursion to sort the other lists in the same way. There are really two different main versions of implementing this algorithm. One where you physically break the list up into other lists and then add them back together again, and one where you do the sorting "in place" without ever actually breaking apart the list. In this walk through, we will be physically breaking up the list. If you are interested in the in place version, look out for my next post which will feature the in place quick sort. Let's begin.

The fist thing we will do is create a method called quick_sort. It will take an array. Since we will be dealing with recursion, we will need a base case to break us out of our loop when necessary. If the array is less than 2, there will be nothing to sort so we can return the array safely.

1         def quick_sort(array)
2             return array if array.length < 2
3         end
4     

The next thing we want to do is start dividing the lists up into lists of smaller and larger elements. "But compared to what?" In order to divide the list up into a list of smaller elements and a list of larger elements, we need an element to compare the other elements against. This element is usually called the pivot. Now, there are many different ways that you can select a pivot and much debate over which way is "correct" (usually meaning the way that makes your method faster). I decided to play around with 2 different ways of selecting pivots. The first way I used was just choosing a random value. This way represents the camp of the people who claim it makes no difference in speed which way you go about it. The second way I used was from the camp of people who said that there is a very specific implementation of "the best way". I wanted to see if there was a difference. Keep an eye out for my upcoming post on profiling to get a look at how these different implementations compare against each other in the speed department. Anyways, here are the two different ways I used to find the pivot...

1         def quick_sort(array)
2             return array if array.length < 2
3             pivot = array[ rand(array.length) ] // returns a random pivot
4         end
5     

or you could set pivot equal to my find_pivot method and pass it the array.

1         def quick_sort(array)
2             return array if array.length < 2
3             pivot = find_pivot(array) // uses find_pivot
4         end
5     

The find_pivot method works by selecting the first element of the array, the last element of the array, and then also selecting the midpoint. It then sorts them from smallest to greatest and chooses the median. The case for this has to do with the fact that if you pass in a list that is already sorted in an ascending or descending order, a random pivot will not always be the most efficient and the this way will find a "better", "faster" pivot.

 1         def find_pivot(array)
 2             pivot_possiblities = []
 3             pivot_possiblities << array.first
 4             midpoint = array[array.length/2]
 5             pivot_possiblities << midpoint
 6             pivot_possiblities << array.last
 7             2.times do |i|
 8                if pivot_possiblities[i] > pivot_possiblities[i+1]
 9                  pivot_possiblities[i], pivot_possiblities[i+1] = pivot_possiblities[i+1], pivot_possiblities[i]
10                end
11              end
12             pivot_possiblities[1]
13         end
14 
15     

Moving on. Once we have decided on a pivot, we want to grab all of the other elements that are the same as the pivot (if there are any) and put them into an array together.Then let's delete them from the original array since we will have already separated them out.

1         def quick_sort(array) #choosing a random pivot
2           return array if array.length < 2
3           pivot = array[ rand(array.length) ]
4           all_pivots = array.select { |i| i == pivot }
5           array.delete(pivot)
6         end
7     

Now generally, in quicksort, convention is that you work first with the side that is less than the pivot and then move to the side that is greater. Let's pull out all the values less than the pivot and then all the values that are greater.

1         def quick_sort(array) #choosing a random pivot
2           return array if array.length < 2
3           pivot = array[ rand(array.length) ]
4           all_pivots = array.select { |i| i == pivot }
5           array.delete(pivot)
6           less_than_pivot = array.select { |i| i < pivot }
7           greater_than_pivot = array - less_than_pivot
8         end
9     

We don't have much more to do here. We need now to repeat this process to the divided list so now we can use recursion. Let's do the same thing to the less_than_pivot and then to the greater_than_pivot. We can add them back together as well but don't forget the pivots!

 1         def quick_sort(array) #choosing a random pivot
 2           return array if array.length < 2
 3           pivot = array[ rand(array.length) ]
 4           all_pivots = array.select { |i| i == pivot }
 5           array.delete(pivot)
 6           less_than_pivot = array.select { |i| i < pivot }
 7           greater_than_pivot = array - less_than_pivot
 8           quick_sort(less_than_pivot) + all_pivots + quick_sort(greater_than_pivot)
 9         end
10     

WHEE!

comments powered by Disqus