Sometimes, I think Ruby makes it too easy for us. Just doing `[1,18,4,72].sort`

gives us a nice sorted list. But what does Ruby actually do under the hood?

The point of understanding sorting algorithms has very little to do with the actual act of sorting. Rather, the different algorithms are great examples of various techniques that can be applied to large set of problems. In this article, we’ll take a look at some of these algorithms and the underlying ideas, as well as implementations in Ruby.

## The Problem

Before we start out, let’s get on the same page about the problem we’re trying to solve. The input to our algorithm will be an array of arbitrary length consisting of integers (not necessarily positive). Our algorithm should return a version of this array sorted in ascending order. If we want descending order, we can either reverse the resulting array or change the algorithms presented slightly (e.g. the comparison operator used).

## Bubble Sort

Let’s first look at a really simple (and also pretty slow) sorting algorithm known as “Bubble sort”. The idea is pretty simple: Walk through the list and put two adjacent elements in descending order. But, here’s the kicker: We have to repeatedly walk through the list until there are no longer any swaps to make, meaning, the list is sorted.

The Ruby implementation can be written easily from the hand-wavy description:

```
def bubble_sort(array)
n = array.length
loop do
swapped = false
(n-1).times do |i|
if array[i] > array[i+1]
array[i], array[i+1] = array[i+1], array[i]
swapped = true
end
end
break if not swapped
end
array
end
```

We maintain a variable called `swapped`

, which is true if any swaps were made in the pass through the array. If, at the end of a walk through the array, `swapped`

is `false`

(i.e. no swaps have been performed), we are done and return a sorted list. Essentially, `swapped = true`

is the termination condition for the forever loop.

I mentioned at the outset that BubbleSort is a pretty slow algorithm, but what exactly do I mean by that? BubbleSort is bad at scaling, i.e. making our input size larger by a little bit results in a pretty large increase in the running time of the algorithm. What exactly is the relationship between input size and relationship? Well, let’s think about it.

The worst possible situation for BubbleSort occurs when the input array is in descending order, because this means we have to perform the maximum number of swaps. In fact, in the worst case, if we have `n`

elements in the input array, we have to perform \[ c * n^2 \] swaps, for some positive real constant `c`

(i.e. `c`

doesn’t change when `n`

does). Assuming that each swap takes a constant amount of time (i.e. the time taken to perform a swap is unaffected by the input size), the running time of BubbleSort increases quadratically with respect to the input size. So, we can say we can say BubbleSort is \[O(n^2)\] which is equivalent to saying the running time scales quadratically.

We can also think about the amount of memory in order to perform the sort. Since the sort is “in place”, we don’t have to allocate any additional memory. Thus, we say that BubbleSort’s space complexity is \[O(1)\] i.e. constant at most.

## Merge Sort

It turns out we can do quite a bit better when it comes to sorting. A technique called “divide and conquer” takes the original problem and produces solutions for smaller pieces of it, which are together in order to create the final solution. In our case, this is the problem statement:

Sort the array A of length n

What if we did this instead:

Sort the left of half of A and the right of A, then combine them.

We can actually split each of the halves into further halves (quarters of the original part) and continue recursively. Turns out, we’ll eventually reach an array of size 1, which is automatically sorted. Then, take all of the pieces and finally combine them to give the equivalent of `A.sort`

. However, the combining process is a bit involved.

Say we’re given two sorted arrays, like [1, 3, 5] and [2, 4, 6, 7] and want to combine them into one sorted array. Here’s how to do that, in this case. Take a look at the first elements of both arrays and put the smaller one into the result final result. After one run of this, we have this in the final result:

`[1]`

Repeat the process except move along the array, making the next element the “first element”. In this case, make 3 the “first element” of [1, 3, 5] and compare it to 2 from [2, 4, 6, 7]. We keep doing this until we’ve exhausted one of the lists. At this point, the final result looks like this:

`[1, 2, 3, 4, 5]`

Then, just push on the rest of the second list to get a combined, nicely sorted list:

`[1,2,3,4,5,6,7]`

We can use the procedure described with the example as our “merge” or “combine” step. So, here’s an outline of mergesort for an array `A`

with length `n`

:

```
Mergesort(A):
1. return A if n == 1
2. left = left half of A
3. right = right half of A
4. sorted_left = Mergesort(left)
5. sorted_right = Mergesort(right)
6. return merge(sorted_left, sorted_right)
```

Let’s take a look at the Ruby rendition of this:

```
def mergesort(array)
def merge(left_sorted, right_sorted)
res = []
l = 0
r = 0
loop do
break if r >= right_sorted.length and l >= left_sorted.length
if r >= right_sorted.length or (l < left_sorted.length and left_sorted[l] < right_sorted[r])
res << left_sorted[l]
l += 1
else
res << right_sorted[r]
r += 1
end
end
return res
end
def mergesort_iter(array_sliced)
return array_sliced if array_sliced.length <= 1
mid = array_sliced.length/2 - 1
left_sorted = mergesort_iter(array_sliced[0..mid])
right_sorted = mergesort_iter(array_sliced[mid+1..-1])
return merge(left_sorted, right_sorted)
end
mergesort_iter(array)
end
```

The main procedure of interest here is `merge`

. Essentially, move along the two halves, while adding on the smaller value to the end of `res`

until we have exhausted one of the halves. At this point, simply pile on what remains of the other list. And, that’s MergeSort for you!

## What’s the point?

So, what is the benefit of implementing a significantly more complicated algorithm than good old BubbleSort? It is all in the numbers.

Without getting into the details too much (they involve solving a recurrence), we’ll say that mergesort is \[ O(n log_2{n}) \] It turns out that this is *a lot* better than \[O(n^2)\] Let’s make another simplification. Say MergeSort uses \[n /log_2(n)\] operations and BubbleSort uses \[n^2\] operations to sort an array of length `n`

If we set `n = 1,000,000`

then MergeSort requires roughly \[6 \cdot 10^7\] operations whereas BubbleSort weighs in 17,000 times worse at \[1 \cdot 10 ^ {12}\] operations.

## Divide and Conquer

MergeSort shows us a nice strategy: divide and conquer. It turns out there are a lot of problems which can be solved by breaking them into subproblems and the combining the resulting “subsolutions”. It takes skill and practice to identify cases where you can use divide and conquer, but MergeSort is a great example to use as a springboard.

The take home points are: If you have some sort of trivial base case (in MergeSort, a list of length 1) or are somehow able to break your problem down to this base case and have some kind of idea to combine solutions, divide and conquer might be an avenue to explore.

## Wasting Memory Like No Tomorrow

But wait, we can do even better than MergeSort! But, this time we’ll have to spend, um, a bit more memory. We’ll get straight to the Ruby:

```
def mark_sort(array)
array_max = array.max
array_min = array.min
markings = [0] * (array_max - array_min + 1)
array.each do |a|
markings[a - array_min] += 1
end
res = []
markings.length.times do |i|
markings[i].times do
res << i + array_min;
end
end
res
end
p mark_sort([3,2,19,18,17])
```

We have an array called `markings`

representing the number of times a certain number occurs in `array`

. Each index of `markings`

corresponds to a number in the range `array.max`

and `array.min`

, inclusive. We initialize `markings`

with 0s and walk through the array, incrementing a number’s marking when we see it occur in the array. Finally, just print out the numbers seen the right number of times. We’re making a few passes of the array, but we never make a “pass inside a pass”, i.e. a nested loop. So, our algorithm performs \[ c * n \] operations for some constant `c`

for an array of length `n`

We can say that it is \[O(n)\] in other words, blazing fast. But, it isn’t always useful, since we have to use a potentially giant piece of memory. For example, if we were given `[1,1000000]`

as an input, we’d have to allocate a million element array! But, if we know our ranges will be small or have lots of memory to waste, this algorithm can come in handy.

## Wrapping It Up

I hope you’ve enjoyed this tour of sorting algorithms in Ruby. Drop questions in the comments below!