Merge Sort

by Mirabelle Jones

Now that we’ve had a look at some of the more basic sorting algorithms (bubble sort, selection sort, and insertion sort) which might be OK for smaller datasets but aren’t super efficient in general, let’s start to have a look at some of the more advanced algorithms for sorting. These algorithms (including merge sort, quick sort, and radix sort) are more difficult to explain as well as to code, but our goal isn’t to be able to write them from scratch right away! We really just want to see if we can understand the logic of how they work, exploring what is done in each step as well as getting an overall understanding. If the following seems like a jump from the last sections in terms of being a bit harder, it is (just a bit)! But we’ll go slowly and explain things piece by piece.

Our first more complex sorting algorithm is merge sort. Merge sort (as the name implies) focuses on two processes: merging and sorting. With merge sort we split up our data and then merge it in order to sort it. In other words, we’ll be using a system of “divide and conquer” to split up the array into smaller arrays. Before we go further, take a minute to ponder how we might have used an approach like this in some of our previous examples (Legos, Russian nesting dolls, etc.).

If you think back to insertion sort, you might remember the fact that we started our comparison with the second item in the array because we assumed the first item was “sorted.” That was a little strange, right? But it also makes sense because any array of 1 or 0 might as well be considered “sorted.” You can’t compare it to anything else if the array only has 1 or 0 elements! So why am I telling you this? Is it so that the next time you topple a box of Legos you can just put each Lego in its own pile or 1 and declare it sorted? No! I’m telling you this because merge sort exploits this pretty weird fact to do something kind of amazing.

First, merge sort takes our big ol’ array (list of numbers, Russian nesting dolls, Legos, etc.) and breaks it down into smaller arrays until it has arrays the of size 1. It then re-creates arrays of 2 by sorting and merging paired elements together. Because the first element is already sorted, this is easy (for a computer, a little strange for a human!) Next, it merges its arrays of 2, to create sorted arrays of 4. It continues this until everything is merged. The result is a nice n’ tidy sorted array.

If you’re scratching your head trying to imagine what that process looks like, you’re not alone in finding it a bit confusing! Have a look at the visualization below and read over the steps once again if you need a reminder of what each step is doing:

It’s not a very human approach. But it IS fast! Have a look at the following comparison between merge sort and our previous sorting methods (plus shell sort which we haven’t discussed yet) to see how it performs on a variety of arrays.

Comparison graphic from: https://www.toptal.com/developers/sorting-algorithms

Wow! Merge sort is pretty dang fast. But why is it so much faster than our other algorithms? To find out, let’s dive a little deeper using a scenario.

Scenario

Let’s say we have a similar scenario to our Lego situation only this time you accidentally knock over a box of sewing needles of various sizes in your workshop dumping them all over your desk so they’re all mixed up:

We have all been here. And it sucks.

Oof! I can feel our collective anxiety just ticked up a notch! Now, you are the kind of person who likes a well-organized sewing station, so you determine to put the sewing needles back in order. You start by isolating each nail but it’s too hard to see what size it is just by eye-balling it. So then you try putting one needle next to another. By laying two needles side by side, it’s easy to see which is longer and which is shorter:

You decide to be cool like merge sort and put them in piles of two with the shorter needle on the left and the longer needle on the right. You continue to do this for all your needles. Next, you put them in groups of 4 following the same logic (comparing pairs) so that all the shorter needles end up on the left and longer ones end up on the right. You continue to do this with larger groups of needles until you finally you end up with your sorted array:

In our pseudocode (i.e. not fully working code), we’ll see different functions for merging and sorting which makes it look more complicated than it actually is. Let’s start with the pseudocode for our merge function:

function merge(pile1, pile2){
    results = [];
    item1 = 0;
    item2 = 0;
    while(item1 < pile1.length and item2 < pile2.length){
        if(pile2[item1] > pile1[item2]){
            results.push(pile1[item1]);
            item1++;
        } else {
            results.push(pile2[item2])
            item2++;
        }
    }
    while(item1 < pile1.length) {
        results.push(pile1[item1])
        item1++;
    }
    while(item2 < pile2.length) {
        results.push(pile2[item2])
        item2++;
    }
    return results;
}

Dang! That seems like a lot of steps for just putting two things together, huh? Let’s break this down.

We create our function which needs to know which two arrays we’re going to merge, so we give it pile1 and pile2 (of our needles, Legos, etc.) as parameters:

function merge(pile1, pile2){

We create an empty array which will hold our merged arrays in the end.

results = [];

Then we create two variables to help us look through our two arrays. One will be called item one because we’re looking at an item from array one and the next will be called item two because we’ll be comparing it to an item in array two. We set them both to zero because we want to start at the first number in our array and computers always start counting from 0.

item1 = 0;
item2 = 0;

Next, we’ve got a while loop which checks to see if we haven’t gone past the length of either of our arrays.

while(item1 < pile1.length and item2 < pile2.length){

Inside the while look, we check to see if the item we’re examining from pile 2 is bigger than the item in pile 1. If it is, we push this item into our results array and move on to the next item in pile 1. If it’s not, we push the item from pile 2 into our results array and move on to the next item in pile 2.

if(pile2[item2] > pile1[item1]){
            results.push(pile1[item1]);
            item1++;
        } else {
            results.push(pile2[item2])
            item2++;

You’d think this would be the end of it right? We just keep going until all the items from pile 1 and pile 2 have been compared, putting the smaller item in our results array so it gets sorted, yes? Well, no. What if we have two arrays that are of different sizes? That means there will be some extra numbers that don’t get compared because there’s nothing to compare them to! For this reason, outside of our while loop we have the following two while loops:

 while(item1 < pile1.length) {
        results.push(pile1[item1])
        item1++;
    }
    while(item2 < pile2.length) {
        results.push(pile2[item2])
        item2++;

Finally, we return our merged megapile aka: results.

return results;

So that’s how our merge function works. But what about the merge sort function itself? That’s a little trickier to explain, but it might look like this:

function mergeSort(arr){
    if(arr.length <= 1) return arr;
    let mid = Math.floor(arr.length/2);
    let left = mergeSort(arr.slice(0,mid));
    let right = mergeSort(arr.slice(mid));
    return merge(left, right);
}

Here we declare our mergeSort function and feed it an array of data we’d like sorted as a parameter.

function mergeSort(arr){

Next we want to see if our array is less than or equal to 1, and if it is we don’t really need to sort it, so we just return it (and we’re done).

if(arr.length <= 1) return arr;

But probably our array isn’t 0 or 1, so let’s move on to the next step which is setting a mid point then dividing our array into left and right:

  let mid = Math.floor(arr.length/2);
  let left = mergeSort(arr.slice(0,mid));
  let right = mergeSort(arr.slice(mid));

To find our middle, we just divide the length of our array in half. Then we divide the array into two by creating a slice from 0 to our midpoint which we call left, and a slice from the midpoint to the end which we call right. On this slice, we want to also perform a mergesort! Which means the function is calling itself. This is perhaps a little confusing if it’s the first time you’ve encountered a function that calls itself (which we call a “recursive function”*), but in a nutshell it makes sense based on what we’ve seen of mergesort already. You can think of it simply as the function continuing to merge and sort until it only has two arrays. Finally we merge our data:

return merge(left, right);

*A cool visual example of recursion which I find helpful if you want to understand the concept more thoroughly is here but you also probably don’t have to entirely understand it to still understand merge sort in essence:

If you’d like to learn more about merge sort including how to implement it in real code, check out the sections below.

Big O Complexity

Time Complexity: O(n log n)

Space Complexity: O(n)

Visualizations

Audio-visualization by Timo Bingmann Warning: kinda loud n’ flashy!

Pseudocode

1. Break your array in half and then break those halves in half and so on until you have arrays of 1 element. 
2. Merge the arrays of 1 element into sorted pairs.  
3. Merge and sort these pairs into arrays of 4.  
4. Continue to merge and sort arrays until everything is sorted. 

Code Examples

Javascript

// Merge function
function merge(arr1, arr2){
    let results = [];
    let i = 0;
    let j = 0;
    while(i < arr1.length && j < arr2.length){
        if(arr2[j] > arr1[i]){
            results.push(arr1[i]);
            i++;
        } else {
            results.push(arr2[j])
            j++;
        }
    }
    while(i < arr1.length) {
        results.push(arr1[i])
        i++;
    }
    while(j < arr2.length) {
        results.push(arr2[j])
        j++;
    }
    return results;
}

// Merge Sort
function mergeSort(arr){
    if(arr.length <= 1) return arr;
    let mid = Math.floor(arr.length/2);
    let left = mergeSort(arr.slice(0,mid));
    let right = mergeSort(arr.slice(mid));
    return merge(left, sright);
}

mergeSort([100, 22, 716, 3, 1, 703])

Sample Problems

Come back soon for these!

More Resources