Sorting Algorithms: Merge Sort
Let’s Sort! In this series, I’ll be covering sorting algorithms. Most programming languages will come with a built-in sort function but in order to write better code, you need to know what’s going on in the background. If you are preparing for software engineering interviews, it’s very likely that one or more sorting algorithms may come up during your interviews. The sorting algorithms I’ll be covering in this series are:
- Bubble Sort
- Insertion Sort
- Selection Sort
- Merge Sort
- Quick Sort
- Radix/Bucket Sort
- Heap Sort
What is Merge Sort?
Merge Sort is an efficient algorithm used to sort lists of values. Understandably, to obtain this efficiency, Merge Sort is a bit more complex than other sorting sorting algorithms like Bubble Sort, Insertion Sort, and Selection Sort.
How does it Work?
Merge Sort is a Divide-and-Conquer algorithm. We take our input array, divide it in half, and then sort each of those halves and join them back together. How do you sort the halves? You divide them again. And again. Until you reach the point where you cannot divide halves any further and you only have two values to compare.
Comparing two values is easy right? Once you compare the two values, we can merge them together in the right order!
Let’s go back to the array example that we’ve used in our previous sections:
Let’s split our input array in half:
We are going to work from left to right so let’s split the left half again:
We now have two subarrays [6, 3] and [9, 8]. Let’s continue work on the left half. Divide [6, 3] into subarrays:
We now have two subarrays of length 1. This is where the second part of merge sort occurs; comparing two values and sorting them. Out of the two subarrays we are comparing, we are going to create a new array that is sorted.
Compare the two numbers: 6 is greater than 3 so 3 is first and 6 is second.
The left portion is sorted in the new array so let’s now look at the right half:
9 is greater than 8 so in our new sorted array, 8 is first and 9 is second:
We now have two sorted subarrays. We can combine them into an array of length 4 by comparing the first value in each array. We set two pointers at the beginning of each subarray and compare values. The reason we do this is because we know that at the start of each sorted subarray, we will have the smallest values:
3 is smaller than 8 so we start a new array with 3. We move the pointer at 3 to the right.
Let’s now look at 6 and 8. 6 is smaller than 8 so we add 6 to the new sorted array (in green).
Our pointer moves to the left of 6 and we find that we are now outside the length of one of the subarrays so we take whatever is left in the other array and add it to our sorted array in green:
Now that the left half of our original array is sorted, let’s turn to the right half. We have to split this array into two subarrays so let’s do that first:
Now we have two subarrays [4, 6] and [5, 2]. Lets’s divide the first half [4, 6] into subarrays:
We now have two subarrays of size one: [4] and [6]. Compare the two values and create a new array with the values in sorted order:
Great. Now let’s turn our attention to the right subarray [5, 2] and divide it:
We have two subarrays of size one: [5] and [2]. Compare the two values and sort them (merge them) in a new array:
Great. We’ve got two sorted arrays. Let’s merge these together in a new array using the pointer technique like before:
2 is less than 4, so our new array will start with 2. The pointer at 2 will move on to the next value in that array:
4 is less than 5. We add 4 to our new array and move the pointer at 4 to the next value:
Our pointers are now on 6 and 5. 5 is less than 6 so we add 5 to our new array and move the pointer to its left:
The pointer in the right array is now pointing outside that array. Let’s add whatever is left in the left array to the new sorted array:
We’re done with the right side. Now we’ve got two sorted arrays and we want to get to the final sorted array. We’re going to use the pointer technique; this time on the two halves in green to get our final array:
2 is less than 3. We create a new array and place 2 as the first value. The pointer at 2 moves to the right:
Compare again: 3 is less than 4. We add 3 to the new array and move the pointer at 3 to the right.
4 is less than 6:
5 is less than 6:
6 and 6 are equal. Let’s put both values in our answer array and move both pointers forward:
Our pointer in the right array is now outside the array. Let’s add whatever is in our left array to our final array:
We now have a sorted array!
Time and Space Complexity
To sort an array with Merge Sort, you must continue to divide the problem into subproblems until your subproblems (your base case) are of size 1. Once you have reached that point, you need to merge those values into a new array. Continue to merge arrays back until you have a new array the same size as the original array but in sorted order.
When we split the original array into subarrays and eventually return a new array of size n. Our space complexity will be O(n) just to hold the values of the new array. We also have to take into account that all of the actions of splitting and merging that happen before we reach the final result. Each call to merge sort that causes whatever part we are working on to be split in two and make new arrays to sort those subarrays. We know that it takes O(n) to divide and eventually merge the full array. If you split that array in half, it will take O(n/2) to divide and merge that half and O(n/2) to divide and merge the other half. We continue to split those arrays as well O(n/4). If you add the summation of every level of dividing and merging, you will have an O(nlog(n)) space complexity for this solution.
For time complexity, you can look at your operations in the same way. At the first step of dividing the full array and the eventual merging all the values into a new array, it will take O(n).
After dividing the array in halves, each half will be n/2 in size and will take O(n/2). Dividing those halves into halves, will give you O(n/4), and so on. The summation of every level of division and merge will give you a time complexity of O(nlog(n)).
What if we wanted to sort in-place?
Well, we can do that too. There is another method to use an auxiliary array (a copy of the original array) to help with sorting in-place, which would give a solution with the same time complexity and O(n) auxiliary space. Another solution can give you O(1) space. I won’t cover these solutions here but you can have a look here and here to understand how they work.
Generally speaking, Merge Sort has a Time Complexity of O(nlog(n)) and a space complexity of O(n).
Code Implementation
Let’s look at Merge Sort in Python:
We’re going to implement Merge Sort using Recursion. As with any recursive algorithm, we need to set up a base case. In Merge Sort, the base case is that the length or size of the array is 1. All we want to do in our base case is return the array as is if it’s length is 1.
def mergeSort(array):
if len(array) == 1:
return array
If we are not dealing with a base case, we want to divide our array in half. The way we do this is by finding the midpoint. We use the //
operator in Python to give us the floor of whatever the midpoint is:
def mergeSort(array):
if len(array) == 1:
return array
midpoint = len(array) // 2
We then create our left and right subarrays:
def mergeSort(array):
if len(array) == 1:
return array
midpoint = len(array) // 2
leftHalf = array[ :midpoint]
rightHalf = array[midpoint: ]
We now want to call mergeSort
on the two halves and then merge them together. We’re going to use a helper method to do the merging. Let’s call mergeSort
on each half and pass the return values to our helper method mergeHelper
:
def mergeSort(array):
if len(array) == 1:
return array
midpoint = len(array) // 2
leftHalf = array[ :midpoint]
rightHalf = array[midpoint: ]
return mergeHelper(mergeSort(leftHalf), mergeSort(rightHalf))
let’s create our mergeHelper
method. We need a sorted array and its length will be the size of combining both halves together. We can set all the values in the sorted array as None
as a start:
def mergeHelper(leftHalf, rightHalf):
sortedArray = [None] * (len(leftHalf) + len(rightHalf))
We need to set up some pointers. Pointer i
will point at the start of the first half, pointer j
will start at the second half, and we have a third pointer, pointer k
, that will keep track of where we are in our sorted array. They all start at index 0 of their respective arrays:
def mergeHelper(leftHalf, rightHalf):
sortedArray = [None] * (len(leftHalf) + len(rightHalf))
i = j = k = 0
So while pointer i
and pointer j
are within their arrays, we want to compare the values that these pointers point to. Which ever value is less will be inserted into the sorted array and the pointer will move forward one step. We also move pointer k one step so we know what is the next value in the sorted array we are trying to fill.
def mergeHelper(leftHalf, rightHalf):
sortedArray = [None] * (len(leftHalf) + len(rightHalf))
i = j = k = 0
while i < len(leftHalf) and j < len(rightHalf):
if leftHalf[i] < rightHalf[j]:
sortedArray[k] = leftHalf[i]
i += 1
else:
sortedArray[k] = rightHalf[j]
j += 1
k += 1
Once the while loop ends, we would have reached the end of one of the arrays. What’s left is to add whatever remains in the array that we haven’t reached the end yet. Finally, we return our sorted array:
def mergeHelper(leftHalf, rightHalf):
sortedArray = [None] * (len(leftHalf) + len(rightHalf))
i = j = k = 0
while i < len(leftHalf) and j < len(rightHalf):
if leftHalf[i] < rightHalf[j]:
sortedArray[k] = leftHalf[i]
i += 1
else:
sortedArray[k] = rightHalf[j]
j += 1
k += 1
while i < len(leftHalf):
sortedArray[k] = leftHalf[i]
i += 1
k += 1 while i < len(rightHalf):
sortedArray[k] = rightHalf[j]
j+= 1
k += 1
return sortedArray
Here is our final mergeSort algorithm:
def mergeSort(array):
if len(array) == 1:
return array
midpoint = len(array) // 2
leftHalf = array[ :midpoint]
rightHalf = array[midpoint: ]
return mergeHelper(mergeSort(leftHalf), mergeSort(rightHalf))def mergeHelper(leftHalf, rightHalf):
sortedArray = [None] * (len(leftHalf) + len(rightHalf))
i = j = k = 0
while i < len(leftHalf) and j < len(rightHalf):
if leftHalf[i] < rightHalf[j]:
sortedArray[k] = leftHalf[i]
i += 1
else:
sortedArray[k] = rightHalf[j]
j += 1
k += 1
while i < len(leftHalf):
sortedArray[k] = leftHalf[i]
i += 1
k += 1while i < len(rightHalf):
sortedArray[k] = rightHalf[j]
j+= 1
k += 1
return sortedArray
And that is Merge Sort. It’s a very useful algorithm to sort values in an efficient way. It is key you understand it to be able to tackle other divide and conquer algorithms. Again, Merge Sort can be implemented in O(n) space and O(1) space and I encourage you to have a look at those implementations.
Thank you for reading!