Finally, a sorting algorithm that doesn't suck! We explore how merge sort works and why it performs better than insertion, bubble, and selection sort.
This episode of the Basecs podcast is based on Vaidehi's blog post, Making Sense of Merge Sort from her basecs blog series.
[00:00:02] (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, founder of CodeNewbie.
[00:00:09] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:11] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we’re talking about.
[00:00:16] VJ: Merge Sort.
[00:00:19] SY: This season of the Base.cs Podcast is brought to you by our wonderful friends at DigitalOcean. DigitalOcean provides the easiest cloud platform to deploy, manage, and scale applications of any size. They remove infrastructure friction and provide predictability so you can spend more time building what you love. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes.
[00:00:47] This season is also brought to you by the fine folks at Linode. If you’ve got a personal project, a small business or a big business with lots of data, Linode offers you secure hosting for all your infrastructure needs. They are a Linux cloud hosting provider where you can get a new server up and running in under a minute. Plans start at one gigabytes of RAM for just five bucks a month. And with the promo code CodeNewbie2019, you can get a $20 credit. So go to linode.com/codenewbie and give it a try. Also, they’re hiring. Check out their jobs at linode.com/careers. Link is in your show notes.
[00:01:25] SY: Okay, let’s get started. Okay. So we’ve talked about a bunch of different sorting algorithms. In the last couple episodes, we talked about insertion sort, selection sort, bubble sort. And even though they have three different names, they all have… why are you already laughing? Because you know what I’m going to say. They all suck.
[00:01:47] VJ: I know where this is going.
[00:01:48] SY: They all suck is where this is going. And we can quantify their suckage because they’re Big O notation. Their time complexity is quadratic, which is like the ultimate of suckies.
[00:02:01] VJ: Yeah. It’s pretty bad.
[00:02:02] SY: It’s really bad. It’s horrible. So today I’m hoping that merge sort, which is the newest addition to the sorting algorithm team, is going to be a different story. Is it going to be a different story, Vaidehi?
[00:02:14] VJ: What are you going to do if I say no?
[00:02:16] SY: I’m just sick and tired. Okay? That’s all, I am sick and I’m tired. Okay?
[00:02:19] VJ: This episode is just going to be Saron being like, “Let me tell you, all the ways these algorithms are terrible.”
[00:02:25] SY: Yes. There’s just one. It’s called, “They suck.”
[00:02:30] VJ: Well, the good news is it is not going to be quite the same as the other algorithms we’ve talked about.
[00:02:35] SY: Thanked God.
[00:02:36] VJ: It’s kind of like a new one for us because it is going to be the first non-suckage algorithm.
[00:02:42] SY: Great. Okay. So this is the non-suckage algorithm called merge sort. Okay. So what is merge sort? I think we’re still sorting things. I assume we’re merging things. What is this about?
[00:02:52] VJ: Yes, you’re right. We are going to sort things and we’re going to merge them. A merge sort algorithm is pretty unique in the way that it goes about sorting and unsorted collection. It splits an unsorted set into two halves and then it sorts the two halves and then it merges them back together to form one sorted collection and it uses recursion to do that.
[00:03:19] SY: Oh, I don’t think we have talked about recursion in the context of sorting algorithms, right?
[00:03:25] VJ: You’re right.
[00:03:26] SY: Okay. So that whole thing is like a new thing that we’re going to talk about today. Okay. Cool. Okay. So you said we’re going to split them in halves and then we’re going to merge them and then they have to do with recursion. Okay. Let’s start with, maybe we just start with the splitting. What exactly are we splitting?
[00:03:40] VJ: It’s this idea actually that is not even unique to merge sort. It’s part of like a design paradigm for algorithms and that just basically means it’s like a pattern and it ties into something called “divide and conquer” and the splitting that merge sort does is basically dividing a large problem, which is a large data set that you’re trying to sort, and splitting it up into smaller sub problems and then solving the sub problems and figuring out how to solve, you know, the mini versions of the problem and then applying that same solution to the larger problem.
[00:04:16] SY: Okay. So I get this idea of we’re going to break it up into little pieces. And you know it kind of, it makes sense. Like if I’m thinking about my laundry, I have like my big pile of clothes that I need to you know, fold or sort out and one of the things that I’ll do is, oh my God, I’ve been merge sorting for years, I’ll divide them into two pieces first, like two groups because it just feels easier.
[00:04:39] VJ: Lights and darks.
[00:04:39] SY: Yeah, lights and darks, yeah, for sure, and then also like lingerie and undergarments, I think, is the more…
[00:04:46] VJ: Delicates.
[00:04:47] SY: Delicates. There we go. Delicates. And then I have my other sort, which is like things that go on hangers, first things that are actually folded. So I kind of like roughly group them into these parts and then I’ll like go in with my folding, like I’ll like go in on one section at a time. So is that like kind of what we’re talking about when we talk about dividing and then applying things?
[00:05:11] VJ: It’s kind of similar except that instead of breaking things down by, I guess, category or genre, in the case of sorting when you’re working with sorting algorithms, you’re probably sorting some subset that’s similar because you need something to sort by. We’ve talked about this at previous episodes. But like let’s say for example numbers, we want to basically divide our data set into the smallest possible categories.
[00:05:35] SY: Okay.
[00:05:35] VJ: So in the case of numbers, what you basically are doing is you’re dividing a set of numbers down into their smallest parts.
[00:05:43] SY: Okay.
[00:05:44] VJ: What’s the smallest sorted subset of a group of numbers? Well, it’s just an individual number. So for example like if you have like eight numbers and you’re like, “Okay, I’m going to divide it down.” You can divide eight numbers into two groups of four and then you could divide each of those from four down into two and then you can divide pairs into their individual parts, and there’s nothing left to divide because it’s just one number. Well, that’s sorted by default. There’s nothing to compare it to. So now you’ve got your smallest sorted subset.
[00:06:14] SY: Little units. Yeah. It’s weird to think of a set as just having one number in it because you think of set as like having lots of stuff, but you can have a set that is only one thing.
[00:06:27] VJ: Yes. And actually that’s probably my favorite kind of set because it’s already sorted by default.
[00:06:32] SY: Oh!
[00:06:33] VJ: Yeah,
[00:06:34] SY: Look at that. We’re basically done. We’re basically right at the end of this. Okay. So we talked about dividing and conquering. So we talked about the dividing parts. We take for example our list of eight numbers, break it down to four each, then break each for down to two-two then one-one. So we have our individual units. Okay, now that’s the dividing, now the conquering. How are we conquering these numbers?
[00:07:01] VJ: Well, once you’ve broken everything down into your, let’s say, eight numbers right, you’re going to end up after your base case. Your most divided case is going to be eight sorted list of one number each.
[00:07:12] SY: Yeah.
[00:07:13] VJ: Well, the conquering is basically the way that we’re going to do the sorting and we’re going to conquer. We’re going to sort our data set just a little piece at a time. So we’re going to solve the sub problems first and then we’ll build our way up into one mega problem that we’ve sold. And when I say solve the problem, what I’m really talking about is sorting because that is the problem we’re trying to solve. So what we want to do is we’re going to sort a few items at a time because we have you know, eight elements that are sorted already, we want to combine them back together and sort them, and this is where the merge comes in. The conquering is really the merging of elements together. So we’re going to take our eight sorted elements, which is our eighth sorted individual numbers, our eight sorted lists, and we’re going to merge them so that we have four sorted list.
[00:08:07] SY: Okay. I feel like eight numbers is maybe a little too much to keep in my head. Can we do like just four numbers?
[00:08:12] VJ: Yeah.
[00:08:12] SY: Let’s start with a list of four. Okay.
[00:08:14] VJ: Yeah.
[00:08:14] SY: So if we have four numbers, let’s just do like a reverse list, like 4, 3, 2, 1.
[00:08:19] VJ: Sure.
[00:08:20] SY: So if we run through the divide step, we’ll break that up into two groups so we have 4, 3 and then 2, 1, and then we break those sub lists up even further and now we have a list of four, list three, list with two, list with one. So we have one number per list and we have four lists.
[00:08:40] VJ: Each of these lists remember are sorted. That’s important.
[00:08:42] SY: And each of them are individually sorted. So 4 is sorted. It knows who it is. It knows where it needs to be. It is sorted within itself.
[00:08:50] VJ: It’s exactly where it needs to be.
[00:08:51] SY: It’s exactly where it needs to be. Four always knows what’s up. Okay. So we just did the dividing. So we have our 4, our 3, our 2, and our 1. So now when we talk about merging them like one bit at a time, what does that mean? What do we do now?
[00:09:09] VJ: Well, we’re going to slowly build up our set as we sort it. So we’re not going to try to put 4, 3, 2, 1 in the correct order all at once.
[00:09:18] SY: No.
[00:09:19] VJ: Instead we’re just going to take two lists. So right now we’re just talking about two individual numbers and we’re going to merge them together.
[00:09:25] SY: Okay.
[00:09:27] VJ: So let’s just focus on the first two. We have two sorted lists and we want to merge them and sort them at the same time. So in this case, we have the number 4, which is one list, and we have the number 3, which is another list.
[00:09:39] SY: Okay.
[00:09:40] VJ: We’re going to take the two and we’re going to say, “Okay, I’m going to look at the first element in each list and I’m going to compare them and based on whatever their value is I’m going to create a new list, which is going to be their new merged home, and I’m going to put them in the right place in that list.
[00:09:56] SY: Oh, okay.
[00:09:57] VJ: So in this case, 3 is smaller than 4. So I would say, “Okay, I have 4 and I have 3. Well, 3 is smaller. I’m going to put 3 of the front of my new list and 4 behind it.” So now I have a sorted new merged list, which is 3, 4.
[00:10:12] SY: Okay. Can I do the next one?
[00:10:14] VJ: Yeah, go for it.
[00:10:15] SY: So now that we have merged and sorted the list 4 and the list 3, we have two little sub lists left which are list 2 and list 1. So I’m going to take these two individual already sorted lists and we’re going to compare what’s in them and put them in a brand new array. So I’m going to compare the 2 and the 1. My 1 is smaller than my 2. So in my new array, I’m going to first insert the 1 and then after that I’m going to insert the 2. So now I have a brand new array that reads 1, 2.
[00:10:54] VJ: Yup.
[00:10:56] SY: Oh, Look at me. sortin’ and mergin’, sortin’ and mergin’.
[00:11:01] VJ: And now we have two lists. Remember we started with four sorted list?
[00:11:03] SY: That’s true.
[00:11:04] VJ: Now we have two sorted lists. And in each of them, if we just look at the list individually, the elements are sorted and we’ve gone from half as many lists. You can assume that the next thing we’ll do is convert it into one list because we’re merging everything and we’re instead of dividing everything we’re combining it.
[00:11:23] SY: We’re kind of reversing that we’re like sorting as we go.
[00:11:25] VJ: Yes, exactly.
[00:11:26] SY: Okay. So now I have two lists and each list is sorted. So the first list I have is 3, 4, the second list I have is 1, 2. Now how do we bring those two together?
[00:11:41] VJ: Well, you’re going to apply the same exact technique that we just did for the small problems, the sub list. I think once we start stepping through it’ll become clear that we are just repeating the same steps.
[00:11:52] SY: I will be honest. What I want to do I want to smoosh them together because I’m looking at it and I’m like, “Okay, there’s a 1 and a 2 and the next one is the 3 and 4. Obviously, you just put the 3 in the 4, glue it onto the 1 and the 2 and you got a 1, 2, 3, 4.
[00:12:06] VJ: Well, what if it was 1 to 5 and then you would have 1, 2, 5, 3, 4?
[00:12:10] SY: Oh, no, no, no, that would not work. Okay. So no smooshing?
[00:12:13] VJ: Yeah. We can’t just smoosh. Well, what we want to do is we just want to reapply the same methods that we’ve been doing. So we know that we’re going to combine these two lists that are sorted and what we did before is we created like a new little list where we were going to sort things accordingly and that was going to be the new home for all the numbers.
[00:12:33] SY: Oh, we’re building them little houses they go.
[00:12:35] VJ: Yeah.
[00:12:35] SY: I like that. Okay, Okay.
[00:12:37] VJ: So we’re going to create like a temporary list. It’s the home where all these numbers are going to go.
[00:12:43] SY: Okay.
[00:12:43] VJ: So since we’ve had two lists of two numbers each, this time it’s going to be one list that’s going to house four numbers. So we’re going to look at both of our list and previously we only had one number in there to compare so we would just kind of be like, “Okay, I could just look at the two and compare.” But if you remember, what I actually said is I was going to look at the first number in the list. It just so happened that there was only one number.
[00:13:06] SY: You did say that.
[00:13:07] VJ: In this case, we have more than one number. So we still need to look at the first number in the list. So in this situation, we’re going to look at the left list and the right list.
[00:13:15] SY: Okay.
[00:13:16] VJ: In the left list, we have 3, 4. In the right list, we have 1, 2. So we’re going to look at the first number of each. So in the left list, if we imagine like a pointer or a little sticky note, the number that I’m looking at right now, the number in question is 3 and in the right list the number in question is 1. Which one is smaller? I’ve got two lists and I’m looking the first number. So whichever one is the first here has to be the first of all the numbers here.
[00:13:40] SY: That’s true. Yeah. Yeah. Yeah. Because they’re already sorted.
[00:13:43] VJ: Yeah. We’re operating under the assumption that it’s sorted and now we’ve got so much more like control because there are certain things we can assume to be true and they will be. So in this case, we’re like, “Okay, 3 and 1. My pointer is at 3 and my pointer is that 1. Well, 1 is smaller. So I know 1 is going to be the first element in my new temporary array house list thing.
[00:14:03] SY: Okay. That makes sense.
[00:14:06] VJ: I still don’t know what to do with 3. I still know 3 is at the front of that list and I know it’s bigger than 1, but that’s not that much to go on. So I’m just going to keep my pointer at 3 for now.
[00:14:16] SY: That’s true. Okay. Okay.
[00:14:17] VJ: But I have another element in my second list.
[00:14:20] SY: Right, in that right list.
[00:14:22] VJ: You are right. It is the right. So now I have a pointer at my right list and I’m looking at the next number, which is the number 2. So I’m like, “Okay, 3 in my left list, 2 in my right list.”
[00:14:37] SY: Right, because you’re still staying on 3 because you haven’t done anything with it yet.
[00:14:42] VJ: Yeah. I didn’t know where to sort it. I didn’t find the right place. I knew not to put it in first because 1 had to go first. But I was like, “All right, I don’t know what to do with you so I’m just going to keep holding on to you for a second till I find your right home in this array.” So now I’ve incremented over to the 2 and now I’m like, “Okay, well, I’m looking at the next number. Is 2 smaller than or greater than 3? Well, it’s smaller than 3 and it’s greater than 1 so it goes next in the list.”
[00:15:07] SY: So in that new home that we’re building, we have 1 then we have 2, and back in our left array, we’re still pointing at 3 and we have a 4 there too by the way. We haven’t even gotten to 4 yet. And then in my right array, that’s empty because we’ve already moved the 1 and the 2 out. They’ve already moved out and they’re in their new big home already.
[00:15:33] VJ: Yup.
[00:15:34] SY: Okay. Okay.
[00:15:35] VJ: So you might be able to even see what happens next.
[00:15:36] SY: Yeah, because the left array is already sorted. It’s how we ended up with that array to begin with. So we can just stick the 3 after the 2 and then stick the 4 after the 3, like we don’t even really have to do any comparing because we already know that they’re sorted.
[00:15:52] VJ: Yup. That’s true.
[00:15:52] SY: Nice. That feels so much easier than what we were doing for the last three episodes. Because with the last three algorithms, I feel like we were just repeating so much and just going back and forth and just re-shifting and like just shifting it over.
[00:16:15] VJ: I think we spent like a good 12 minutes one episode just going through one iteration and then we’re not even done.
[00:16:20] SY: But we’re done now, like we’re actually fully done.
[00:16:25] VJ: Yeah. We have a sorted array and we had four unsorted elements. What did we have to do really?
[00:16:32] SY: Really nothing. We basically just hung out. We made some homes. We have some lemonade.
[00:16:36] VJ: We have laundry. Well, all we really had to do is divide everything up to its smallest part, which wasn’t that hard.
[00:16:43] SY: Yeah.
[00:16:44] VJ: And then we did three things. Like we, we took, Oh, no, we did only two things. Sorry. We did two things.
[00:16:50] SY: Yeah.
[00:16:52] VJ: So we had four elements and we did two steps. So we did two merges. We merged four elements into two lists and we sorted as we did that and then we turned two lists into one list and we sorted as we did that.
[00:17:05] SY: That’s true.
[00:17:06] VJ: So we did two merges for four elements.
[00:17:08] SY: We multitask very well. We did not multitask at all in those stupid algorithms.
[00:17:14] VJ: No. We spent an inordinate amount of time doing I feel like simple things. You know why?
[00:17:22] SY: Yes.
[00:17:23] VJ: It’s because we did like this like loop within a loop and we are like, “First let me go through the whole list and then I’m going to look at the first element and then for the first element, I’m going to compare it to all the other elements. All right. Now I need to go to the next element because I’m going through the whole list and now I’m going to compare the second element to all the other elements,” and that was a nested loop and it was bad and it was not fun and like let’s just delete those three episodes from our memory.
[00:17:47] SY: Yeah. And you know what? It also felt simple as one way to describe it but also like boring is another one, it felt like it was like an uninteresting activity where this it’s like, “Oh, I have my two pointers, I’m going to stay at one, I’m going to move the other one.” It just felt like a more fun a more hip, like the cool aunt you know. It just felt cooler.
[00:18:12] VJ: Yeah. I think it’s the recursion. I think that’s what recursion does to you, like you can smell it from afar where you’re like, “Is that recursion I smell? Recursion has been in this room.”
[00:18:24] SY: It’s coming! Yay! But yeah, because you did mention earlier like at the very top of the episode you said, “Oh, it’s going to be recursive.” So like that’s going to help and I guess this is where it comes in, this whole like we’re dividing and conquering and dividing and then merging and sorting and merging, like it feels like we’re doing the same steps within itself, if that makes sense.
[00:18:45] VJ: Yeah, and if you’re coding this too, like you would actually end up repeating the same steps whether the list was eight elements or four elements or even if you get down to the base single element per list and you’re down to the sorted element and you’re like, “Oh, I’m going to merge two elements together. That’s easy.” Well, it is easy. It’s not that difficult. And so if you can do it with two elements, then you can also do it with two elements that are you know, six sorted numbers because all you’re really doing is you’re dealing with lists that you know are sorted.
[00:19:19] SY: Yeah.
[00:19:20] VJ: It just so happens that the list that you know is sorted at the base case is one number, but you can apply the same formula. So if you were writing this and you were implementing it in code, you’d probably have like a merge sort function right? And that would do the job of like, “Oh, I’m going to take this input array and I’m going to split it and I’m going to do the job of divide, divide, divide, divide, divide all the way down recursively into my smallest little bit.” And then you’d probably have a merge function whose only job is to combine and it’s going to build it up, build it up, build it up, and sort while it goes along and does that.
[00:19:54] SY: Yeah.
[00:19:56] VJ: So you’d have a merge sort function that calls merge sort from within itself because all it’s doing is it’s just doing the same steps whether or not it’s dealing with a single element sorted list or a sorted list 12 different numbers. It doesn’t really make a difference. It’s doing the same steps. So if we walk through an example of four numbers, but if we walk through an example with eight numbers, you would have done the same thing.
[00:20:22] SY: Yeah, broke it up all the way down to its pieces and then kept pairing off the broken down pieces until we got all the way up to our one new list and like our one new home for our numbers.
[00:20:32] VJ: Exactly.
[00:20:33] SY: Okay. Interesting. Okay. So I assume because of all of this recursion and because I like this algorithm better, I’m going to assume that this is not… the time complexity is not quadratic, right?
[00:20:46] VJ: Yeah. So you’re right. That’s like also part of the reason that I think you have a better feeling about this algorithm where you were like, “Oh, that didn’t take that long and we didn’t spend the whole episode just going over one loop, one iteration through the set of elements.” It was more efficient and the time complexity of this algorithm reflects that too. Like if you start thinking about how many steps we actually took and if you doubled the number of elements, how many steps we’d actually have to take in that situation? This algorithm ends up being linearithmic. Another way you can call it is like O(n log n).
[00:21:28] SY: Oh, that sounds complicated.
[00:21:29] VJ: It sounds complicated, but we can basically think of it as like not quite linear, a little bit slower than linear and not as fast as logarithmic because we’ll remember that logarithmic is faster than linear. So we call the Big O of this algorithm linearithmic, which we can also refer to it as O(n log n). I know it sounds weird, but basically all it means is it’s a combination of linear time and logarithmic time is how I kind of think about it. It’s not quite logarithmic because we know logarithmic is pretty darn good. If you mapped logarithmic and quadratic, that kind of the opposites. So for as much as we hate quadratic, logarithmic is pretty good.
[00:22:15] SY: And quadratic is a nice, not a nice, a terrible, a horrific U-shape.
[00:22:20] VJ: The worst.
[00:22:22] SY: The worst shape ever.
[00:22:23] VJ: You see that U? You better run away.
[00:22:27] SY: There you go. There you go.
[00:22:30] VJ: So it is not quadratic. It’s kind of like the reflection of quadratic if you graphed it. So logarithmic is good. It’s good if you can try to achieve a logarithmic runtime. Linear is slower than logarithmic, and if you visualize it, you can just basically imagine that as you have more elements, it takes linearly more time or space whatever you’re talking about. So linear is slower than logarithmic and linearithmic is slower than linear and logarithmic, but it’s not like that bad.
[00:23:04] SY: Okay.
[00:23:05] VJ: It’s not that much worse. Quadratic is like, “Oh my God, you were off in the Sahara Desert somewhere.”
[00:23:09] SY: Okay. Got it. So technically if we go from worst to best, quadratic would be the worst. So linearithmic is technically the second worst, but it’s not really that bad because on a graph it looks kind of pretty close to linear is what we’re saying.
[00:23:30] VJ: Yes, but I do want to point out that it’s only the second worst to us because it’s not like we’ve gone through every single type of Big O notation so far.
[00:23:40] SY: Okay. Fair, fair, fair.
[00:23:42] VJ: There are definitely things that are worse than linearithmic.
[00:23:44] SY: We just haven’t talked about it yet.
[00:23:46] SY: Yeah, we haven’t talked about. So we’ve talked about quadratic Big O notation, we’ve talked about logarithmic, we’ve talked about linear, and linearithmic time is worse. It’s not as bad. And depending on the type of linearithmic time, sometimes it can be like a little bit slow. Sometimes it can be kind of like, “Oh, that’s like fair. It’s fine.”
[00:24:09] VJ: The one thing I do want to say is that it is an improvement from quadratic which was very bad. It is not as good as linear and logarithmic which we know to be pretty darn excellent.
[00:24:18] SY: Okay. So we talked about the shape of linearithmic and how it compares to quadratic. And actually if you read the blog post, you can get even more in-depth and talk about why it is n times log of n and where that log of n comes from. So if you’re interested in that part of things, make sure to check out Vaidehi’s blog post.
[00:24:37] VJ: And if you need a little bit of a refresher on some other math stuff that we’ve talked about in this show specifically logarithms, we did an episode on that back in Season 2. I think it’s Season 2, Episode 7. And if you want to know why we keep talking about logarithms and what those exactly are, they tie into the math behind the Big O notation of this algorithm. You can go back and refresh your memory by listening to that episode.
[00:25:01] SY: And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link is in your show notes. Also want to give a huge thank you to Linode for sponsoring the show. Try their powerful servers built for all your infrastructure needs. Getting started with a shiny new server takes less than one minute so you can get up and running in no time. Get a $20 credit by using the promo code CodeNewbie2019. Just go to linode.com/codenewbie for more info. Also want to give a huge shout-out to DigitalOcean. They are the easiest way to deploy, manage, and scale your application. Everything about it was built with simplicity at the forefront; setting, deploying, even billing. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes. Vaidehi, you want to say goodbye?
[00:25:53] VJ: Bye everyone!
[00:25:54] SY: Thanks for listening. See you next week.
Thank you for supporting the show!