We dig into how insertion sort works, how we know where to do our inserting, and how this sorting algorithm performs, all with the help of our new boos.
This episode of the Basecs podcast is based on Vaidehi's blog post, Inching Towards Insertion Sort from her basecs blog series.
[00:00:00] SY: Okay. I’ve got a very exciting announcement. We finally have a date for Codeland, our annual conference. It’s happening July 22nd in New York City. Early bird tickets are now on sale. For just 99 bucks, you get breakfast, lunch, dinner, a coding workshop, technical talks and after-party, and of course an amazing community. Sale ends March 10th. Go to codelandconf.com for more info. Link is in your show notes.
[00:00:29] (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, founder of CodeNewbie.
[00:00:36] VJ: And I’m Vaidehi Joshi, Author and Developer.
[00:00:39] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we’re talking about.
[00:00:44] VJ: Insertion Sort.
[00:00:46] SY: This season of the Base.cs Podcast is brought to you by our wonderful friends at DigitalOcean. DigitalOcean provides the easiest cloud platform to deploy, manage, and scale applications of any size. They remove infrastructure friction and provide predictability so you can spend more time building what you love. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes.
[00:01:15] This season is also brought to you by the fine folks at Linode. If you’ve got a personal project, a small business or a big business with lots of data, Linode offers you secure hosting for all your infrastructure needs. They are a Linux cloud hosting provider where you can get a new server up and running in under a minute. Plans start at one gigabytes of RAM for just five bucks a month. And with the promo code CodeNewbie2019, you can get a $20 credit. So go to linode.com/codenewbie and give it a try. Also, they’re hiring. Check out their jobs at linode.com/careers. Link is in your show notes.
[00:01:52] SY: Okay, let’s get started. Oh, this is our third sorting algorithm?
[00:01:59] VJ: It is.
[00:02:00] SY: I believe it is. Okay. So insertion sort, I assume it does some inserting. Otherwise, that would be just really mean.
[00:02:10] VJ: What if I was just like lazy and I was like, “You sort it yourself”?
[00:02:15] SY: Like, “We had a deal. Okay?” Okay. But what is it? How do we think about it? It’s inserting somehow, doing some type of sorting, how does that work?
[00:02:23] VJ: Yes. So it’s basically a sorting algorithm that kind of has two parts to it, and by two parts what I really mean is it has two subsections or sub lists and the sorting algorithm sort of considers one section to be the sorted elements and the other section to be the unsorted elements.
[00:02:42] SY: Okay.
[00:02:43] VJ: So what it does is it iterates through a collection, the unsorted collection, and one by one it inserts the unsorted elements into their correct location in the sorted set list. So that’s kind of where the insertion name comes from where it’s like, “I’m going to pluck this and insert it in the right place. I’m going to pluck this and insert it in its right place.”
[00:03:03] SY: Okay. So that sounds kind of like what we were doing in the first sorting algorithm that we learned, which is selection sort back in Episode 1. How is this different from that? Because we were like shifting and putting things in order in that one too, right? What’s the difference?
[00:03:20] VJ: Yes. So I think a good way to kind of remember the difference between the two is in the name. So in selection sort, we were selecting elements based on one characteristic which was that we were selecting the smallest element and putting it aside. Like we’re always pulling the smallest element from the unsorted half. In selection sort, there’s like kind of an implicit divide between unsorted and sorted elements, but it’s not necessarily as obvious versus in insertion sort, the whole thing is like, “Oh, I know what I’m doing. I know I have two sub lists.”
[00:03:56] SY: Okay.
[00:03:57] VJ: In insertion sort, however, you’re not selecting the smallest element the way you are with selection sort, rather in insertion sort you’re kind of systematically going through the unsorted collection and you’re plucking one and inserting it into its right spot. So you’re not scanning the whole list and saying, “Let me find the smallest.” You’re just saying, “Okay, let me grab the first one and I’m going to figure out where it goes in the sorted collection.”
[00:04:23] SY: Okay. So it feels like a keyword in the insertion sort and maybe the differentiator one of the things between that and the selection sort is the idea of shifting things around. Is that word as important as it feels like it is?
[00:04:41] VJ: You could say that they’re both shifting elements, but I guess if you’re visually thinking about it, you could say that the selection sort algorithm is being more focused on what it’s selecting from the unsorted list versus the insertion sort algorithm is more focused on where to insert the item into the sorted section of it.
[00:05:05] SY: Okay. So it sounds like there’s kind of two parts to this insertion sort that are important. One is having that structure, like having the fact that there is a sorted subset and an unsorted subset, and then the second one is the actual process of iterating through and putting it in the right place.
[00:05:21] VJ: Yes.
[00:05:22] SY: Is that fair?
[00:05:23] VJ: Yes. Those are the two main parts, I would say, I think you’re right about that.
[00:05:26] SY: Okay. Example time.
[00:05:28] VJ: Example time.
[00:05:29] SY: Yay! Okay. So I have a list of boos as I always do.
[00:05:35] VJ: Oh, no! Not this again. It’s so emotionally distressing for me. I just have to watch all these numbers get broken up with and I feel like I’m supposed to not be partial. It’s so hard.
[00:05:47] SY: As long as I am not the dumpy it’s fine. I just always have to be the dumper. Okay. So by the way, I am married and Rob I’m sorry. Okay.
[00:05:59] VJ: She’s leaving you for the number 9. I’m sorry, I had to be the one to tell you and in this format. I couldn’t handle it anymore. This is how we’re telling you Rob on the podcast.
[00:06:14] SY: Oh my gosh! He actually listens to the podcast. It’s actually funny. This is like the one episode he didn’t listen to. He just never knew. He never knew. Okay. So my list of boos, 4, 3, 8, 1. Clearly, I’m over the number 9 so number 9 is not in this list. I’m done with 9. It was the last episode. So now we’re on 4, 3, 8, 1. Okay. So according to my algorithm I assume I would start at the first number.
[00:06:43] VJ: Yes.
[00:06:44] SY: Okay.
[00:06:44] VJ: Remember, you’re not going to do this whole thing where you’re going to find the smallest. You’re just going to systemically start at the beginning. Yeah. And initially you could even think of it as like the first element that you start with because you have nothing to start with is your sorted element.
[00:06:58] SY: Oh, okay. Okay. So back to that point earlier about how structure is important and there’s like a sorted subset, an unsorted subset because the first one is just the first one. I’m just going to say you were the beginning of my sorted subset.
[00:07:11] VJ: Yes.
[00:07:12] SY: Is that fair?
[00:07:13] VJ: Yeah, that’s totally right, and you could even think of it as like you’re like, “Okay, I’ve got my first element. I’m going to say, you could even visually think of it as like two separate arrays and you’re just going to say, “Well, an index 0, I only have one element so I guess by default it’s sorted.”
[00:07:27] SY: Yeah. Yeah. Okay.
[00:07:28] VJ: It’s the sorted element, the only sorted element in my sorted sub list.
[00:07:32] SY: Okay, cool. I like that. Okay. So I am done with 4 for now. I’m done with my 4 boo. Now I’m going to move on to 3. So next we have number 3. So I’m going to take three and I want to put it, I want to sort it and I’ve already established that I have my two parts, I have my sorted subset which is only the number 4 so far and then I have my unsorted subset which is my 3, 8, 1. So I want to move the 3 into the sorted subset, but first I have to decide where to put it. On the left or on the right of 4? So that’s when we compare it, right? That’s the first time that we are comparing two numbers and we’re saying, “Is 3 smaller or larger than 4?”
[00:08:15] VJ: Totally. As you mentioned, that’s the first time you compare and that’s where you can rearrange things really, that’s where you’re going to move 3 into its correct spot. So for example say you have an array with 4 at index 0.
[00:08:28] SY: Yeah.
[00:08:29] VJ: Well, now you know that 3 actually needs to be at the front of the list. There’s only two elements. So maybe you would move 4 to index 1 and put 3 at index 0.
[00:08:36] SY: Okay.
[00:08:37] VJ: That’s the part where the algorithm figures out where to put the element in the correct place because we said it’s just going to put it in the correct place in the sorted subset. How does it do that? Well, it just moves things around as it needs to.
[00:08:50] SY: It compares them. stop them. Yeah. Okay. Okay. So I take my 3, I compare it to my 4, I know that my 3 is less than my 4 so I move it to the, or I, I’m sorry, I insert it in the place that is to the left of my 4. So at the end of that, I have my sorted subset which now reads 3, 4 and then my unsorted subset which is still 8, 1.
[00:09:17] VJ: Correct.
[00:09:18] SY: That’s not that bad.
[00:09:19] VJ: Yeah.
[00:09:20] SY: we’re doing fine. We’re doing great.
[00:09:21] VJ: And you can kind of see, “Oh, I only have two more left,” and you can kind of have a feeling of what’s going to happen next.
[00:09:26] SY: Okay. So next we are going to take the next number in our unsorted subset, which is 8. So now when I want to put it in my sorted subset, but again, I have to figure out where exactly I’m going to put it and now I have… well, I have like three places I could put it, right? I could put it before the 3, I could put it between the 3 and the 4 or I could put it after the 4.
[00:09:51] VJ: Yup.
[00:09:52] SY: So how do I know which one? Do I just start at the beginning of my sorted list?
[00:10:01] VJ: What you’ll do is you’ll iterate through the sorted items with the current unsorted item in mind. So, you’ll iterate through, I guess right now we have 3 and 4, so you’ll iterate through 3 first and then 4 and you’ll say, “Oh, if this my current item is smaller than the one that I’m looking at, well, move it to the left, sorry, insert it to the left of it. If it’s larger, then move on to the next one.” So in this case, you would say, “Oh, it’s not smaller than 3, 8 is not smaller than 3 so move on to the next one. It’s not smaller than 4, so well, that’s the end of that.” I’m going to insert it to the adjacent index in the array.
[00:10:39] SY: Okay. So I’m taking my 8 and I’m being friends with it. We’re going out on a little date and then I’m comparing it to 3, which is the first number in my sorted subset and then because it is bigger, I’m going to compare it to my 4 which is the next item in my sorted subset and I’m going to say, “Okay, 8 is bigger than 4 so I’m going to insert 8 to the place after 4.”
[00:11:04] VJ: Correct.
[00:11:04] SY: Okay. So if we look at the whole array, both the sorted part and the unsorted part, the whole thing right now, it reads 3, 4, 8, 1.
[00:11:14] VJ: Yup.
[00:11:14] SY: With 3, 4, 8 being part of my sorted subset and only just one more thing, the number 1, in my unsorted part. Is that right?
[00:11:22] VJ: Yup. That’s right.
[00:11:23] SY: Okay. So last one, one more to go. We take number 1, we’re going to be really good friends with number 1, and we’re going to go out on a little date and we’re going to bring it to the very first number in the sorted portion, which is the number 3, and I’m going to say, “Is 1 smaller or bigger than 3?” It is smaller. So then that’s great. It doesn’t have to know about 4 and 8.
[00:11:49] VJ: Yup. So all you have to do is shift over the other items. You have to shift over 3, 4, and 8 and insert 1 and 2 to make the first spot.
[00:11:57] SY: To make some room.
[00:11:58] VJ: To make some room. I like that.
[00:12:00] SY: Yeah. And now because my unsorted subset is, well, it’s empty, right? I don’t have anything in it anymore. Is that how I know that I’m done?
[00:12:08] VJ: Yes. You finished your whole iteration through your array of unsorted items and what you’ve really done is you’ve just shifted them all around.
[00:12:19] SY: Just moving them around.
[00:12:20] VJ: Yeah. It’s not like you were just like you have two different buckets. It’s kind of just like they’re all together and you are like, “Okay, well, I’m going to imagine that this is where my sorting starts.” It’s all kind of being sorted in the same place, but the insertion sort kind of has this own vision of what is sorted and what’s not sorted.
[00:12:38] SY: Right. Right. And its mind there are two but really there are one.
[00:12:43] VJ: Yes.
[00:12:44] SY: That was deep. That was like deep somehow, right, maybe, just a little? Okay.
[00:12:48] VJ: Insertion sort of Zen.
[00:12:50] SY: It’s very Zen. Yes. Okay. So we end up with our sorted list, which is 1, 3, 4, 8. I don’t know if it was actually faster, but it felt faster. I don’t know why it felt faster. How many iterations do we generally need then to go through four items?
[00:13:05] VJ: Well, I know feels fast because we had only four elements. And I think you have fun whenever you go through these algorithms on this podcast. When you’re having fun time flies. The truth is, is that in order to implement this algorithm you’re having to do two things, really, two main things. You’re having to iterate through the whole subset right, because even when you start with everything being unsorted you start with the first element and then you’re like, “I’m going to mark it as sorted.” And then you go to the next element, then you’re like, “I’m going to insert it to its right place.” And then you go to the next element, you’re like I’m going to insert it to its right place.” So you’re going to do that for all of the elements in the whole collection. You’re iterating from left to right, but there’s another thing that you’re doing, which is that. When you pull an item out of the unsorted subset and you figure out where it should live in the sorted subset you are going to have to iterate through that sorted subset also because you are going to have to do like a check for every single sorted item and be like, “Do you live here? Do you live here? Oh, you live here. I’m going to insert you and shift everyone over and insert you into your right spot.” So you’re doing two iterations, which is basically a nested loop. So the first iteration you’re going to have to apply that to every single element in the list. So that’s n items, if you have n elements. The second iteration, you technically have to do the check through the sorted subset almost every single time for every single element except for one, which is your first element. So the very first unsorted item you sort it by default, you don’t have to loop over everything because there’s nothing to loop over.
[00:14:54] SY: That’s true. Yeah. Yeah.
[00:14:55] VJ: So it’s like technically n minus one for the second loop, the nested loop, but like it’s basically the same as n. Right? You’re still going to have to sort through the sorted section in order to insert the thing at the right place. So you’re basically doing two loops.
[00:15:10] SY: Which is bad.
[00:15:12] VJ: Yeah, it’s bad.
[00:15:13] SY: I remember that being bad.
[00:15:16] VJ: You’re going to do two loops and you’re going to do it for every single element. So that means n times n again.
[00:15:25] SY: Again?
[00:15:26] VJ: I know. That’s actually happened to us every time. It happened for selection sort and it happened for bubble sort and it’s happening for insertion sort.
[00:15:34] SY: Just when I thought we were moving on up and we’re being sophisticated and just classy and you know, just making something of our lives, we are just right back at freaking n-squared.
[00:15:48] VJ: And in a quadratic runtime.
[00:15:50] SY: Okay, the tragic runtime. Okay. What about “but”? There are two complexities, right? There’s the time in the space. How are we doing for space? Is that interesting? Is that good at all?
[00:16:05] VJ: Okay. We won’t talk about time complexity because it makes us sad, but we can talk about space complexity which is that even though we went to nested loops, the amount of space you actually need is pretty constant because it doesn’t matter if you’ve got four items or four hundred. You’re working on the same inputted data, the same set of unsorted data, and what you’re probably doing is you’re just moving around elements based on their index and you’re shifting them and inserting them. So you’re just kind of moving pointers. So that’s like not that much memory you really need temporary.
[00:16:40] SY: That’s good.
[00:16:41] VJ: Yeah.
[00:16:42] SY: Silver lining.
[00:16:42] VJ: There you go. Temporary variables in memory and references to elements in an array, like that’s probably going to be o of 1 or constant time,sorry, constant space, constant amount of space. So the space complexity is constant. The time complexity is… we shouldn’t talk about it.
[00:17:05] SY: Okay. So in summary, insertion sort is similar to bubble sort and selection sort in that it sucks and it takes a long time to do because we’re back to our nested loops situation, which is always bad, and we’re just doing a lot of comparisons, iterations all over the place. So even though it looks like it’s cooler, it’s still really just the same thing. But, for, it’s all just a lie, it’s a trick but with insertion sort, it’s a little bit different because we are shifting things over and we have this like imaginary concept sort of like the sorted versus the unsorted sides and we’re moving things from one side to another even though it’s still live in the same list, the same array. We have a little division we’ve made for ourselves. So yeah. I think that’s about it. Is that right?
[00:18:02] VJ: Yup. I think we’ve inserted ourselves into that problem, now we’re ready to call it a day.
[00:18:07] SY: Nice. Well done. And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link is in your show notes. Also want to give a huge thank you to Linode for sponsoring the show. Try their powerful servers built for all your infrastructure needs. Getting started with a shiny new server takes less than one minute so you can get up and running in no time. Get a $20 credit by using the promo code CodeNewbie2019. Just go to linode.com/codenewbie for more info. Also want to give a huge shout-out to DigitalOcean. They are the easiest way to deploy, manage, and scale your application. Everything about it was built with simplicity at the forefront; setting, deploying, even billing. Try DigitalOcean for free by going to DO.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes. Vaidehi, you want to say goodbye?
[00:19:03] VJ: Bye everyone!
[00:19:04] SY: Thanks for listening. See you next week.
Thank you for supporting the show!