Description

So we've talked about heaps, but how do you represent heaps as arrays? And why would you want to? We break it down step by step!

Show Notes

This episode of the Basecs podcast is based on Vaidehi's blog post, Learning to Love Heaps from her basecs blog series.

Transcript

[00:00:00] SY: If you haven’t yet gotten your tickets for Codeland, you totally should. It’s our annual conference about all the wonderful things you can do with code. And besides great food, great talks, and great people, this year we’re offering complimentary on-site childcare. So bring your babies with you and see you there. For tickets, go to codelandconf.com.

[00:00:23] (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.

[00:00:30] VJ: And I’m Vaidehi Joshi, Author and Developer.

[00:00:32] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we are continuing our conversation on…

[00:00:38] VJ: Heaps! Again.

[00:00:40] SY: This season of the Base.cs Podcast is brought to you by Linode. If you’ve got a personal project, a small business or a big business with lots of data, Linode offers you a secure hosting for all your infrastructure needs. They are a Linux Cloud Hosting provider where you can get a new server up and running in under a minute. Plans start at one gigabytes of RAM for just five bucks a month. And with the promo code CodeNewbie2019, you can get a $20 credit. So go to linode.com/codenewbie and give it a try. Also, they’re hiring. Check out their jobs at linode.com/careers. Link is in your show notes.

[00:01:17] It’s also brought to you by Square. If you’re designing a website or building a mobile app, you’re probably going to want to take payments at some point. Square APIs allow you to easily implement their payment form on your website. And with their In-App Payments SDK, you can add it to your mobile app too. You don’t have to worry about dealing with PCI compliance because they’ll take care of that for you. Don’t let anything come between you and your money. Start building with Square over at squareup.com/go/codenewbie.

[00:01:49] SY: This is our, I think, third and final episode on heaps?

[00:01:52] VJ: Uh-hmm.

[00:01:52] SY: Yeah?

[00:01:53] VJ: Well, they’re going to come back when we sort through them.

[00:01:56] SY: This will be the end of this chapter.

[00:01:58] VJ: Yes, exactly.

[00:01:59] SY: And then we’ll encounter a new chapter.

[00:02:01] VJ: It’ll be a whole new world of heaps.

[00:02:03] SY: There you go. Just heaps on heaps on heaps. That’s never going to get old to me, by the way. I’m always going to love that. Okay. Let’s do a quick recap. Tell me what heaps are all about.

[00:02:13] VJ: So heaps are a kind of data structure that have to follow two rules. They have to fulfill a certain type of shape. They look kind of like binary trees and we’ve been talking about binary heap. So for the purposes of heaps in this episode and in the rest of these episodes, we’re talking about binary heap. So they look like binary trees, but they actually only follow two important rules and that is that they have to be level complete and they maintain a certain kind of structural integrity and also they have to follow a Heap Order Property. So they may either be a max-heap where all the parent nodes are greater in value than their children or they’ll be a min-heap where all the parent nodes are less than or equal to the value of their children.

[00:02:53] SY: So today we are talking specifically about how to represent heaps and we are going to be representing them as arrays.

[00:03:01] VJ: Yeah. I think it’s actually really cool because we’ve talked about so many different data structures and heaps are this beautiful merging of binary trees, arrays, and also, one of my favorite data structures, queues.

[00:03:17] SY: Oh, I forgot about queues.

[00:03:19] VJ: Well, they’re back.

[00:03:19] SY: Everybody just comes out. It sounds like a family reunion.

[00:03:22] VJ: Hey, a new season, new family. I started that sentence and I was like, “I don’t know what the rest of it is going to be.”

[00:03:31] SY: Worrying about that? Sure, new season. It’s actually not even new family because these are old family members, new reunion. How about that?

[00:03:39] VJ: Oh, yeah.

[00:03:39] SY: New season, new reunion. There we go.

[00:03:41] VJ: They’re just back to haunt you. They’re just like, “Hey!”

[00:03:44] SY: No, this is a good thing. This is a positive thing, Vaidehi.

[00:03:48] VJ: They’re like, “Hi! Just paying a visit, bringing you a casserole.”

[00:03:56] SY: As families do. Okay. So heaps as arrays. First of all, why do we want to represent a heap as an array? What does that do for us?

[00:04:07] VJ: They do tie into queues and remember that arrays and queues have a relationship.

[00:04:12] SY: Yes.

[00:04:13] VJ: They have similar structures. They’re both linear data structures, but the other reason that we should start off with is that arrays have indices.

[00:04:22] SY: Ah, yes.

[00:04:23] VJ: The nice thing about heaps is if you can represent them as arrays and you get all the benefits of it being an array, like for example, you can index into it and retrieve something, you get the benefits of this other data structure, even though you’re using another data structure and representing it in a different way.

[00:04:42] SY: Okay. So we are going to represent a heap as an array. What do we do first? What’s step one?

[00:04:48] VJ: The thing to think about is how we’re going to order this heap and we know that they’re partially ordered in the sense that they follow that Heap Order Property, right? And we know that the root node is either going to be the largest or the smallest in value depending on if it’s a max-heap or a min-heap. So because we know that the root node is either going to be a max value or a min value and because we know that it’s going to be at the top of the heap, it kind of makes sense for it to be at the front of the array and because we have indices, we can put it into our array, the value, whatever the value of that root note is, we can put it into the array at index 0 so that it ends up at the beginning of the array.

[00:05:29] SY: Okay. So we’re going to give the root node an index of zero, which makes perfect sense to me, it’s like the first thing in the list, in the array. Cool. What do we do next? Because we got two babies. We got two children, right? Like if we have root node and then we say that we have a tree and we have two kids, our left child and our right child, how do we know where to put them?

[00:05:48] VJ: So if we know the index of the root node, we can kind of see the rest of the heap fall into place in the array.

[00:05:57] SY: Okay.

[00:05:57] VJ: And the reason for this is because if we know the index of the root node, we can manipulate that index in order to determine where its children are going to live in the array and there’s a handy formula, there are actually two formulas, that can help us do this and we won’t get into like the math behind it, but…

[00:06:15] SY: That’s totally fine with me.

[00:06:17] VJ: But they’re going to play around with the index. And these are basically index-based algorithms that help us figure out where each child should live in the array. I think we should start with the left child. Let’s say that the root node has an index of I and we know that at zero, but let’s just say that it has an index of I. If we have a parent node that has an index of I, the left child is always going to be located at 2, multiplied by I, plus 1 and that’s going to be the index. It’s kind of like the address for the node.

[00:06:48] SY: Okay.

[00:06:49] VJ: And the right child is going to be 2, multiplied by I, plus 2.

[00:06:55] SY: Okay.

[00:06:56] VJ: So if you think about the fact that it’s 2 I plus 1 and 2 I plus 2, this starts to become obvious that the two children are going to sit next to each other in the array, right? Because it’s two I plus 1 and plus 2.

[00:07:08] SY: Right, just moving over.

[00:07:09] VJ: Yeah. So you know that their indexes are only going to be off by one.

[00:07:12] SY: Okay. So in this example, if we have our I, we have a root node at index 0, which means that the I we’re dealing with is zero and we want to know where is our left child, we would go 2 times 0 plus 1, which is just 1.

[00:07:30] VJ: Yup. So you know that the left child is just going to be right next door at index 1.

[00:07:34] SY: Yeah. Yeah, at index 1. Okay. Okay. And then for the right child, we do 2 times 0, plus 2, which is 2. So it just goes right next to the left child.

[00:07:45] VJ: We’ll have the root node at index 0, the left child at index 1, and the right child at index 2.

[00:07:51] SY: Okay. So what if we want to see where the grandchild is going to be? You know what I mean?

[00:07:58] VJ: Yeah.

[00:07:58] SY: So we have like our root node. We have the left child and let’s say we have the left child of the left child. How do we know where that one is?

[00:08:06] VJ: Well, so we already know the grandchild’s parent, right?

[00:08:10] SY: Yes.

[00:08:10] VJ: The grandchild’s parent is just the child. Sorry.

[00:08:13] SY: Right. Okay.

[00:08:14] VJ: Because the family is getting out of control here. We have the root node, a left child and the right child, and the left child has its own left child. So we know the root node’s index, we know the left child’s index. We don’t know the grandchild’s index, but we do have these two formulas. And in the case of this formula that we’re concerned with, we just care about the left child formula. So that’s 2 I plus 1.

[00:08:36] SY: So any left child is a 2 I plus 1?

[00:08:38] VJ: Yes. Yeah. Remember earlier when I was like, “Oh, the root node’s index is 0, but let’s just call it I.” There was a reason for that because this formula applies to any set of parent and children nodes. So in this case, we can say that our parent is our left child.

[00:08:56] SY: Okay.

[00:08:57] VJ: So what is the index of that?

[00:09:00] SY: Okay. So the index of that very, very first left child, the one going straight from the root is 1 because we already calculated that. We said it was 2 I plus 1, which is 2 times 0 plus 1 which is 1. So that’s an index 1.

[00:09:13] VJ: So that is our I.

[00:09:15] SY: Okay. That’s our new I.

[00:09:16] VJ: Exactly. That’s the equivalent index of the parent and in our case the parent is just a left child.

[00:09:21] SY: Okay.

[00:09:21] VJ: So if we want to figure out the left child of this new parent, we can just apply the same formula and in this situation I is going to be equal to 1.

[00:09:30] SY: Yes. Okay. Okay. So now we would do 2 times 1 plus 1 which is 3.

[00:09:37] VJ: Exactly. And now we know the index of the grandchild.

[00:09:41] SY: That makes a lot of sense and then we can just keep going all the way down. So if that grandchild has a child, well, then that grandchild’s index, that’s the new I.

[00:09:50] VJ: Yup.

[00:09:51] SY: And then we can figure out its left child and right child by doing 2 I plus 1 or 2 I plus 2.

[00:09:56] VJ: Exactly.

[00:09:57] SY: That’s pretty cool.

[00:09:58] VJ: Yeah. It’s kind of interesting because it works every time, which is like, “Oh, this is a dependable way of figuring out how to make those pieces fall into place,” right? How do we make them the nodes in this heap fall into place? Well, we follow this formula and we just change what the parent index is depending on which parent we’re looking at.

[00:10:17] SY: Yeah. So basically as long as I know the index of the parent, I can always figure out the index of the children. Okay. So then can I do it backwards? Can I know the index of the parent if I know the index of the child or the children?

[00:10:32] VJ: Yes, you can. The beauty of this is that if you can figure out where the children live and if you have a node and you want to go back and find its parents, by virtue of the fact that you know that you can go up or down in terms of generations in this heap, you can actually construct the heap, right? Because if you have the child and you want to find its parent or if you have a parent and you want to find its child, you just have to change what formula you’re using.

[00:10:58] SY: Yeah.

[00:10:59] VJ: So I guess I should tell you this formula since that’s where we started.

[00:11:01] SY: Yes, please. Yes, please. Thank you.

[00:11:04] VJ: So it’s interesting because the formula to find children was just 2 times the index plus 1 or plus 2, right?

[00:11:11] SY: Yeah.

[00:11:11] VJ: So to find the index of a node’s parent, it is I minus 1 divided by 2.

[00:11:19] SY: I minus 1 divided by 2.

[00:11:21] VJ: Yeah. That’s going to give you the index of the parent node. You can be given any index in the array and be like, “Well, tell me where its parent lives,” and then you can apply this formula and figure out, “Oh, its parent is actually located at index whatever.”

[00:11:35] SY: Interesting.

[00:11:36] VJ: The one other thing I do want to mention is that a lot of the times when you’re reading about this formula, you will notice that people will talk about the floor and they’ll say, “You’re finding the floor of I minus 1 divided 2,” and it’s just a shorthand. It just means that you’re rounding down a decimal to its closest integer. So for example if I minus 1 divided 2 was 2.8, you would round it down to 2.

[00:12:00] SY: Cool. The floor is lava. I don’t know why I say that. Okay. So earlier in the conversation, you mentioned queues. You talked about queues and that we might revisit them in this world of heaps. Is that true? Is that coming up?

[00:12:15] VJ: Yes. That is the last thing I want to talk about which is that we know that queues are data structures that follow the first in, first out principle, right? We’ve talked about that in previous episodes many times. We’ve talked about FIFO.

[00:12:27] SY: FIFO.

[00:12:27] VJ: Yeah, exactly. And you see queues used in a lot of places in the real world. They’re used to manage requests, scheduling things in your CPU, for jobs and workers, things that are happening in the background or within your operating system. Those all use queues. And usually the type of queue that they’re using is something called a priority queue. A priority queue is basically like a queue that has like some weight to it. So in a priority queue, every item has some sort of priority or weightage associated with it. It’s usually just like an integer or something like that. And the items with the higher priority are the ones that are going to be dequeued first and the things with the lower priority are dequeued last or later on. And if there are two things that have equal priority, then then they’re dequeued based on their order in the queue. So if you have two items that have a priority of like one, you just take the one that comes first in the queue and you follow the first in, first out principle. So the cool thing about this is that in previous episodes we talked about how heaps allow you to have duplicates.

[00:13:30] SY: Oh, yes. Yes.

[00:13:31] VJ: So the fact that heaps allow you to have duplicates and the fact that a heap is always ordered with the highest or the lowest number at the root and then following that same rule, the Heap Order Property, down the structure makes it a perfect structure that can be represented as a priority queue because you know that you can have the first item. Whether it’s the largest or the smallest, you’ll know that you can have it at the front of the queue with the highest priority and then you can just pop it off and do whatever you need to do with it. And the things with the lower priority, the smaller numbers, for example, whether there are duplicates of them or not will be at the end of the queue which is pretty cool because now you have a priority queue structure that you just derived from a heap structure.

[00:14:14] SY: So which came first, the heap or the priority queue? Because all these things are so interrelated and you can represent one thing as another thing and it kind of makes you wonder, like how did we get to this point? Like how did we design this ecosystem? You know, like what led to what?

[00:14:29] VJ: There’s a good question. There’s a podcast I want to do one day about like the history of how things happened in computer science and like who discovered what and how that led to something else, specifically with CS concepts. That’s very cool. But if I had to guess, I want to say that the queue came first because I feel like things being linear is an older concept in computing and it seems like trees and graphs came a little bit later, is what it seems like, but nobody quote me on that.

[00:14:58] SY: Vaidehi, anything you say I believe. So I’m fine with that.

[00:15:03] VJ: The other cool thing about queues and heaps, just a last little note for all the nerds out there, because a heap can be represented as priority queues you actually get some benefits and that usually has to do with the time space complexity, right? So because we know that a queue takes a constant amount of time to retrieve the first element, that means if you represent a heap as a queue, you can find them, the largest or the smallest element, in constant time. So O of 1, which is really cool. In the same way, insertion and deletion of elements in a queue takes logarithmic time or O Log N. So if you’re representing a heap as a queue and you’re just inserting things using that rule, well, now you get some benefits of how quickly that can happen.

[00:15:48] SY: What I love about that is how excited you are to share that.

[00:15:54] VJ: I know.

[00:15:55] SY: I can hear the joy.

[00:15:56] VJ: Don’t end it yet. I have to tell you.

[00:16:01] SY: I could just feel the joy radiating from your body.

[00:16:05] VJ: I have a very big grin on my face. So that’s radiating too.

[00:16:09] SY: I can hear it. It’s great. Awesome. So does that just about cover heaps at least for now?

[00:16:15] VJ: Yeah. I think that is a good introduction to heaps and I think the really nice thing about this is that this is our first data structure, I think, that we start to see so many things come into play. It’s interesting because now that we’re kind of in the second half of the Base.cs series, we’re going to see more of that. We’re going to see different data structures come together or two algorithms kind of merging and you’re like, “Oh, this feels like a combination of different things.” And it’s pretty cool because now we get to see how we actually are standing really tall on the foundation we built and now we learned all these things and now we’re like, “Hey, we know that. Let’s just plow and reference that and we know what we’re talking about.”

[00:16:54] SY: Yeah. Let’s just bring some things together, have our family reunion. That’s what it’s all about.

[00:16:59] VJ: A lot of reunions and casseroles in our future.

[00:17:02] SY: I’m not mad at that. And that’s the end of today’s show. If you like what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link is in your show notes. Also want to give a huge shout out to Linode for sponsoring the show. Linode is giving you a chance to try their powerful servers built for all your infrastructure needs. They’ve got nine data centers worldwide with new data centers opening this year in India and Canada. Getting started with a shiny new server takes less than one minute so you can get up and running in no time and you can get a $20 credit by using the promo code CodeNewbie2019. Just go to linode.com/codenewbie for more info. And want to say thank you to Square. Square has APIs and SDKs to make taking payments easy whether you’re building a mobile app in iOS, Android, Flutter, or React Native. If you want to embed a checkout experience into your website, it’s super easy. Square has processed billions of transactions so you know it’s tried and true. Start building with Square over at squareup.com/go/codenewbie. This episode was edited and mixed by Levi Sharpe. Vaidehi, you want to say goodbye?

[00:18:13] VJ: Bye everyone.

[00:18:14] SY: Thanks for listening. See you next week!

Thank you for supporting the show!

Thank you for supporting the show!