CodeNewbie

Description

We're back in our hash table classroom with our multiple Brians that need their own tables! But don't you worry, we've got a brand new collision resolution called chaining to help us out. We talk about how it works and how it compares to linear probing.

Show Notes

This episode of the Basecs podcast is based on Vaidehi's blog post, Taking Hash Tables Off The Shelf from her basecs blog series.

Transcript

[00:00:00] SY: (Music) Welcome to the Basecs podcast, where we explore the basics of computer science concepts. I'm your host, Saron, founder of CodeNewbie.

[00:00:09] VJ: And I'm Vaidehi Joshi, author and developer.

[00:00:12] SY: And she is the brilliant mind behind the basecs blog series. Today we're talking about...

[00:00:16] VJ: Hash tables. Again, but even deeper. (Laughing)

[00:00:21] SY: This season of the Basecs podcast is brought to you by our wonderful friends at Twilio. If you haven't checked out their API yet, you totally should. With Twilio, your app can send text messages and make phone calls with just five lines of code. And for a lot of developers, Twilio is the first API they've ever used. And that first time you type a few lines of code, run your program and see your phone light up with the text message? It's kind of magical. And as a thank you for being a podcast listener, you get a promo code for twenty dollars in free Twilio credit. Just text your name to 480-485-4321. You'll also get a link to a quick start guide for your favorite language. So text 480-485-4321. Link to that is in your show notes.

[00:01:10] And we're also sponsored this season by DigitalOcean. They provide the easiest cloud platform to deploy, manage and scale applications of any size. They remove infrastructure friction and provide predictability so you can spend more time building what you love. Try DigitalOcean for free by going to do.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes. Ok. Let's get started.

[00:01:39] SY: So we've been talking about hash tables for two episodes now. Let's do a quick recap of what they are. What is a hash table?

[00:01:48] VJ: A hash table is a data structure that has two parts. It has an array where the data is stored and the data is stored in something called a hash bucket. And then you have a hash function, which is the thing, the function that's responsible for deciding which bucket the data goes into. And it's generally really good if your hash table is—doesn't have too many things in one bucket or in a bunch of buckets next to each other. Generally, you want your data in your hash table to be evenly distributed. And that is the job of the hash function.

[00:02:20] SY: And we made a lovely analogy last time to this idea of students getting their assigned seats in a classroom where the kids are outside and the teacher is the only one who knows where they're supposed to go. And so as the students, which are the data, data pieces—that's weird. You know what I mean. (Laughing) Little, little, little pieces of data outside and the teacher, which plays the role of the hash function says, "you are gonna sit here and you're gonna sit here and you're gonna sit there," and assigns each student a desk, which we can imagine as assigning a piece of data: a hash bucket.

[00:03:00] VJ: Yep. Totally.

[00:03:01] SY: And I'm very surprised that analogy worked as well as it did. (Laughing) Sometimes I just say things and, you know, hope they, hope they make sense. Ok, so one of the issues we ran into last episode was the, the problem of having two students: Brian. Two Brian's who—because they're both named Brian—end up in the same seat, which is no good. We don't like that. And we talked about how one way we can solve that is something called linear probing, which we described as saying, "hey, Brian. There's an empty desk right next to you. Just take that one." And if that one's full then say, "ok, go check out the next one and the next one," until Brian happens upon a empty desk where he can sit by himself. And we talked about how that's not always a good thing because it leads to clustering, which is, you know, all the data kind of grouped up in one part of the room. And it can be really slow. So if Brian has to keep looking for a new desk one at a time and there are like 100 different desks and the empty one is all the way on the other side, well then we can imagine how slow and time consuming that could be. And Brian won't really learn very much that day. So that's no good. (Laughing)

[00:04:21] VJ: This episode is now actually about Brian and his personal life. (Laughing) We'll be diving into his diary and how he had a hard day at school because his desk was super far away, and he couldn't see the board.

[00:04:33] SY: And you know, it's hard. It's really hard out here to be a Brian, you know? (Laughing) Just, just trying to, just trying to find a desk. Just trying to learn. But because there are some, some downsides to linear probing, you mentioned that there is another solution. What is that other solution?

[00:04:52] VJ: Oh, I'm so excited. This other solution is a great excuse to bring back—well it's gonna be like a throwback to like season one actually a little bit.

[00:05:01] SY: Oh.

[00:05:02] VJ: Yeah.

[00:05:03] SY: It's a real throwback.

[00:05:04] VJ: Because—well we'll get to the throwback in a second, but let me first tell you what it is. So...

[00:05:09] SY: Ok

[00:05:09] VJ: We have the one strategy, which is linear probing which has its problems, but there's actually another strategy that is pretty commonly used. And this is a form of collision resolution trying to solve the same problem but it's...

[00:05:20] SY: Collision resolution.

[00:05:20] VJ: ...something (Laughing)

[00:05:24] SY: I just feel like it needs its own jingle. You know what I mean?

[00:05:26] VJ: I know. It feels like...

[00:05:27] SY: Like every time you say we need to have like a collision resolution...

[00:05:30] VJ: I kind of imagine...

[00:05:31] SY: You know?

[00:05:31] VJ: ...that everybody's dressed like from the 80s.

[00:05:33] SY: Yes.

[00:05:34] VJ: And it, it's like a 80s music...

[00:05:35] SY: Yes. Yes.

[00:05:35] VJ: ...video is what you just have—I, I don't know. I'm getting those...

[00:05:38] SY: There you go. There you go.

[00:05:38] VJ: ...vibes through the microphone from you. (Laughing) Are you wearing your scrunchie? Is that the problem? Is that what's different today?

[00:05:44] SY: You know, I, I might need to go get one. (Laughing) I might need to go get one.

[00:05:50] VJ: Oh man. Anyways, let's, let's go back from the 80s. Let's, you know...

[00:05:56] SY: Ok.

[00:05:56] VJ: ...bring ourselves back to something called chaining.

[00:06:02] SY: Ok.

[00:06:02] VJ: So we've talked about the downsides of linear probing, and chaining is kind of a clever workaround to some of those issues. So in order to implement chaining, what you have to do is kind of restructure the hash table a little bit.

[00:06:18] SY: Ok.

[00:06:18] VJ: And let's think about it in the context of this classroom. Like we were saying there's like only one desk and only one student per desk. Maybe instead of that we have like one table and like all the A's sit at that table and it has like...

[00:06:30] SY: Oh.

[00:06:30] VJ: ...a couple chairs around it. And then there's another...

[00:06:32] SY: Ok.

[00:06:32] VJ: ...table. And it's like instead of individual desks, it's more like a collaborative classroom. There is one entry point to the, to the table. Like we could say there's like one team leader, one student who, whoever sits there first is like, "I'm the..."

[00:06:45] SY: Yes.

[00:06:45] VJ: "...team leader, and I'm responsible for all the B's or something. We'll see how long this analogy works. I'm not sure.

[00:06:51] SY: I love it so far. I am, I am right there with you. I'm at the S table. Ok?

[00:06:56] VJ: Are you the team leader?

[00:06:56] SY: Obviously I'm a table leader. Of course. (Laughing) Of course I'm team—I'm like the one with the biggest mouth, you know? I love it. I love it.

[00:07:02] VJ: You're just sitting there with your hand raised. We haven't even gotten to chaining, but you're like...

[00:07:05] SY: Permanently. (Laughing) Permanently hand raised. No questioning. No one's at the table.

[00:07:09] VJ: Team leader here.

[00:07:12] SY: I have something to say again.

[00:07:16] VJ: Oh man. Well in the way that chaining would work in this, in this I guess analogy is that we're kind of restructuring our hash table so that multiple elements can be stored at one key. So if we imagine that keys are like, you know, the letters of the alphabet so like S is a key and, you know, that would...

[00:07:35] SY: Ok.

[00:07:35] VJ: ...if you were passed through a hash function, you would end up at bucket S. And if I was passed through a hash function. I would end up at bucket V because my name starts with a...

[00:07:43] SY: True.

[00:07:43] VJ: ...V. There's still only one key per bucket, but now imagine that we have a way of putting many things in that bucket.

[00:07:52] SY: Oh.

[00:07:52] VJ: Not just the one like we were stuck with with when probing before.

[00:07:56] SY: Right.

[00:07:56] VJ: So chaining is when you can store multiple elements at one key. And the way that you can do that from a technical standpoint is that you can use a linked list instead...

[00:08:11] SY: Oh. Oh.

[00:08:11] VJ: ...of a, just a single element hash bucket.

[00:08:16] SY: Oh interesting. Ok question.

[00:08:18] VJ: Yeah.

[00:08:19] SY: Before—I'm raising—I literally raised my hand. I'm not even joking.

[00:08:22] VJ: All right. That S bucket.

[00:08:23] SY: I'm, I'm not... (Laughing)

[00:08:25] VJ: Look.

[00:08:25] SY: I'm literally—I'm alone. I'm alone right now, Vaidehi. Like no one is here. And I legit just—you can't see me, and I legit raised my hand. What is wrong with me? Ok. Ok, so, so question. So when we're talking about Brian and the single desk scenario—the SDS—(Laughing) there is a limit, right? There's a limit of one person per desk, one student per desk. With this new scenario, we have this, this table situation where are, there are lots of seats. Is there still some type of limit? Are there a limited number of seats that we have to worry about or think about at this point? Or do we have more flexibility there?

[00:09:06] VJ: Oh, that's a great question. I'm gonna answer it in a second, but the short answer is yes. You do wanna worry about how many seats, but...

[00:09:13] SY: Ok.

[00:09:14] VJ: It'll become more apparent why. But yeah.

[00:09:15] SY: Ok. Ok, cool.

[00:09:17] VJ: Ok so the way that we basically store a single item at each hash bucket is that we use a linked list. And if you think about it from, you know, the concept of data structures, you can basically have the key still point to one thing the same way it did when Brian was sitting at one desk when you had one element in one hash bucket. But that thing that it's pointing to is kind of the team leader. It's, it's pointing...

[00:09:42] SY: Oh.

[00:09:42] VJ: ...to the reference to the linked list. So if you'll remember how linked lists work, there is one node, and it has a reference to its neighbor and that node has a reference to its neighbor. And it's kind of like, you know, they're all chained along. The person at the front of the chain doesn't really know—they can't talk to the person at the back. You have to kind of work your way...

[00:10:00] SY: Right.

[00:10:00] VJ: ...through the linked list, kind of work your way through the chain. But the way that that kind of works out in the context of a hash table is that instead of storing one item at a hash bucket, you store a pointer to the beginning of a linked list. And maybe that linked list is just one person in the case of you at your table. Or maybe it's actually four people, and it's like you're pointing to the team leader, Brian, who has a reference to Brandon sitting next to him. And Brandon has a reference to Bristopher. I'm really bringing Bristopher brack, brack, back.

[00:10:32] SY: Yeah, man. You should. That's...

[00:10:34] VJ: Oh my God.

[00:10:34] SY: That's right. Bristopher is always there.

[00:10:37] VJ: So I guess the, the way that the chaining, you know, you that you can really visualize it is that you have a chain of elements because you have a linked list that is...

[00:10:48] SY: Right.

[00:10:48] VJ: ...what you access from the hash bucket itself.

[00:10:51] SY: Interesting. Ok, so in this scenario going back to the original question of "I have two Brian's. What do I do with them? In the chaining situation, is it that I already have this hash table that has these link li—what, what words was that? (Laughing) linked list already set up? Or is it at the point when I have my two Brian's I go "oh crap. We gotta reorganize this classroom. No more desks. Only big tables."

[00:11:26] VJ: Well if you are a clever teacher—clever hash function or I guess really the person who's designing the hash function whoever that might be—they kind of need to think about this and be like "ok, when a collision happens what am I gonna do? Because as we mentioned in the previous episode, collisions are gonna happen, so you gotta account for collision resolution. That's kind of the point when you're designing the hashing function that you think, "oh, based on this data, I should probably use chaining." And at that moment like kind of in the designing phase you think about how you're going to set up the classroom if that makes sense.

[00:12:01] SY: Ok. So chaining isn't necessarily something that happens, you know, in, in real time. Ideally, it will be already established so that it can better handle collisions that are expected to happen.

[00:12:13] VJ: Yeah. I think when you use a, a hash table in practice you've kind of already thought about this or, you know, you think about it early enough on that you're not reorganizing once you've...

[00:12:23] SY: Right.

[00:12:24] VJ: ...already hashed a bunch of data.

[00:12:25] SY: That sounds messy. Yeah, that sounds like...

[00:12:28] VJ: Can you imagine reorganizing a classroom with like, you know, 15 children in it? Or 30?

[00:12:32] SY: When the school day has already started? Ugh, sounds like a mess.

[00:12:35] VJ: Seems like a bad idea. (Laughing)

[00:12:36] SY: Seems like a bad idea.

[00:12:38] VJ: But the way that it would work...

[00:12:39] SY: I agree.

[00:12:39] VJ: ...with like Brian, for example, or if, if you were the team leader—imagine that, you know, at first nobody is sitting at the S table, but there is still a...

[00:12:47] SY: Right.

[00:12:47] VJ: ...there's still a way to access it. You just kind of start at the team leader's chair, which we can say is the front of the linked list.

[00:12:54] SY: Right.

[00:12:54] VJ: And then you show up and then we're like, "oh, you go into bucket S," and now you're sitting at the front of the linked list. You are, you know, the reference pointer to the beginning of the list. And then someone else who is an S shows up. And now you can, we can put them as the team leader at the front of the list.

[00:13:09] SY: What? Wait, I get replaced?

[00:13:11] VJ: Well, I mean you—we need a way...

[00:13:12] SY: Oh, that's just wrong.

[00:13:14] VJ: We need a way of, we need a way of having a reference to you, but then also creating a reference to the team leader.

[00:13:20] SY: Fine.

[00:13:20] VJ: Because remember we only have access—the way that linked lists work is you have a reference to the beginning of it.

[00:13:25] SY: Whatever. I don't even care. (Laughing) I don't even care anymore. After all this hand raising I did. God. Ok, so basically if someone kind of—every person gets almost like bumped down in, in a sense because you're adding to the beginning of the list...

[00:13:43] VJ: Yes.

[00:13:44] SY: ...each time.

[00:13:44] VJ: When you hash them into the appropriate bucket, you're adding them to a linked list, which may be empty. Sometimes it's just a pointer to an empty chair, to an empty list. And sometimes, you know, you have like two kids that are sitting next to each other, and you, you have a designated way of getting things from that bucket.

[00:13:58] SY: Ok. So now can we go back to that other question of "is there a limit to how many seats I have available or how many things I can add to my linked list?"

[00:14:07] VJ: Yes. That's a good, good point. Ok, so actually let's, let's think about the example where we have like five Brian's or whatever we were talking about where we have a bunch of, bunch of students...

[00:14:18] SY: Oh so many.

[00:14:18] VJ: ...with B's as names. Imagine now we're just like adding a bunch of students to the table, to the hash bucket rather that is for B's. Well now you have a linked list so you still have one point of entry, but now maybe your linked list is like five elements long. And adding the elements is easy.

[00:14:37] SY: Right.

[00:14:37] VJ: But then what happens if you now need to find one of the Brian's? Or, you know, what happens if you have like a bunch of students sitting at one table and you're like "well I need to find this one specific thing, this one specific data point or this one specific child in this classroom example." Now I have to search through this table of students.

[00:14:59] SY: Yeah.

[00:15:00] VJ: I have to search through the linked list to find what I'm looking for. And we'll remember from all of our knowledge on linked lists the way that you search through a linked list is linearly and you can only...

[00:15:14] SY: Right.

[00:15:14] VJ: ...go one at a time, which means...

[00:15:16] SY: Oh man.

[00:15:16] VJ: ...it takes as much time as elements. Or in the context of Big O notation it takes O(N) time. So if you have one table that's like the B's and there's 20 kids sitting there and then you have Saron sitting by herself and Vaidehi by herself and like I don't know, Chad over there. Like now you're like, "oh my God, whenever I have to call on anybody with a B, I'm gonna have to search through this big table." And it's just...

[00:15:40] SY: Oh man.

[00:15:41] VJ: Now, now you've also kind of negated the point of a hash function because hashing functions...

[00:15:45] SY: Right.

[00:15:46] VJ: ...and hash tables are supposed to be like quick to retrieve things. And this is not...

[00:15:49] SY: Right.

[00:15:50] VJ: ...gonna be quick.

[00:15:51] SY: Yeah, no that's gonna take for, forever. What are we gonna do? Can we do something? Vaidehi, come up with a solution quick. (Laughing) School has started. Go.

[00:16:01] VJ: The children are arriving. Well I guess the—if the impetus kind of falls on the teacher or, you know, the person who is designing the hash function because again, you wanna think about this, which is good that we're kind of, you know, enumerating the different strategies and their downsides and all the situations...

[00:16:21] SY: Yeah.

[00:16:21] VJ: ...that could arise. Because when you're designing a hash function you do wanna think about this because inevitably some version of it might happen unless, you...

[00:16:29] SY: Yeah.

[00:16:29] VJ: ...know, your data really well and you know that it won't happen. But there's a good chance some, some version of one of these things is gonna happen. But I guess the, the answer to your question is you wanna make sure that your hash function doesn't put 50 percent of the classroom at one table. (Laughing) Like you've gotta not do that.

[00:16:45] SY: Teacher. Yeah. Yeah okay, so, so in that, in that example, right? So ideally the teacher who's the, the hash function—and for the sake of, you know, of this example is also the person who designed the, the hash table and designed the classroom—she would probably have a list of students at her disposal before they get there, right? So she can look at it and go, "holy crap. There are 20 B's in this, in this classroom. My original idea of having one alphabet per, you know, per, per table is probably not a system that's going to work out well because the B table will have 20 people on it." So the teacher would have to go, "ok, I'm gonna reorganize my tables to be, you know, Ba- through Bd. Crap. What's after D? Is it E?

[00:17:36] VJ: Or you could just do like birthdays.

[00:17:38] SY: No I wanna do B's still. (Laughing) Can I do B's still?

[00:17:44] VJ: Sure we could do that.

[00:17:45] SY: Ok. Ok. Thank you. So Be- through Bh- and then, you know, and so on and so forth. So then there are actually five different B tables in which case if there are, you know, 20 B's and it's four per table and then it kind of balances out throughout. So the real solution it sounds like is when you're designing hopefully if you have a good sense of your data and what you're gonna do with that hash table, then you should be able to design around it at least somewhat.

[00:18:13] VJ: Yes. Yeah. I do wanna point out that chaining is actually—I'm not trying to say that it's a bad strategy. It actually is still good, and I think you...

[00:18:22] SY: Yeah, it sounds pretty useful.

[00:18:23] VJ: You should consider it. But if you do it well and if you really thought about, "ok, I have two students are in this at this table or I have three elements at this one bucket," chaining itself isn't that slow because if your linked list is short...

[00:18:38] SY: Right.

[00:18:38] VJ: ...searching through it isn't hard.

[00:18:39] SY: Right. Right.

[00:18:40] VJ: And in fact in the context of Big O, chaining averages out to constant search time, which is O(1).

[00:18:46] SY: Oh wow. Oh, that's pretty good.

[00:18:48] VJ: It is still pretty quick because if you have a linked list of two things, that's not that difficult to search through.

[00:18:53] SY: Yeah, that's pretty...

[00:18:54] VJ: Even three things, you...

[00:18:54] SY: That's pretty fast.

[00:18:54] VJ: ...know? But, but you don't want 50 percent of your data...

[00:18:58] SY: Right.

[00:18:59] VJ: ...to be chained at one bucket (Laughing) 'cause that's not slow. Or no, that is slow.

[00:19:03] SY: Well so in that example—yeah—(Laughing)

[00:19:06] VJ: That's not slow. That is slow. Oh my God. Maybe I need to go back to the classroom. I don't know what's happening to me.

[00:19:15] SY: Ok, so in that example, right—so even if I'm doing chaining but I, you know, don't do it quite right and I still end up with 50 B's in my, my one linked list, is that also clustering? Does that count as clustering?

[00:19:31] VJ: It does kind of present the same problem as clustering, right?

[00:19:34] SY: Right.

[00:19:35] VJ: Where your hash function will be like, "all right. I know that I need to quickly go to this, you know, specific key and grab this thing, but when it arrives at that key, it's kind of like, "wait well I don't know which one, so I guess I have to search through this clump of data and figure out what it is." Which is something similar where it was with clustering when all of your data is at one end of the table maybe or, you know, segmented in one section, well now you're not really making retrieval very easy because...

[00:20:03] SY: Right.

[00:20:04] VJ: ...you're not really picking a good hash function that's going to make it easy to compute where something lives because that's, that's the real advantage of hashing when it's done well.

[00:20:14] SY: And that's what's so interesting about the difference between linear probing and chaining because linear probing is when I have that collision, what do I do? Well I just have a very, very, very simple algorithm that says, "go to the next one. Go to the next one. Go to the next one." And so there's really kind of no way to optimize for that. Like it's just, it's the same thing over and over again. Whereas for chaining, the value of it isn't the algorithm that kicks in when a collision happens, it's more of the initial setup. And that setup is determined by the hash function. So it's almost like the better the hash function, you know, the thing determining what the layout of the classroom is gonna be like, the more likely you are to get the most out of that chaining setup.

[00:20:59] VJ: Yeah. And I, I think the, the fact that a linked list is a kind of setup is kind of cool because that's a little bit of the hash table and not the hash function, right?

[00:21:09] SY: Right. Right. Right. Yeah.

[00:21:10] VJ: Because that's like what is the structure of the table? What is its skeleton? And it's like oh, instead of now one item, I got a bunch of pointers to linked lists. And now you're leveraging another data structure within the hash table, which is kinda cool...

[00:21:23] SY: Whoa.

[00:21:23] VJ: ...because I know that linked lists are very good for this. Why don't we just plug 'em in right here and use them because we know that they'll help us here if we, you know, follow a couple constraints.

[00:21:34] SY: Ok, so that hash function is, is really—I almost wanna say like the most important part of this hash table. So what makes a good hash function?

[00:21:44] VJ: That's a great question. You wanna make sure first of all that it's very easy to compute. And what I mean by that is it should be easy to look up something. It should be efficient to retrieve and insert data and find it again in the future because that's, that's kind of the whole point of hash functions, so... (Laughing) and, and hash tables, really. That's like the most important thing about...

[00:22:07] SY: Right.

[00:22:08] VJ: ...what makes a good hash function because if they're not easy to compute, the rest of it is kind of out the door, out the window. (Laughing)

[00:22:15] SY: Out the door and the window. Out the whole building.

[00:22:15] VJ: Out of all the windows and doors and plumbing and everything.

[00:22:18] SY: Any exit. Pick your favorite exit. And when we say easy, are we talking about speed? Are we talking about computational power like, like processes? Like what, what—how are we defining the ease?

[00:22:35] VJ: how quick it is, really.

[00:22:37] SY: Ok. Ok.

[00:22:37] VJ: And whether you've thought about situations where there are collisions, for example. And actually that's, that's a good second point to say is that a good hash table and hash function avoids collisions because if you haven't thought about that, then it's not gonna be easy to determine where something goes or where to get it from because as soon as you're like, "oh, put in the second thing that lives here," and you haven't accounted for it—well now your hash function's kind of stuck.

[00:23:02] SY: Ok.

[00:23:03] VJ: So that, that also has something to do with the ease of it. And, you know, you can have some—there are some hash functions that are computationally they feel complex to us, but for a computer it's not that hard.

[00:23:15] SY: Ok. Anything else that makes a hash function good?

[00:23:19] VJ: I guess one other thing would be—this might seem a little bit obvious but I wanna say it—a hash function should always be able to handle all the input data. It should be able to use all the input data.

[00:23:29] SY: Yeah.

[00:23:29] VJ: And it should always return the same key every single time you put in an element in the same bucket. So it should be...

[00:23:35] SY: That sounds important.

[00:23:35] VJ: ...consistent. Yeah. (Laughing)

[00:23:38] SY: Reliable. Consistent.

[00:23:38] VJ: Can you imagine like sometimes the teacher was like, "oh, sometimes I'll put a student here, and sometimes I'll put it there." And then you're like, "where did I put it? I don't know. I have no, you know, systematic way of doing...

[00:23:46] SY: Surprise. Yeah.

[00:23:47] VJ: ...this. So I guess that's the other thing. But yeah, the things that make a hash function—there's not that many of them, but they're so important...

[00:23:55] SY: Right.

[00:23:55] VJ: ...that we spent three episodes talking about it. (Laughing)

[00:23:58] SY: Awesome. So with our example, we've used this classroom to kind of explain how hash tables work and linear probing and, and all that. Is there a situation where I can expand my classroom? Because I'm thinking if I'm dealing with a ton of data maybe I get some, you know, some students that I, I wasn't expecting or maybe one of the teachers was like, "screw these kids. I'm not doing this anymore, and I need to like make room for a whole second, you know—I almost said a second batch of students—a second like, you know, a set of students, a class of students. Is there an option of expanding the classroom on demand if I realize I don't have room either at these tables or even just there just aren't enough tables?

[00:24:47] VJ: You can expand the hash table, but I would say there needs to be like a caveat here which is that you don't want to expand it and have a ton more buckets...

[00:24:57] SY: Ok.

[00:24:58] VJ: ...because you could very likely end up with a situation where now you just have a student per bucket. And now you've just got a ton of buckets. And now it's just basically like an array.

[00:25:07] SY: Yeah.

[00:25:08] VJ: And now you are like, "oh, to find anybody I have to you know, go in order row-by-row. Really, I think what the solution is is as your data changes or, you know, as you assess what your data looks like and what your hash table looks like, you probably wanna think about your hash function more.

[00:25:25] SY: Ok.

[00:25:26] VJ: Or if you're gonna add buckets then you, you wanna kind of limit them and not go too crazy...

[00:25:31] SY: Right.

[00:25:31] VJ: ...with how many you add. And if you do add some, you really wanna think about what is it gonna be like for me to search through this? And is there a way that I can, you know, rehash my hash function to figure out a way that I can handle this data but not end up with anything, any clustering or some, some (Music) end of the hash table being heavier than the other.

[00:25:51] SY: And that's the end of today's show. If you liked what you heard, leave us a review, and make sure to check out Vaidehi's blog post. Link to that is in your show notes. Also wanna give a huge shout out to Twilio for sponsoring the show. Make sure to check out their awesome API. To get your twenty dollars in free Twilio credit, text your name to 480-485-4321. Link to that is in your show notes. Also wanna give another shout out to DigitalOcean. They are the easiest way to deploy, manage and scale your application. Everything about it was built with simplicity at the forefront. Setting, deploying, even billing. Their support is amazing. They've got hundreds of detailed documents and tutorials. So if it's your first time deploying an app, don't worry. They've got great tools and community to make it nice and easy. Try DigitalOcean for free by going to do.co/codenewbie and get $100 in infrastructure credit. Link is in your show notes. Vaidehi, you wanna say good-bye?

[00:26:50] VJ: Bye, everyone.

[00:26:51] SY: Thanks for listening. See you next week.

[00:26:56] VJ: I, I—wow. I'm, I, I'm trying to say a lot of things, and I just keep saying "I." (Laughing) Yes, we get it, Vaidehi. "I."

Season 4 Episode 3 November 14, 2018

Chaining to the rescue!

Description

Show Notes

Transcript