#### Description

In this episode, we continue our talk on Radix Trees and introduce the Practical Algorithm To Retrieve Information Coded In Alphanumeric trees, also known as PATRICIA trees. Yeah, I think we'll just stick with calling them PATRICIA trees.

#### Show Notes

This episode of the Basecs podcast is based on Vaidehi's blog post, Compressing Radix Trees Without (Too Many) Tears from her basecs blog series.

#### Transcript

[00:00:02] SY: (Music) Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.

[00:00:09] VJ: And I’m Vaidehi Joshi, Author and Developer.

[00:00:12] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today we are continuing our conversation on.

[00:00:18] VJ: Radix Trees.

[00:00:20] SY: This season of the Base.cs Podcast is brought to you by Square. If you’re designing a website or building a mobile app, you’re probably going to want to take payments at some point. Square APIs allow you to easily implement their payment form on your website. And with their In-App Payments SDK, you can add it to your mobile app too. You don’t have to worry about dealing with PCI compliance because they’ll take care of that for you. Don’t let anything come between you and your money. Start building with Square over at squareup.com/go/codenewbie.

[00:00:55] SY: Let's do a quick recap of radix trees. What are they?

[00:00:58] VJ: Radix trees as we've talked about in previous episodes are basically kind of compressed trie data structure.

[00:01:07] SY: Yes.

[00:01:08] VJ: And we've also known them by other names, like compact prefix tree. Basically all they are, are space optimized versions of those trie data structures and the big difference between radix trees and tries is that you can contain the entire sequence of a string within a single node of the radix tree rather than just…

[00:01:29] SY: Which is very cool.

[00:01:30] VJ: Yeah, it's especially cool because when you compare it to a trie you can only contain a single character element. So rather than just containing one character, you can kind of compress it and contain a substring and that's what makes a radix tree.

[00:01:44] SY: Tries are that thing that allows us to do alphabets, right? They allow us to have all the alphabets and to save keys as alphabets.

[00:01:52] VJ: Yeah. They allow us to save key value pairs where the key is some sort of string and that's where the alphabets come in and the value is something associated with the key and they are sort of like a little bit related to hash tables in the sense that you have a key and you want to retrieve a value, you know where to go get it, but a trie and a radix tree are just a little bit different versions of that and different type of data structure.

[00:02:13] SY: Yeah. They’re a little bit funky. I like them though. I like that they have letters involved. That's very important to me. Okay, radix, let's talk about the word radix. I think we mentioned this in a previous episode maybe. Refresh my memory, what is a radix?

[00:02:26] VJ: So a radix is exactly what we have kind of encountered previously when we were talking about number systems. So we talked about like Base 2, Base 10. The base of a counting system is basically the number of unique digits in that system. So we've been calling it base a lot, but really that's also just the radix of that number system.

[00:02:46] SY: We talked about radix trees and we talked about how radix trees have many other names, why is it called a radix tree? What does this tree or this trie structure? What does that have to do with base anything?

[00:02:58] VJ: The radix part of the radix tree has to do with how it’s kind of interpreted by our machine, the way that it's sort of read and parsed. And in order to really understand the radix of the tree, you have to think about how the keys inside of that tree are read.

[00:03:20] SY: Read by the machine?

[00:03:20] VJ: Exactly, yes, how they're read, how they're interpreted by the machine when it walks through the tree, when it tries to retrieve a value. And in a radix tree, the keys, which is basically those strings, they're read in bits, which we’ll recall from back in Season. 1. Bits are binary digits. They’re single digits inside of a byte. So when you're looking at a radix tree and you're reading the keys, you're basically reading them in a certain amount of bits at a time. So you could say, “I'm going to read it a byte at a time. I'm going to read it half a byte at a time. I'm going to read it to bits at a time,” and that tells us like the way that I'm reading this tree, the number of bits that I'm reading at a time, that's where that radix comes from because that's the base number of bits that you can process while reading the keys of the tree data structure.

[00:04:08] SY: So when we talk about bits and bytes and binary and all that, we're talking about a Base 2.

[00:04:13] VJ: Correct. Yes.

[00:04:14] SY: Okay. So can we walk through maybe an example of that or explaining how that actually works?

[00:04:20] VJ: Yeah. Before we get into an example, I just want to point out that you could basically read the keys of a radix tree in different ways, right? I think we should do an example that's kind of a little bit simpler and let's just say that we have a radix tree data structure and we are just going to process the keys. We're just going to read the keys one bit at a time. Let’s keep it real simple.

[00:04:43] SY: I'm down for simple.

[00:04:43] VJ: Oh, yeah. All we can do is read it one bit at a time because, you know, we'll remember bits are just zeros and ones.

[00:04:49] SY: Because we’re busy. We have other things to do. Just one at a time, please.

[00:04:52] VJ: I know, which is funny that you're like, “We're busy.” But that's a lot of zeros and ones to read.

[00:04:56] SY: That’s true.

[00:04:58] VJ: And if history has shown us anything, we get real tired of going through zeros and ones on this podcast.

[00:05:04] SY: Maybe the problem is we have too much time on our hands.

[00:05:08] VJ: In any case, there's actually a name for this type of radix tree, the one where you process keys one bit at a time and that radix tree is called “PATRICIA Tree”. That’s spelled P-A-T-R-I-C-I-A. It actually stands for Practical Algorithm to Retrieve Information Coded in Alphanumeric.

[00:05:31] SY: Oh, that got complicated real fast. I was like PATRICIA is a nice name. It’s very elegant, but still accessible. It’s got good nicknames, Pat, Patty. You know what I mean? There’s like lots of things you can do, but no, it actually stands for something that sounds a little intimidating. That's okay though.

[00:05:49] VJ: I think we can make it a little less intimidating by sort of breaking down what it is. And if we break it down, we’ll realize it's actually not too bad. I promise. So we have this PATRICIA tree, which is a radix tree basically, and there are really only two things to remember about it, two important things. So first, when you have a PATRICIA tree and you're reading the keys in it, you're trying to retrieve the keys out of it, when you're reading it, you are only reading one bit at a time. That's something we knew because I just told you about that earlier. We're reading keys in streams of bit and basically what that means is as we read, we compare one bit at a time.

[00:06:26] SY: Wait. So does that mean that in every comparison, I'm just comparing two bits basically because I'm comparing one bit to another bit?

[00:06:33] VJ: Yes. We're basically like looking at one bit at a time and when you're comparing it to something else, you're comparing one bit to another bit.

[00:06:39] SY: So at any given time, I can only really look at two bits.

[00:06:43] VJ: Correct. Yes.

[00:06:43] SY: Okay. Cool.

[00:06:44] VJ: And actually that's a good point because when you're thinking about the fact that there's only two bits, you can really think about it at any given time. When you're thinking about how this radix tree is going to be represented, every single node in a PATRICIA tree only has two possible children. So basically it has a two-way branch and that makes it a binary radix tree.

[00:07:06] SY: Oh, things are coming together. Worlds are colliding. Look at that. We’re back to binary trees. That's fun.

[00:07:14] VJ: And I think that the interesting thing about this is it's going to come together even more. We know that binary is zeros and ones, right?

[00:07:20] SY: Yes.

[00:07:20] VJ: And we know that this PATRICIA tree is going to be a radix tree that only has two branches and we're only going to like consider one bit at a time when we're reading and we can only like have two possible options, zeros and ones. So if you're looking at a binary tree, can you imagine how it's going to be structured with 0 and 1 as its children?

[00:07:42] SY: If I remember how binary trees work, there's a rule about how the left and right nodes relate to each other, right? So the left node has to be less than the right node. So if that is the case and I'm comparing my zero and my one, then I assume my zero would have to be my left node and my one would have to be my right node.

[00:08:02] VJ: Exactly. So every single, little pair of children, because I said we have a branch of two possible options, right? The zero and one, if we're going to follow the rule of a binary tree because it's a binary radix tree, then we have to have zero be on the left because it's smaller than one and one is going to be on the right. So the left node will be a value zero and the right node is going to be value one and it's only ever going to be zeros and one.

[00:08:26] SY: Okay. So that two-way branching all connected to the idea that it's binary, all connected to the radix, which is two, so all this is really coming together. I like this.

[00:08:36] VJ: I think what's interesting about it is with a PATRICIA tree, I know we were talking about like comparing bits, right? When we are looking at a PATRICIA tree, you're only going to compare a single bit at a time, which basically means as you walk down the tree and you read it, you're only looking at one bit at a time, which is why you only have those two branches because a bit has two values possibly, right? It could be zero or it could be one. So as you look at one bit at a time, every single time you come to a new bit, you're like, is it a zero? Is it a one? If it's a zero, I go to the left node. If it's a one, I go to the right node. That's sort of how we're going to walk through the radix tree. And the reason I want to clarify is I know earlier you were asking me about comparing bits. And with a PATRICIA tree, it is only one bit at a time, but we'll see how that comparison actually happens. Actually, maybe we can look at it now with an example.

[00:09:29] SY: Yeah, because when we were talking about tries previously, we were talking about Dog and Doge and Dogs and kind of how those create different nodes and how the nodes relate to each, et cetera, et cetera. So if we were going to let's take that array, let's take the array with the key, Dog. Key Doge, and key Dogs, and we were to translate it to this PATRICIA tree situation, what would that look like?

[00:09:55] VJ: Well, I think the first thing that we want to do is figure out what those keys look like in binary because we know that ultimately it is a PATRICIA tree so we have to read it in bits so we need to know what the bits are so we can figure out how to read these things because they will be in the tree, but in order to retrieve them and read our way through the tree, we have to deal with bits. So my favorite part, we are going to have to do some binary today.

[00:10:27] SY: Yes. Are you excited? You have to take a nice, big, deep breath. Are you ready?

[00:10:32] VJ: Oh, yeah. This reminds me of like back in like Season 1 when I was like, “Let's spell out our names.” And I was like, “Oh my God! I was so confident.”

[00:10:40] SY: You have so many letters.

[00:10:42] VJ: I know.

[00:10:43] SY: Oh, that do not translate well to binary.

[00:10:45] VJ: Well, today we're just going to deal with something a little bit shorter. I think we have three keys that we’re dealing with, Dog, Doge, and Dogs, and I have a suggestion. I think we should just go through the binary for Dog.

[00:10:59] SY: Okay, that makes sense. They all have Dog in common.

[00:11:01] VJ: Yes. And we’ll just say, “This is the binary for Dog and we'll worry about Doge and Dogs later.” So if we're going to turn Dog into binary, it is as follows: 01100100, 01101111, 01100111.

[00:11:23] SY: Yep, that makes sense. Sounds like Dog to me.

[00:11:26] VJ: Yeah. When you hear that, don't you just think Wolf? You'll notice the way I read it, I tried to make it obvious how it breaks into bytes because I read eight binary digits at a time and eight digits is one byte. So we have three bytes to represent the three characters D, O, G, Dog.

[00:11:45] SY: So now that we have that in binary, which is very fun, I really enjoyed that, what do we do with that?

[00:11:50] VJ: We can basically think about how we would represent this in the form of a radix tree and we know that radix trees are basically compressed versions of tries, right? So we could represent D, O, G since it's a substring of Doge and Dogs at one single node.

[00:12:10] SY: Okay.

[00:12:10] VJ: And it would just be the substring Dog and we know that you can read binary, read those three bytes and get to the string, Dog. Now the interesting thing is if we have a node Dog and we know that we have two words that are going to sort of stem from the word Dog, Doge and Dogs, now we need to consider how Doge and dogs are going to fit into the node Dog, which is again just a compressed version of a trie structure that we've talked about over the past few episodes.

[00:12:43] SY. Yes. Okay.

[00:12:44] VJ: I just want to reiterate, it's not anything different than what we've already done before.

[00:12:48] SY: That is comforting. Thank you.

[00:12:50] VJ: So let's just start with the word Doge first. Spoiler alert, the first three bytes of Doge are exactly the same as Dog.

[00:12:59] SY: Yes, because they both have D, O, G.

[00:13:01] VJ: Exactly. Things get interesting as soon as you get to the last bit that matches Dog because Doge has three bytes, but it actually has a fourth byte as well.

[00:13:11] SY: Which is for the letter E.

[00:13:12] VJ: Exactly, because it was another letter, so it needs more binary, it needs more bits to go to that space.

[00:13:19] SY: Yes.

[00:13:20] VJ: And so the moment you have to represent Doge, you have all the binary to represent the word Dog and now you have the letter E. You need more space for it. You need more binary for it so you need to add more binary digits. You need more bits in there. The way that you represent the letter E in binary is 01100101.

[00:13:41] SY: Okay. So it's basically imagine all the binary for Dog plus what I just said and that's how you represent Doge. So if I were to ask you, where does the binary for Doge and Dog diverge? Where do they become different? What would you say?

[00:13:56] SY: They become different after the last bit I guess of Dog because they become different at what is 3 times 8, 24, it’s basically at the 25th place is where we see a difference where one is blank and one is zero.

[00:14:13] VJ: Exactly. The 25th digit or basically bit number 25 because in Doge we have zero there, but in a Dog we have nothing because Dog ended with the 24th digit, the 24th bit. So basically at the point where Doge differs from Dog at bit 25, we need to look at whatever the value of that is in order to determine where to put Doge, where to put that string in relation to its substring Dog.

[00:14:47] SY: So we basically need to look at it and determine is this a zero or is this a one.

[00:14:51] VJ: Exactly. You can't write on it because there's only two options and remember you only have this left child and right child, right? We have like left for zero, right for one. So we only have two possible values that are going to be in that 25th bit digit place. So we need to look at what's there. Do you want me to tell you what's there again?

[00:15:09] SY: I think I can do it. I think I can do it. Okay. Okay. So we said E is 01100101.

[00:15:17] VJ: Yup.

[00:15:17] SY: So the first bit is zero. So we already said that if it's zero, it needs to be on the left side, and if it's a one, it needs to be on the right side. So because it's a zero, it's going to go in the left node.

[00:15:29] VJ: Exactly. So because it diverges at that bit and because that bit has a value, we can look at the value of the bit where it diverges and figure out where should our value Doge go.

[00:15:43] SY: Does it need to have a right node?

[00:15:46] VJ: It doesn't need to have a right node, but you can imagine if we had, I'm actually not even sure what kind of binary digit because it wouldn't even be a letter because letters can all fit within seven bits. But maybe you had like an emoji or like some other characters and thing, something that requires more than seven bits. You need like basically a one at bit 25. If you have a one at bit 25, then you know that you would have a right child. And actually, this is a great segue into what to do with Dogs.

[00:16:17] SY: Yeah.

[00:16:18] VJ: And I think what we need to do is the same thing again. We need to figure out how to represent Dogs in relation to Dog and figure out where that divergence is and then we'll know how to place it.

[00:16:29] SY: Okay. So in order to do that, we need to figure out what is Dogs in binary. So basically, we already have the Dog part. But what is the S in binary?

[00:16:38] VJ: The S in binary is 01110011.

[00:16:47] SY: Okay. So some of that feels a little familiar. How does that compare to the E in Doge?

[00:16:52] VJ: Good question. A lot of it is the same. There are some overlaps. So you'll remember dog was three bytes, Doge was four, Dogs is also four bytes, and some of the bits for that fourth letter are almost the same. So E in binary, the E from Doge is 01100101 versus S, which is 01110011.

[00:17:22] SY: Okay. So the first three numbers are the same.

[00:17:25] VJ: Exactly. Those first three bits for E and S in binary are the same. The point where they diverge is that fourth bit. So basically, the fourth bit of the last byte which would make it the 25th, 26th, 27th, 28th bit in this whole series.

[00:17:44] SY: So it's almost like even within E and S, there's like a substring there.

[00:17:50] VJ: Yes. There's a binary substring.

[00:17:53] SY: Oh, you can do that. Okay, that’s good to know.

[00:17:56] VJ: I don't actually know if they're called binary substrings, but that's the way I think about it.

[00:17:59] SY: They should be. You know what? We've invented that. We’ve decided it. So there is such a thing as binary substring, because yeah, I kind of thought, well, the whole byte needs to match up because we think of letters, like we don't think of E and S is overlapping in the English language. So that is fascinating that it still has some overlap, that it has some commonalities when you translate it into binary.

[00:18:23] VJ: It was good we start with Dog because it's kind of very clear to us as people who can speak, read and write English, D, O, G repeated, you’re like, “Oh, that's the same.” That is yeah clearly a substring, but in binary, it's not as obvious to us, right? Until I spelled out the fact that they both have zero as the first digit, one as the second digit, one as the next digit, and then the fourth digit is where things change, it wouldn't have been clear otherwise, but computers read in binary. So for them, that is basically a substring in numerical terms.

[00:18:56] SY: They don't care. They don't care about our English at all. Yeah. Okay. So up until the 28th bit, they are the same. So what happens when they start to diverge? What do we do with that?

[00:19:08] VJ: Earlier when we are looking at Dog and Doge, we saw that they diverged at bit 25.

[00:19:13] SY: Yes.

[00:19:13] VJ: So at bit 25, we added a left child, right?

[00:19:19] SY: Yes.

[00:19:19] VJ: So in this case, we know that Dogs diverges from Doge, because binary and because numbers, we know that they diverge at bit 28. So what we're going to do is at bit 28 where they diverge, that's where we will put the rest of the value for Dogs basically. I know there's a little bit hardness like to kind of reason about because you have to think about it in terms of like everything being represented bit by bit and it's a little bit hard for us to reason about and that's okay. I think something I also want to point out is that this PATRICIA tree thing is part of a larger algorithm that's way more complex than I'm even making it out to be, but this is fundamentally what it's doing. So I think we can just understand if we can try to wrap our heads around what it's kind of doing under the hood, I think that's a big win even if we can't necessarily visualize all of it.

[00:20:14] SY: Cool. I'll take it.

[00:20:15] VJ: So anyways, I guess we should put these Dogs in their correct place. At bit 28 where they diverge, we should look at what the divergence value is and at a child node as we deem appropriate. So in this case at bit 28, I'll just save us all the misery of going through the binary again and I'll just tell you. At bit 28, Dogs, the letter S diverges and it has a one versus Doge which has a zero at the 28. So basically at that point, we will look at the value of Dogs at that bit and we'll say, “Oh, it's a one.”

[00:20:52] SY: So we put it to the right.

[00:20:53] VJ: Exactly. We know to put it to the right because it’s only zeros and ones so we have to put it to the right. And that's how we can basically read the PATRICIA tree and interpret the strings inside of it from a binary perspective.

[00:21:09] SY: Okay. So if the 28th bit for dogs is a one, that's going off to the right, the right node, then that means that the 28th bit in Doge, which is a zero, has to go to the left.

[00:21:25] VJ: Yes. And then for the rest of the bits in Doge, so for example, the next bit in Doge is actually a zero, that's going to go to the left also. You can kind of imagine this tree as literally just zeros and ones. And as you read the bits in a letter or in a word, you basically are looking at the left child or the right child. So you could read the bits for Doge and keep going up until the 28th bit and then suddenly, you could, at the 28th bit, you could just start reading Dogs. And by reading, I mean, you're reading the binary or a representation for it.

[00:22:07] SY: I love reading Dogs. It’s my favorite weekend activity.

[00:22:11] VJ: You're basically just traversing down the tree in the same way like when we were looking at tries and we were kind of reading those and we were like, “Oh, look, this word is a subset of this other word and this string can take us down three different paths, three different endings.” In the same way when you're dealing with it in a radix tree, it's the same idea where you can follow the same bits down a single path. And then when there's a divergence, that means you could read a different set of binary and end up with a different word. But really it's all just representations of the same thing in this case rather than strings and characters at each node we’re looking at zero or one.

[00:22:50] SY: What do we do with this information? What should we take away from all of these bits and bytes and binary and all the PATRICIAs? What do we do with PATRICIA? Where do we put it? Where it should go?

[00:23:02] VJ: As far as PATRICIA goes, we can learn more about that algorithm, which as I mentioned we've done a simplified version of this algorithm and this is actually like a super computer science-y like, you know, people have spent a lot of time thinking about how to represent these trees from a very low level perspective, which is why in this episode we're getting into binary and bytes. But I guess the reason that I think they're interesting to learn about is because this is sort of our first foray into more complex tree data structures. And with the radix tree, with tries, with PATRICIA trees, which are all building on each other, we basically have come really far from when we first started with just trees and binary trees.

[00:23:48] SY: Right. Oh, yeah. We've grown all the way up.

[00:23:50] VJ: Yeah. We've gone from just like, “Oh, I can only represent numbers. Oh, I can only have two children.”

[00:23:57] SY: Yeah.

[00:23:58] VJ: But now we are like, “Oh!”

[00:24:00] SY: You can have all the babies. Yeah.

[00:24:04] VJ: Now we can reproduce that will, I guess, in a more complicated way. But I guess this is really exciting because like going forward in our next season, we're going to start getting into more complicated tree structures, but we can sort of start to have a little bit of foreshadowing that they're all following a lot of the same rules, what makes them different is how they are represented, what they're used for, but they're ultimately just the same fundamental structures that we've been dealing with that have been sort of modified for certain needs. That was the cool thing with tries, right? We saw how they can be used to represent words and the same thing with compressed radix trees. And so we're going to see more of that going forward and this is just sort of our first little tasting of what's to come.

[00:24:59] SY: Very cool. And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. The link to that is in your show notes. Also want to give a huge shout-out to Square. Square has APIs and SDKs to make taking payments easy whether you’re building a mobile app in iOS, Android, Flutter, or React Native. If you want to embed a checkout experience into your website, it’s super easy. Square has processed billions of transactions so you know it’s tried and true. Start building with Square over at squareup.com/go/codenewbie. This episode was edited and mixed by Levi Sharpe.

[00:25:38] VJ: Bye everyone.

[00:25:39] SY: Thanks for listening. See you next week.

Thank you for supporting the show!