Description

We end our section of the DFS algorithm with a discussion on DAGs (directed acyclic graphs), because most implementations of depth-first search will check to see if any cycles exist, and a large part of that is based on the DFS algorithm checking to see whether or not a graph is a directed acyclic graph. DAGs are also somewhat infamous in computer science because they’re pretty much everywhere in sofware. For example, a directed acyclic graph is the backbone of applications that handle scheduling for systems of tasks or handling jobs — especially those that need to be processed in a particular order. So let's dig into DAGs!

Show Notes

This episode of the Basecs podcast is based on Vaidehi's blog post, Spinning Around In Cycles With Directed Acyclic Graphs from her basecs blog series.

Transcript

[00:00:02] SY: Welcome to the Base.cs Podcast where we explore the basics of computer science concepts. I’m your host Saron, Founder of CodeNewbie.

[00:00:09] VJ: And I’m Vaidehi Joshi, Author and Developer.

[00:00:12] SY: And she is the brilliant mind behind the Base.cs Blog Series. Today, we’re talking about.

[00:00:17] VJ: Cycles in graphs.

[00:00:19] SY: This season of the Base.cs Podcast is brought to you by Educative. Educative has a ton of online coding courses, including how to become a machine-learning engineer and a practical guide to Kubernetes. The comprehensive and interactive text-based courses give you in demand tech skills. And since it’s text-based, there’s no scrubbing back and forth video to find the content you need. So learn better and faster with Educative and get a 20% site-wide discount by going to educative.io/basecs.

[00:00:55] Okay. So when it comes to graphs, what is a cycle?

[00:00:59] VJ: So a cycle in a graph is basically a way of going from one node back to somewhere that we came from.

[00:01:10] SY: Okay.

[00:01:10] VJ: And we sort of touched on this a little bit last episode when we talked about self-loops and backward edges where basically you can have an edge that either takes you back to where you came from or back to yourself where yourself is like a node. So if you’re traversing a graph and you suddenly realize that you are on an edge that takes you back to somewhere you’ve been, you’ve basically encountered a cycle.

[00:01:41] SY: Okay. So in a graph we have, is it just called a cyclic graph if it has a cycle in it?

[00:01:48] VJ: Yes. So when you have a graph that does have a cycle, it’s a cyclic graph. And on the other hand, when you have a graph that has no cycles in it, that’s known as a “acyclic graph”.

[00:02:04] SY: Very creative.

[00:02:05] VJ: Yes. Yeah, they’re kind of what you would expect in terms of names.

[00:02:09] SY: No surprises. Yeah.

[00:02:11] VJ: But when it comes to a cyclic graph, you just need at least one cycle in the graph and that is enough to categorize it as a cyclic graph. Just one cycle is necessary.

[00:02:21] SY: So recently, we’ve been talking a ton about DFS, Depth-First Search. Is there any connection between DFS and cycles?

[00:02:30] VJ: That’s a good question. So there actually is. In fact, we can use the depth-first search algorithm to help us detect if a graph even contains a cycle and that’s specifically true with graphs that are directed, which means that they have directed edges. So when you have a directed graph, you can use DFS to figure out if that directed graph also has a cycle in it.

[00:03:03] SY: Why is that important? What What’s the point of having these cycles? What do they do for us? Why do we care?

[00:03:09] VJ: So a lot of the times what we actually care about is sort of the opposite and what I mean by that is we care that our graph doesn’t have a cycle. A lot of the times we don’t want any cycles in our graph because there’s a specific type of data structure that is called a directed acyclic graph, which is also known as a DAG.

[00:03:34] SY: DAG.

[00:03:38] VJ: Yes. You have to say it exactly like that.

[00:03:39] SY: Just like that.

[00:03:41] VJ: And a lot of the times we use directed acyclic graphs to implement other things. In fact, they’re pretty common in computer science and you usually, if you have a graph that you’re traversing through and you want to use a DAG, you might want to check that it doesn’t have any cycles in it or you might have a directed graph and you want to see if there are cycles in it, in which case you would know you’re not dealing with a DAG, you’re dealing with a directed cyclic graph. A directed cyclic graph, a DCG, instead of a DAG.

[00:04:23] SY: Oh my God! That was amazing.

[00:04:25] VJ: This is why nobody likes DAGs. They’re hard to say when you use abbreviations.

[00:04:32] SY: Oh, man. Okay. So we care about these things called DAGs. Where would I see a DAG? Where does it come into play?

[00:04:41] VJ: So I mentioned that they’re sort of allover computer science. A couple examples. A DAG is basically used in dependency graphs. So they’re kind of like a fundamental concept when it comes to resolving dependencies or handling, updating of packages, like if you’ve used a package manager or something like a Gemfile if you use Ruby or NPM or Yarn, like all of those libraries, they’re all managing packages and they all care about which packages are dependent on one another. And so if you think about each package as a node in a graph and the various dependencies of all the different packages like one might rely on another and another might rely on a third and a fourth. And if you sort of visualized it, it would basically be a directed acyclic graph because each of the dependencies would be a node in the graph and the dependency that is dependent on something else, a different package, would be like the edge that connects them. And it’s important that those don’t have any cycles because you would not want one dependency that you’re reliant upon to be dependent on something that it relies upon.

[00:05:57] SY: Right.

[00:05:58] VJ: Now you have a cycle and then it’s like how do you resolve that. Seems bad, especially if one had like some sort of update and it was like not backwards compatible or if it had like a security issue, like I don’t even know what you would do, but this is why you have DAGs.

[00:06:16] SY: Wow. Okay.

[00:06:18] VJ: And they’re also in like job scheduling and task scheduling when you have processing in an application and also state machines as well. There are certain times when you’re trying to track the state of some data or some object and you don’t want it to go back to a prior state, which means you don’t want it to be able to reverse back to somewhere that it came from, which is just a DAG.

[00:06:41] SY: Yeah, because if it were to refer back to where it came from, it would be a cycle and we don’t want that.

[00:06:46] VJ: Yeah, and if you like imagine something like job scheduling, you wouldn’t want a job to like have a self-loop or a backward edge because then it could just schedule itself infinitely and that seems bad.

[00:06:57] SY: That’d be really bad. Oh, boy. Okay. So I totally get the DAGs are super important, but we said earlier that we can use DFS to detect cycles in a graph, basically like making sure that the DAG that we think we have is actually a DAG. How do we do that?

[00:07:15] VJ: We talked about DFS using a stack data structure under the hood, right?

[00:07:20] SY: Yes.

[00:07:21] VJ: And in previous episodes, we talked about how as you sort of go deep into the graph and you go one node at a time and you visit the child of a child of a child, you just keep adding a node on top of the stack. So what that means is as you start visiting one node, you don’t really finish visiting it until you hit the last possible child, which means you hit a dead end and then you can pop all the nodes off the stack.

[00:07:48] SY: Right.

[00:07:49] VJ: But if you’re using a stack data structure under the hood and you’re using depth-first search, you can effectively keep track of which node you are searching through to begin with. So what I mean by that is if you start looking at a node before you really start looking at its children and going deep down the DFS path, you can basically like add a little flag or a note to yourself to know, “Oh, I’m looking through Node U.” And this is the first node I’m starting with and this is the one that’s being processed. And so it’s kind of like a little reminder to yourself that this is the node that I’m starting with. And as you keep looking at its child and then the child of its child as you run DFS, we’ll recall that as we visit a node we look at who its children are, right? And we pick a child and we go down deep into that path. So if you go and visit a node and you notice, “Hey, this node has an edge that goes back to the node we started with. Wait a second. We’re already processing that node. We’re already visiting it. How come we already have an edge that’s pointing back to it?” That edge tells us there’s a cycle in the graph because you basically have something that references where you started from.

[00:09:13] SY: Okay. So it’s those little flags that help us see that.

[00:09:16] VJ: Yeah. And we can sort of like have a flag to help us see where we started from, the node that’s currently being processed, and you can also do that for pretty much any node you’re looking at internally because if you think about it, if I’ve added a node onto my stack and I’m looking at its great-great-great-great-great-great-grandchild and I noticed that that great-great-great-great-grandchild has a reference to one of the nodes already in the stack, then that tells me I’ve already looked at something that it’s referencing. I got another edge that takes me back to where I came from and that tells me, “Oh, I’ve got a backward edge. I can go back to something that’s in the stack, which means I have a cycle.”

[00:10:02] SY: Okay. So basically we can use DFS to see if there’s a cycle. And if we do have one, then we know we don’t have a DAG, in which we’re very sad because we like DAGs.

[00:10:14] VJ: Yeah. We’re sad without DAGs.

[00:10:16] SY: Sad without DAGs. Okay. So this episode, we talked about cycles, we talked about cyclic versus acyclic, we talked about DAGs, which is our new tool that we think is really cool and that is used everywhere.

[00:10:29] VJ: Our new best friend.

[00:10:30] SY: Our new best friend. That’s right. And it feels like we’ve spent a lot of time in the DFS world. We’ve touched on like a bunch of things and I think this is the last of depth-first search for this show.

[00:10:46] VJ: I know. It’s so sad, but it’s okay.

[00:10:48] SY: Oh, man! Oh we got to move on, right? We got to move on to other algorithms and other topics. We spent a good chunk of time just digging into DFS and seeing all there is that it can do. So go DFS.

[00:11:01] VJ: Yeah. So much to learn. You can go so deep into DFS.

[00:11:10] SY: And that’ll be the last deep joke we make on DFS. And that’s the end of today’s show. If you liked what you heard, please leave us a review and make sure to check out Vaidehi’s blog post. Link to that is in your show notes. I also want to give a huge shout-out to Educative, the text-based online coding course platform. My producer and I got a chance to give it a try. I took “become a front-end engineer”. And Levi, what did you take?

[00:11:37] LS: I took “Learn Python from scratch”.

[00:11:41] SY: And what do you think?

[00:11:41] LS: I thought it was really great. So I’ve been learning Python a little bit and this sort of like…

[00:11:46] SY: Because I’m making you.

[00:11:48] LS: Because you’re making me. True. You’re not making me learn Python specifically.

[00:11:51] SY: That’s true.

[00:11:52] LS: But it is what I chose. So I was a little bit familiar, but I found that these courses, they’re laid out in a really intuitive way, there’s like the sections that lead seamlessly one into the other, the Python 1 goes from data types and variables, to conditional statements, functions, loops, and then each section has quiz, to make sure that you’re not just blasting through and like the information is going one way and not the other.

[00:12:21] SY: I love the quizzes.

[00:12:23] LS: Yeah. It really called me out on my BS.

[00:12:24] SY: Yes.

[00:12:26] LS: I was like, “Yeah, yeah, I get it. I get it. I get it.” And then I took the quiz and they’re like, “You don’t get it.” And I was like, “Aha!”

[00:12:33] SY: You got me.

[00:12:33] LS: You got me. And then throughout all of these different sorts of sections, you can code within the service itself. So you don’t need an external coding thing. I should know what that’s called. Do you know what that’s called?

[00:12:49] SY: I’m not going to tell you. I’m not going to give you -- it’s an IDE.

[00:12:51] LS: No. Yes. That’s it, an IDE. Did you know I’m a producer for a coding podcast?

[00:12:58] SY: A tactical podcast.

[00:12:59] LS: Yeah.

[00:12:59] SY: Two actually, two technical podcasts.

[00:13:01] LS: That’s true.

[00:13:02] SY: So if you are interested in learning how to code, make sure to check out Educative and you can actually get 20% off by going to educative.io/basecs. That’s B-A-S-E-C-S. Check it out. This episode was edited and mix by Levi Sharpe.

[00:13:20] VJ: Bye everyone.

[00:13:21] SY: Thanks for listening. See you next week.

Thank you for supporting the show!

Thank you for supporting the show!