How to Audit Your Code With AST Programming

Create a custom script to audit a codebase

This lesson is part of the Practical Abstract Syntax Trees course.

  • [00:00 - 00:13] As you looked through the codebase, you likely noticed several usages of button elements. Since this specific codebase is React, a common pattern is to create a single button component that can be reused in all these places.

    [00:14 - 00:23] This makes it easier to change the internal implementation in one place and provides consistency. Let's walk through using AST-based tooling to make this change.

    [00:24 - 00:35] The first step in this process is understanding the existing codebase. To create this new component, we need some information about the current button usages. What kind of styles or variants need to be supported?

    [00:36 - 00:44] What kind of props, options, or functionality need to be supported? In a smaller codebase like this, it might be tempting to manually look through and read all the files.

    [00:45 - 00:54] But let's try using AST-based tooling to answer these questions. AST-based tooling follows many of the same steps you would take if you were manually auditing the codebase.

    [00:55 - 01:02] If doing this code audit manually, the first step would be to locate the relevant files. Then, you would need to find any button elements.

    [01:03 - 01:13] And finally, you would look at the props or options and styles to discover what the button component needs to support. A custom script using an AST can follow these same steps but automate them.

    [01:14 - 01:19] Let's start by creating a new project. First, create a new directory for the audit script.

    [01:20 - 01:32] Ideally, this directory will be a sibling to the flashcards directory, since it will be referenced in the code below. Then, change into the new audit directory and initialize a new package.json file.

    [01:33 - 01:40] Finally, we'll add the necessary dependencies. For this audit script, we'll use the Babel parser and Babel traverse packages (@babel/parser and @babel/traverse).

    [01:41 - 01:48] We'll use glob, which we'll cover in a moment, and ts-node and TypeScript for writing the audit script in TypeScript. And finally, the type packages.

    [01:49 - 02:03] In this case, we need the types for the Babel traverse package and for the glob package. The Babel parser package already includes first-party types. Then, create a new file named audit.ts.

    [02:04 - 02:21] Let's start by importing fs, the Node file system package, and glob from the glob package. The first step is to find all relevant files.

    [02:22 - 02:32] For this codebase, that's all files in the source components directory. As mentioned earlier, let's assume that the flashcards directory is a sibling to this current directory.

    [02:33 - 02:43] If this isn't the case, use a slightly different path to match that. Then, we'll use the glob package to synchronously find all of the files that match the given glob in the codebase.

    [02:44 - 02:51] This double star (**) will match any number of directories. The single star followed by .js (*.js) will match all of our component files.

    [02:52 - 03:05] So this method will return all of the files in our codebase that match this glob pattern, and these are the files that we want to search. Next, let's loop over each of these files and read the file contents.

    [03:06 - 03:18] Here, we're using the file system readFileSync method to read the file contents and convert them to a string. For now, we're printing that to the terminal.

    [03:19 - 03:34] Running the script will print the file contents of all of the relevant files in the flashcards codebase. This completes the first step of finding all of the relevant files.
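
The file-finding and file-reading step can be sketched without any dependencies. This is a minimal illustration, not the lesson's script: it swaps the glob package for a small recursive walk using Node's built-in fs module, and the helper names (findFiles, readAll) and the demo file names are hypothetical.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Recursively collect every file under `dir` with the given extension.
// This approximates what a "**/*.js" glob would match under that directory.
function findFiles(dir: string, extension: string): string[] {
  const matches: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      matches.push(...findFiles(fullPath, extension));
    } else if (entry.isFile() && path.extname(entry.name) === extension) {
      matches.push(fullPath);
    }
  }
  return matches.sort();
}

// Read each matching file into a string, keyed by its path.
function readAll(files: string[]): Record<string, string> {
  const contents: Record<string, string> = {};
  for (const file of files) {
    contents[file] = fs.readFileSync(file).toString();
  }
  return contents;
}

// Demo against a throwaway directory (hypothetical file names and contents).
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "audit-demo-"));
fs.mkdirSync(path.join(demoRoot, "nested"));
fs.writeFileSync(path.join(demoRoot, "App.js"), "<button>Save</button>");
fs.writeFileSync(path.join(demoRoot, "nested", "Nav.js"), "<nav />");
fs.writeFileSync(path.join(demoRoot, "styles.css"), ".button {}");

const demoFiles = findFiles(demoRoot, ".js");
const demoContents = readAll(demoFiles);
console.log(demoFiles);
```

In the audit script itself, the list of files would come from the glob package instead, and the loop body would go on to parse each file rather than only storing its contents.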

    [03:35 - 03:40] The next step is to find the button elements. This is where an AST becomes helpful.

    [03:41 - 03:48] The previous parsing and traversal script can be adapted for this. We first need to know which nodes to traverse in the AST.

    [03:49 - 03:58] This is where AST Explorer comes in handy. Open one of the files in the flashcards codebase and copy one of the button snippets.

    [03:59 - 04:05] Note that we chose a small piece of code instead of the entire file. This makes it easier to explore the generated tree.

    [04:06 - 04:22] While we chose this specific snippet, any of the button usages could have been chosen. Clicking directly on the opening button element tag in the AST Explorer's editor should highlight the JSX identifier node, which contains the element's name, button.

    [04:23 - 04:35] Its parent is a JSX opening element with an attributes property. This property contains an array of all the attributes or props associated with this element.

    [04:36 - 04:45] The JSX attribute nodes will all appear similar, but inspect the name and value properties of each. These will contain nodes that represent the name of the prop and the value.

    [04:46 - 05:01] These JSX attribute nodes define the names of the props used, so we can see which element attributes and functionality need to be supported. Additionally, we can inspect the class names used in the className prop to determine which kinds of styles need to be supported.
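
To make the tree shape described above concrete, here is a hand-written, abbreviated version of what AST Explorer shows for an opening button tag. The interfaces are simplified stand-ins for Babel's JSXOpeningElement, JSXAttribute, JSXIdentifier, and StringLiteral nodes (real nodes carry extra fields such as location data), and the snippet `<button type="button" className="button">` is a hypothetical example rather than one taken from the flashcards codebase.

```typescript
// Simplified stand-ins for the Babel node types (assumed, trimmed shapes).
interface JSXIdentifier {
  type: "JSXIdentifier";
  name: string;
}
interface StringLiteral {
  type: "StringLiteral";
  value: string;
}
interface JSXAttribute {
  type: "JSXAttribute";
  name: JSXIdentifier;
  value: StringLiteral | null;
}
interface JSXOpeningElement {
  type: "JSXOpeningElement";
  name: JSXIdentifier;
  attributes: JSXAttribute[];
}

// Roughly the tree for: <button type="button" className="button">
const openingElement: JSXOpeningElement = {
  type: "JSXOpeningElement",
  name: { type: "JSXIdentifier", name: "button" },
  attributes: [
    {
      type: "JSXAttribute",
      name: { type: "JSXIdentifier", name: "type" },
      value: { type: "StringLiteral", value: "button" },
    },
    {
      type: "JSXAttribute",
      name: { type: "JSXIdentifier", name: "className" },
      value: { type: "StringLiteral", value: "button" },
    },
  ],
};

// The element name lives on name.name; each prop name lives on attribute.name.name.
const elementName = openingElement.name.name;
const propNames = openingElement.attributes.map((attribute) => attribute.name.name);
console.log(elementName, propNames);
```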

    [05:02 - 05:19] Now that we know how to find only the elements named button, let's update the script to use Babel parser to generate an AST for the file contents, then Babel traverse to target only the button element nodes. Let's start by importing the parse and traverse functions.

    [05:20 - 05:40] Now let's use the parse function to parse the contents of the file into an AST. It will also require a little bit of additional configuration because now we're parsing files with JSX, so we'll need to add the JSX plugin.

    [05:41 - 05:57] Additionally, all of these files are modules, so we'll also need to set the source type to module. Now that we have an AST, we can use the traverse package to find JSX opening element nodes.

    [05:58 - 06:12] Within it, we'll add a conditional to check that this node's name is a JSX identifier node and that the identifier's name is button.

    [06:13 - 06:30] You'll start to notice the benefit of TypeScript, which provides type-ahead and type safety while traversing these nodes. Just as in the previous lesson, the script is using Babel parser to convert the file contents into an AST.

    [06:31 - 06:44] The script is now doing something similar to what we just did in AST Explorer. It's traversing the resulting AST and finding JSX opening element nodes where the name is a JSX identifier node whose name property has the value button.

    [06:45 - 07:02] Now the script will print any button element nodes anywhere in the files. Now that we've found the files and have found the button elements, the final thing left to do is look at the specific props and styles.
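
The parse-and-traverse step might look roughly like the sketch below. It assumes @babel/parser and @babel/traverse are installed (with types available for traverse), and the component source is a hypothetical stand-in for one file's contents. The parse options (sourceType "module" and the jsx plugin) and the JSXOpeningElement visitor are the pieces described above. The fallback to `.default` is an addition for module setups where the default import resolves to the whole exports object.

```typescript
import { parse } from "@babel/parser";
import traverseModule from "@babel/traverse";

// Depending on the module setup, the traverse function is either the default
// import itself or sits under `.default`; this handles both cases.
const traverse: typeof traverseModule =
  (traverseModule as any).default ?? traverseModule;

// Hypothetical component source, standing in for one file's contents.
const contents = `
  export const Toolbar = () => (
    <div>
      <button type="button">Save</button>
      <a href="/home">Home</a>
      <button type="submit">Send</button>
    </div>
  );
`;

// JSX needs the "jsx" plugin, and import/export syntax needs sourceType "module".
const ast = parse(contents, { sourceType: "module", plugins: ["jsx"] });

let buttonCount = 0;
traverse(ast, {
  JSXOpeningElement(path) {
    const elementName = path.node.name;
    // Only plain identifiers named "button" count; member expressions such as
    // <Foo.Bar> and namespaced names are skipped.
    if (elementName.type === "JSXIdentifier" && elementName.name === "button") {
      buttonCount += 1;
    }
  },
});

console.log(buttonCount);
```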

    [07:03 - 07:13] As mentioned, all the props are in an array on the attributes property on the JSX opening element node. This script can aggregate this data to make it easier to understand.

    [07:14 - 07:32] Let's start by creating a new global object that will keep track of all the props and values used, while also defining a type for it: a record, with each key being a string and each value being an array of strings. Each key will represent a prop, and the value will represent all values used with that prop.

    [07:33 - 07:44] Now we can update the script to loop through each of the attributes. For the purpose of this script, we only care about JSX attribute nodes.

    [07:45 - 07:59] There can also be JSX spread attribute nodes, but that's not used with buttons in this codebase. Additionally, we only expect a name node that's a JSX identifier, so we're also checking that.

    [08:00 - 08:20] Now we can get the prop's name, which is attribute.name.name. Now we know the prop's name, but we need to get its value.

    [08:21 - 08:41] There are several different types of values that a JSX attribute can have. Again, for the purpose of this script, we only care about the values of string literals.

    [08:42 - 08:54] Otherwise, we can use the type of node. If the attribute's value is a string literal, then we use that string literal's value.

    [08:55 - 09:11] Otherwise, for all other types, we just set the value to the type of node. If we wanted to know the values for all of the different attribute values, we could add conditionals for each type of value node, and logic to unwrap the value from each of those nodes.

    [09:12 - 09:24] Now that we have the prop name and the value, we can add this to the global object and keep track of it across all files. If the prop name already exists in the object, then we'll push the new value into the existing array.

    [09:25 - 09:53] Otherwise, if it doesn't exist, we'll instantiate a new array with only this value. Now we can add some logic to print all of the props and values from this global object.
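
The aggregation logic can be tried in isolation against hand-built nodes. In this sketch, the Attribute type is a trimmed stand-in for Babel's attribute nodes, and recordProps and the sample attributes are hypothetical names and data. It also skips attributes with no value, since the value is null for shorthand props such as `<button disabled>`.

```typescript
// Trimmed stand-ins for the Babel node types this step touches (assumed shapes).
type AttributeValue =
  | { type: "StringLiteral"; value: string }
  | { type: "JSXExpressionContainer" };
type Attribute =
  | {
      type: "JSXAttribute";
      name: { type: "JSXIdentifier"; name: string } | { type: "JSXNamespacedName" };
      value: AttributeValue | null;
    }
  | { type: "JSXSpreadAttribute" };

// Each key is a prop name; each value lists every value seen for that prop.
const propsUsed: Record<string, string[]> = {};

function recordProps(attributes: Attribute[]): void {
  for (const attribute of attributes) {
    // Only plain attributes with an identifier name are considered.
    if (attribute.type !== "JSXAttribute" || attribute.name.type !== "JSXIdentifier") {
      continue;
    }
    const propName = attribute.name.name;
    // The value is null for shorthand props such as <button disabled>.
    if (!attribute.value) continue;
    // Use the literal string when available, otherwise fall back to the node type.
    const propValue =
      attribute.value.type === "StringLiteral" ? attribute.value.value : attribute.value.type;

    const existing = propsUsed[propName];
    if (existing) {
      existing.push(propValue);
    } else {
      propsUsed[propName] = [propValue];
    }
  }
}

// Two hypothetical button elements' attributes.
recordProps([
  { type: "JSXAttribute", name: { type: "JSXIdentifier", name: "type" }, value: { type: "StringLiteral", value: "button" } },
  { type: "JSXAttribute", name: { type: "JSXIdentifier", name: "onClick" }, value: { type: "JSXExpressionContainer" } },
]);
recordProps([
  { type: "JSXAttribute", name: { type: "JSXIdentifier", name: "type" }, value: { type: "StringLiteral", value: "submit" } },
  { type: "JSXAttribute", name: { type: "JSXIdentifier", name: "disabled" }, value: null },
  { type: "JSXSpreadAttribute" },
]);

console.log(propsUsed);
```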

    [09:54 - 10:20] We'll quickly print these values and provide some formatting using the console methods. This will loop through all of the props used with all button elements.

    [10:21 - 10:37] Then, for each prop, it'll print how many times that prop was used. Finally, it'll print all of the values used with that prop. Running the script should now produce output.
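
One possible shape for the reporting step is sketched below. The formatReport helper, its output format, and the sample data are all hypothetical; the lesson's script prints directly with console methods, whereas this version builds the lines first so they are easy to inspect.

```typescript
// Build one heading line per prop (with its usage count), followed by one
// indented line per value used with that prop.
function formatReport(props: Record<string, string[]>): string[] {
  const lines: string[] = [];
  for (const [propName, values] of Object.entries(props)) {
    lines.push(`${propName}: used ${values.length} time(s)`);
    for (const value of values) {
      lines.push(`  ${value}`);
    }
  }
  return lines;
}

// Hypothetical aggregate, in the same Record<string, string[]> shape as the audit data.
const sampleProps: Record<string, string[]> = {
  type: ["button", "submit", "button"],
  onClick: ["JSXExpressionContainer"],
};

const reportLines = formatReport(sampleProps);
for (const line of reportLines) {
  console.log(line);
}
```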

    [10:38 - 10:58] Unfortunately, it looks like there were some type errors, so let's jump back into our editor and fix those. This type error is saying that the value node is possibly null. Let's add a conditional to check for this and return early.

    [10:59 - 11:13] This conditional fixes these type errors because it's no longer possible for the value to be null after this early return. Now we can rerun the audit script to see the output.

    [11:14 - 11:27] We can see that the output lists how many times each prop is used on a button element and every value passed to that prop. For example, the type prop was used eight times, with five of those being passed the value button and three of those passed the value submit.

    [11:28 - 11:36] There are different totals since not all props are always used with button elements. This audit provides the data we need to see how the button element is used today.

    [11:37 - 11:45] We can see that there's a type prop that needs to support both button and submit values. We can see there's an onClick prop, and all buttons pass a click handler function.

    [11:46 - 11:52] We also have a className prop, which contains multiple class names. Many have a button class name along with a button--variant class name.

    [11:53 - 12:01] However, there are a few with class names like landing page button and nav logo button. When we look at these usages in the codebase, we can see that they are conceptually different.

    [12:02 - 12:15] Therefore, we don't want to convert those to this new button component and want to ignore them from the audit. Since we don't want to touch buttons without the button class name, let's update the script to only consider button elements that also have that class name.

    [12:16 - 12:28] We can do this by looking through the attributes, finding the className prop, and checking that the className prop's value includes the button class name.

    [12:29 - 13:21] As we looked at earlier, this is looping through all of the node's attributes. Similarly, we're only handling JSX attribute nodes with a name node that is a JSX identifier, where that node's name is className and that node's value is a string literal.

    [13:22 - 13:33] And that string literal, split on spaces, includes the class name button. And finally, if there's no button class name, then we can return early since we don't want to consider it in the audit.
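
The className check can be sketched as a small predicate over hand-built nodes. As before, the Attribute type is a trimmed stand-in for Babel's attribute nodes, and hasButtonClass, classNameAttribute, and the class strings are hypothetical. Splitting on spaces matters here: a plain substring check would also match a class name that merely contains the word button.

```typescript
// Trimmed stand-ins for the Babel node types this step touches (assumed shapes).
type AttributeValue =
  | { type: "StringLiteral"; value: string }
  | { type: "JSXExpressionContainer" };
type Attribute =
  | {
      type: "JSXAttribute";
      name: { type: "JSXIdentifier"; name: string } | { type: "JSXNamespacedName" };
      value: AttributeValue | null;
    }
  | { type: "JSXSpreadAttribute" };

// True when some className attribute is a string literal whose
// space-separated class names include exactly "button".
function hasButtonClass(attributes: Attribute[]): boolean {
  return attributes.some(
    (attribute) =>
      attribute.type === "JSXAttribute" &&
      attribute.name.type === "JSXIdentifier" &&
      attribute.name.name === "className" &&
      attribute.value !== null &&
      attribute.value.type === "StringLiteral" &&
      attribute.value.value.split(" ").includes("button")
  );
}

// Helper for building a className attribute node in this demo.
function classNameAttribute(value: string): Attribute {
  return {
    type: "JSXAttribute",
    name: { type: "JSXIdentifier", name: "className" },
    value: { type: "StringLiteral", value },
  };
}

// Hypothetical class strings for illustration.
const sharedButton = hasButtonClass([classNameAttribute("button button--primary")]);
const landingButton = hasButtonClass([classNameAttribute("landing-page-button")]);
const noClassName = hasButtonClass([{ type: "JSXSpreadAttribute" }]);
console.log(sharedButton, landingButton, noClassName);
```

Inside the traversal, a false result from a check like this is where the early return would go, so the element's props never reach the aggregate.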

    [13:34 - 13:46] We can switch back to the terminal and run the script again. We can see the output is similar, but now it excludes any button elements that don't have a button class name.

    [13:47 - 13:57] Based on this data, we have a perfect spec for the new button component based on the existing usages. To summarize, we faced a problem where every usage of a button element needed to be updated.

    [13:58 - 14:06] As a first step, it was decided to create a single button component that could be reused. In order to create this component, the exact props and functionality needed to be known.

    [14:07 - 14:15] In a small code base, it would be straightforward to manually open a bunch of files, search for button elements, and see how they're used. In a large code base, this isn't realistic.

    [14:16 - 14:29] Instead, we automated this process with a custom script to audit our code. It read all of the relevant files, converted the source code into an AST, then traversed that AST to find button elements, and finally logged the props they used.

    [14:30 - 14:38] From this data, we can create an API that will support all of the existing usages. Next, we will implement the button component with this API.

    [14:39 - 14:51] Then we can replace all of the existing button elements with this component. This pattern of doing a code audit can be used for a wide variety of tasks, usually to answer questions like, "How often do we use X, or what options do we use with Y?"

    [14:52 - 14:59] If you have a larger code base, you could try answering a question like this. If not, try cloning a large open source project from GitHub.

    [15:00 - 15:11] Then, adapt the script to work in that codebase and answer a slightly different question, such as, "How many times is a function, class, or component used?" "What parameters or prop values are usually passed to that function?"

    [15:12 - 15:18] "How many times is something imported?" "What are the most common string literals?" "What is the average length of all the variable names in the codebase?"

    [15:19 - 15:27] "How many variables are there on average in each file?" Or feel free to come up with a different idea. The next lesson will add the button component based on our audit.