Reviewing Agent Code
Modern coding agents can churn out an astonishing amount of code in a very short time. This can be overwhelming, and the urge to just accept the code without reading it is real. We want to share a couple of tricks for keeping up, because we strongly believe that AI-written code still needs a human review.
TLDR - too long, didn't read
- Get a UI that you can use to compare the git changes between the first AI commit and the current working directory.
- Go through all changes and, if you don't like something, add a comment.
- Tell your agent to find the comments, group them into tasks and work on them.
- Repeat until you are happy with the changes.
Let's start with an example
To have a real use case, I gave my agent of choice (GitHub Copilot with GPT-5.4) the following prompt:
I want a very small todo app. Do it in the following steps and make sure to commit after every step
- Init git and add a readme
- init an npm project and setup the vite/react/typescript via the vite cli
- Setup tailwind
- Build a first version of the todo app
- Add persistence via localstorage
We don't need tests for now.
This gives us the following commit history:
Figure: A todo app in only 5 commits!
As you can see, we now have a small todo app in 5 commits. For me, letting the agent create multiple commits makes a lot of sense, since it can do a large chunk of work, while you are still able to understand the thought process of the agent. You don't even have to tell the agent the concrete steps. As long as you tell it to split up its work into smaller chunks, you should be able to get a similar result.
The git UI
Now, on to the review part. For me, the most important thing is being able to SEE, at a high level, what the agent did. For this, I'm using the Git Graph VSCode extension. It opens a large graph that gives you a nice view of the project's history. When you select a commit or a range of commits (with cmd + click), you can see the changed files in a nice little file tree:
Figure: See the exact changes of a file.
When you click on one of the files, you get the diff view that shows what this specific commit did in the file.
Now, you could go through each of the files, look at its diff, and write a large prompt containing all the changes you want. One problem is that you can't really interact with the code, as the two versions (before and after) don't exist in your working directory. So TypeScript is not available, "Go to definition" does not work properly, and you can't make any quick inline changes like fixing a comment.
A better process
For me, the perfect solution to these problems is the comparison between a commit and the working tree. To accomplish this with Git Graph, you need at least one unstaged change in any file. Normally, I just temporarily add a letter to the README.md file. Git Graph then lets you compare any commit to your current file system: first click on the commit BEFORE the commits to be reviewed, then cmd + click on the top row. This should yield this view:
Figure: This shows all changes that happened after commit a20a4179.
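Outside of Git Graph, the same commit-to-working-tree comparison can be listed on the command line with `git diff --name-status a20a4179` (untracked files do not show up there). The following is a minimal sketch of how that output could be turned into a review checklist. The names `ChangedFile` and `parseNameStatus` and the sample output are hypothetical, not something the extension or the agent provides.

```typescript
// Hypothetical helper: turns the text printed by
// `git diff --name-status <commit>` into a list of files to review.
type ChangeKind = "added" | "modified" | "deleted" | "renamed" | "other";

interface ChangedFile {
  kind: ChangeKind;
  path: string;
}

function parseNameStatus(output: string): ChangedFile[] {
  const kinds: Record<string, ChangeKind> = {
    A: "added",
    M: "modified",
    D: "deleted",
    R: "renamed",
  };
  return output
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => {
      // Fields are tab-separated: a status code, then one or two paths.
      const [status = "", ...paths] = line.split("\t");
      // For renames git prints the old and the new path; keep the new one.
      const path = paths[paths.length - 1] ?? "";
      // Rename codes carry a similarity score (e.g. R100), so only the
      // first character is used for the lookup.
      return { kind: kinds[status.charAt(0)] ?? "other", path };
    });
}

// Made-up sample output, only for illustration.
const sampleOutput = "M\tREADME.md\nA\tsrc/App.tsx\nR100\tsrc/a.ts\tsrc/b.ts\n";
for (const file of parseNameStatus(sampleOutput)) {
  console.log(`${file.kind}: ${file.path}`);
}
```

The parsing is kept separate from running git, so the list can be built from saved output as well.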
Now you can go through the changed files, use "Go to definition", hover over variables to see their types, and change things directly.
My favorite thing has become leaving comments for the agent:
Figure: We left a regular pull request comment, directly in the code.
Now we go through all files changed since the initial commit, look at the changes the agent made, and leave comments where we have remarks. We can even leave questions that the agent should answer, like: "I dislike that we use a useEffect here, can we do it differently?"
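To make this concrete, here is a sketch of what such review comments might look like in a TypeScript file. The function, the `Todo` type, and the storage key are hypothetical and not taken from the generated app. Only the "AGENT-TODO" marker matches what the final prompt searches for.

```typescript
// Hypothetical shape of a todo item, for illustration only.
interface Todo {
  id: number;
  title: string;
  done: boolean;
}

// Minimal read-only storage interface. The browser's localStorage
// satisfies it, and so does a plain object in a script.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// AGENT-TODO: JSON.parse can throw on corrupted data. Can we fall back to an empty list instead?
// AGENT-TODO: Why is the storage key hard-coded here? Please move it to a shared constant.
function loadTodos(store: KeyValueStore): Todo[] {
  const raw = store.getItem("todos");
  return raw ? (JSON.parse(raw) as Todo[]) : [];
}
```

The first comment is a change request and the second one is a question. Both sit next to the code they refer to, so the agent gets the context for free.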
Once we have added the comments, all that's left is to write one last prompt:
Go through the code and find comments marked with "AGENT-TODO". Scan them, group them by topic, answer any questions (with suggestions if possible) and propose an implementation plan with sub steps that you can then fix in multiple commits.
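The agent does the searching on its own, but a small check can help confirm that no marker is left behind once it reports that it is done. Below is a minimal sketch, assuming the file contents are passed in as an in-memory map (reading files from disk is left out). The names `findAgentTodos` and `groupByFile` are hypothetical.

```typescript
// One review comment found in the code base.
interface AgentTodo {
  file: string;
  line: number; // 1-based line number
  text: string;
}

const MARKER = "AGENT-TODO";

// Scans a map of file path -> file content for lines containing the marker.
function findAgentTodos(files: Record<string, string>): AgentTodo[] {
  const found: AgentTodo[] = [];
  for (const [file, content] of Object.entries(files)) {
    content.split("\n").forEach((lineText, index) => {
      const position = lineText.indexOf(MARKER);
      if (position === -1) return;
      // Drop the marker and an optional colon, keep the rest of the line.
      const text = lineText
        .slice(position + MARKER.length)
        .replace(/^:?\s*/, "")
        .trim();
      found.push({ file, line: index + 1, text });
    });
  }
  return found;
}

// Groups the comments by file so they can be checked off one file at a time.
function groupByFile(todos: AgentTodo[]): Map<string, AgentTodo[]> {
  const groups = new Map<string, AgentTodo[]>();
  for (const todo of todos) {
    const group = groups.get(todo.file) ?? [];
    group.push(todo);
    groups.set(todo.file, group);
  }
  return groups;
}
```

If `findAgentTodos` returns an empty list after the agent's commits, every comment has at least been removed. Whether each one was addressed well still needs a look at the diff.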
A regular code review with an agentic developer
In the end, my goal was to find a process that combines the best of both worlds: using the agent to write the code, while still being able to control and shape the outcome. By adding the comments directly in the code and by being able to make small changes myself, I can skip a much slower pull request process, especially because the VSCode UI is much better for code reviews than something like GitHub.
If you now want to go one step further, you can hand the created tasks to your task management system of choice and let a ralph loop handle the implementation. But that is a topic for a different article.