Building an AI Code Review Bot

Prerequisites

Before we start, you’ll need:

GitHub account with repo access
OpenAI API key (and the willingness to watch your credits evaporate)
Node.js 18+
Basic GitHub Actions knowledge
A mass-produced motivational desk poster (“Teamwork Makes The Dream Work” or similar)

What We’re Building

A GitHub bot that automatically reviews pull requests. It catches the obvious stuff, enforces style, and gives your human reviewers more time for the things that actually matter. You know, architecture discussions, bikeshedding over variable names, the important work.

I’ve wanted something like this for ages. Every team I’ve worked on has the same problem: PRs sit in a queue while reviewers context-switch away from their own work, and by the time anyone looks, half the comments are “missing semicolon” or “unused import.” Tedious for everyone involved.

So let’s build a bot that handles the boring bits.

The Approach

Create GitHub App
Build webhook handler
Parse PR diffs
Generate reviews with LLM
Post comments

Nothing too wild. The whole thing is surprisingly straightforward once you see it laid out.

Step 1: Create GitHub App

Go to GitHub, then Settings, then Developer Settings, then GitHub Apps
Create new app with permissions:
- Pull requests: Read & Write
- Contents: Read
Subscribe to events: Pull request, Pull request review
Generate private key and save

Keep that private key somewhere safe. Not in your repo. Not in a Slack message. You know who you are.
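One way to keep the key out of the repo is to read it from the environment and fail fast when something is missing. The following is a minimal sketch, not part of the original setup: loadConfig and AppConfig are hypothetical names, and the newline handling assumes your host stores the multi-line PEM key with literal \n sequences, which many (but not all) do.

```typescript
// Hypothetical helper: loadConfig and AppConfig are illustrative names, not from the post.
interface AppConfig {
  appId: string;
  privateKey: string;
  webhookSecret: string;
}

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = ['GITHUB_APP_ID', 'GITHUB_PRIVATE_KEY', 'WEBHOOK_SECRET'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Fail at startup instead of on the first webhook delivery
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    appId: env.GITHUB_APP_ID as string,
    // Assumption: the key may be stored on one line with literal "\n"; restore real newlines
    privateKey: (env.GITHUB_PRIVATE_KEY as string).replace(/\\n/g, '\n'),
    webhookSecret: env.WEBHOOK_SECRET as string,
  };
}
```

Calling loadConfig(process.env) once at startup means a missing secret shows up as a clear error message rather than a confusing authentication failure later.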


Step 2: Set Up Project

mkdir ai-code-reviewer && cd ai-code-reviewer
npm init -y
npm install @octokit/app @octokit/rest @octokit/webhooks openai express

Standard fare. Octokit for GitHub, OpenAI for the brains, Express because it’s there and it works.

// src/index.ts
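import express from 'express';
import { App } from '@octokit/app';
import { Octokit } from '@octokit/rest';
import { createNodeMiddleware } from '@octokit/webhooks';
import { handlePullRequest } from './handler'; // defined in src/handler.ts (Step 6)

const app = new App({
  appId: process.env.GITHUB_APP_ID!,
  privateKey: process.env.GITHUB_PRIVATE_KEY!,
  webhooks: {
    secret: process.env.WEBHOOK_SECRET!,
  },
  // Use the REST-enabled Octokit so installation clients expose octokit.pulls and octokit.repos
  Octokit,
});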

app.webhooks.on('pull_request.opened', handlePullRequest);
app.webhooks.on('pull_request.synchronize', handlePullRequest);

const server = express();
server.use(createNodeMiddleware(app.webhooks));
server.listen(3000);

We’re listening for two events: when a PR is opened and when new commits are pushed to it. The synchronize event is the one people forget about. Without it, your bot only reviews the initial push and then goes silent while the author frantically force-pushes fixes.
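Listening to synchronize also means the handler can run more than once for the same commit, for example when a webhook delivery is redelivered. A small guard keyed on the head SHA is one way to avoid paying for the same review twice. This is only a sketch: ReviewedCommits is a hypothetical in-memory helper, and a deployment with more than one instance would need shared storage instead.

```typescript
// Hypothetical in-memory guard; not part of the original post.
class ReviewedCommits {
  private readonly seen = new Set<string>();
  private readonly maxEntries: number;

  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
  }

  /** Returns true the first time a (repo, PR, sha) combination is seen, false afterwards. */
  shouldReview(repoFullName: string, pullNumber: number, headSha: string): boolean {
    const key = `${repoFullName}#${pullNumber}@${headSha}`;
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    // Sets iterate in insertion order, so the first value is the oldest entry
    if (this.seen.size > this.maxEntries) {
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return true;
  }
}
```

In the handler, a check such as shouldReview(repository.full_name, pull_request.number, pull_request.head.sha) before calling the model would skip repeated deliveries for a commit that has already been reviewed.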

Step 3: Fetch PR Diff

// src/github.ts
import { Octokit } from '@octokit/rest';

export interface FileDiff {
  filename: string;
  patch: string;
  additions: number;
  deletions: number;
}

export async function getPRDiff(
  octokit: Octokit,
  owner: string,
  repo: string,
  pullNumber: number
): Promise<FileDiff[]> {
  const { data: files } = await octokit.pulls.listFiles({
    owner,
    repo,
    pull_number: pullNumber,
  });

  return files
    .filter((file) => file.patch)
    .map((file) => ({
      filename: file.filename,
      patch: file.patch!,
      additions: file.additions,
      deletions: file.deletions,
    }));
}

export async function getFileContent(
  octokit: Octokit,
  owner: string,
  repo: string,
  path: string,
  ref: string
): Promise<string> {
  const { data } = await octokit.repos.getContent({
    owner,
    repo,
    path,
    ref,
  });

  if ('content' in data) {
    return Buffer.from(data.content, 'base64').toString('utf-8');
  }

  throw new Error('Not a file');
}

The .filter((file) => file.patch) is doing quiet but critical work here. Binary files don’t have patches, and sending a PNG diff to an LLM is a fantastic way to burn tokens on absolute nonsense.
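The patch string is also what tells you which lines can receive a review comment later: as far as I know, GitHub rejects inline comments on lines that are not part of the diff. Below is a minimal sketch of reading the hunk headers to collect the new-file line numbers a patch covers. commentableLines is a hypothetical helper name, and it only considers the right-hand (new) side of the diff.

```typescript
/** Collects new-file line numbers (added or context lines) covered by a unified-diff patch. */
function commentableLines(patch: string): Set<number> {
  const lines = new Set<number>();
  let newLine = 0;
  let inHunk = false;

  for (const row of patch.split('\n')) {
    // Hunk header, e.g. "@@ -10,2 +12,3 @@": the number after "+" is the first new-file line
    const header = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(row);
    if (header) {
      newLine = Number(header[1]);
      inHunk = true;
      continue;
    }
    if (!inHunk) continue;

    if (row.startsWith('+') || row.startsWith(' ')) {
      lines.add(newLine);
      newLine += 1;
    }
    // Rows starting with "-" exist only in the old file, and
    // "\ No newline at end of file" markers are ignored as well.
  }
  return lines;
}
```

With this in place, a comment whose line is not in commentableLines(file.patch) could be dropped or folded into the review body instead of failing the whole review request.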

Step 4: Build the Reviewer

This is where it gets fun. We’re handing the diff to GPT-4o and asking it to be a code reviewer. The system prompt is doing most of the heavy lifting.

// src/reviewer.ts
import OpenAI from 'openai';
import { FileDiff } from './github'; // FileDiff needs to be exported from src/github.ts

const openai = new OpenAI();

export interface ReviewComment {
  path: string;
  line: number;
  body: string;
  severity: 'error' | 'warning' | 'suggestion';
}

const SYSTEM_PROMPT = `You are a code reviewer. Review the following diff and identify issues.

For each issue found, respond with JSON:
{
  "comments": [
    {
      "path": "filename",
      "line": line_number,
      "body": "description of issue and suggestion",
      "severity": "error|warning|suggestion"
    }
  ]
}

Focus on:
- Bugs and logic errors
- Security vulnerabilities
- Performance issues
- Code style problems
- Missing error handling

Be specific. Include code examples for fixes.
If the code looks good, return an empty comments array.`;

export async function reviewDiff(files: FileDiff[]): Promise<ReviewComment[]> {
  const diffText = files
    .map((f) => `### ${f.filename}\n\`\`\`diff\n${f.patch}\n\`\`\``)
    .join('\n\n');

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      { role: 'user', content: diffText },
    ],
    response_format: { type: 'json_object' },
    temperature: 0,
  });

  const result = JSON.parse(response.choices[0].message.content!);
  return result.comments || [];
}

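Setting temperature: 0 is important. You want consistent, pedantic reviews, not creative writing. A temperature of 0 makes the output as repeatable as the API allows, although it is still not guaranteed to be strictly deterministic. Nobody needs their code reviewer having an artistic moment.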
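Even with JSON mode and a low temperature, the response can be empty, cut off, or shaped slightly differently from what the prompt asked for, so it may be worth validating it before trusting it. The following is a sketch under that assumption: parseReviewComments and isReviewComment are hypothetical helpers, and the ReviewComment shape is repeated here only to keep the example self-contained.

```typescript
type Severity = 'error' | 'warning' | 'suggestion';

interface ReviewComment {
  path: string;
  line: number;
  body: string;
  severity: Severity;
}

const SEVERITIES: readonly string[] = ['error', 'warning', 'suggestion'];

// Type guard: accepts only objects that match the shape requested in the system prompt
function isReviewComment(value: unknown): value is ReviewComment {
  if (typeof value !== 'object' || value === null) return false;
  const { path, line, body, severity } = value as Record<string, unknown>;
  return (
    typeof path === 'string' &&
    typeof line === 'number' && Number.isInteger(line) && line > 0 &&
    typeof body === 'string' &&
    typeof severity === 'string' && SEVERITIES.includes(severity)
  );
}

// Returns an empty list instead of throwing when the model output is unusable
function parseReviewComments(raw: string | null | undefined): ReviewComment[] {
  if (!raw) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    return [];
  }
  if (typeof parsed !== 'object' || parsed === null) return [];
  const comments = (parsed as { comments?: unknown }).comments;
  return Array.isArray(comments) ? comments.filter(isReviewComment) : [];
}
```

In reviewDiff, this could stand in for the bare JSON.parse call, so a malformed response produces zero comments rather than an unhandled exception in the webhook handler.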


Step 5: Post Review Comments

// src/post-review.ts
import { Octokit } from '@octokit/rest';
import { ReviewComment } from './reviewer';

export async function postReview(
  octokit: Octokit,
  owner: string,
  repo: string,
  pullNumber: number,
  comments: ReviewComment[],
  commitId: string
): Promise<void> {
  if (comments.length === 0) {
    await octokit.pulls.createReview({
      owner,
      repo,
      pull_number: pullNumber,
      event: 'APPROVE',
      body: '✅ No issues found. LGTM!',
    });
    return;
  }

  const hasErrors = comments.some((c) => c.severity === 'error');

  await octokit.pulls.createReview({
    owner,
    repo,
    pull_number: pullNumber,
    commit_id: commitId,
    event: hasErrors ? 'REQUEST_CHANGES' : 'COMMENT',
    body: `AI Review found ${comments.length} issue(s)`,
    comments: comments.map((c) => ({
      path: c.path,
      line: c.line,
      body: formatComment(c),
    })),
  });
}

function formatComment(comment: ReviewComment): string {
  const icons = {
    error: '🔴',
    warning: '🟡',
    suggestion: '💡',
  };

  return `${icons[comment.severity]} **${comment.severity.toUpperCase()}**\n\n${comment.body}`;
}

The REQUEST_CHANGES vs COMMENT distinction matters. If the bot only finds suggestions and warnings, it comments but doesn’t block the merge. Actual errors? It requests changes, and how firmly that blocks a merge depends on your branch protection rules. This way your team can still merge when the bot is being overly cautious, but genuine problems get flagged properly.

Step 6: Wire It Together

// src/handler.ts
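import { EmitterWebhookEvent } from '@octokit/webhooks';
import { Octokit } from '@octokit/rest';
import { getPRDiff } from './github';
import { reviewDiff } from './reviewer';
import { postReview } from './post-review';

// @octokit/app attaches an installation-authenticated `octokit` to each webhook event,
// so the handler does not need the `app` instance. This assumes the App was constructed
// with the Octokit class from @octokit/rest, so that octokit.pulls is available.
export async function handlePullRequest(
  event: EmitterWebhookEvent<'pull_request'> & { octokit: Octokit }
): Promise<void> {
  const { octokit, payload } = event;
  const { pull_request, repository, installation } = payload;

  if (!installation) return;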

  console.log(`Reviewing PR #${pull_request.number}`);

  const files = await getPRDiff(
    octokit,
    repository.owner.login,
    repository.name,
    pull_request.number
  );

  const relevantFiles = files.filter(
    (f) => f.filename.endsWith('.ts') ||
           f.filename.endsWith('.tsx') ||
           f.filename.endsWith('.js')
  );

  if (relevantFiles.length === 0) {
    console.log('No reviewable files');
    return;
  }

  const comments = await reviewDiff(relevantFiles);

  await postReview(
    octokit,
    repository.owner.login,
    repository.name,
    pull_request.number,
    comments,
    pull_request.head.sha
  );

  console.log(`Posted ${comments.length} comments`);
}

Notice we’re filtering to .ts, .tsx, and .js files only. You could expand this, but I’d recommend starting narrow. Reviewing markdown files or config YAML with an LLM is like hiring a Michelin-star chef to microwave your leftovers.
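If you do expand the filter later, pulling it into its own function keeps the handler readable and gives you one place to exclude files that match the extension but are not worth reviewing. A minimal sketch follows; isReviewable is a hypothetical helper, and the skip lists (.d.ts, .min.js, node_modules/, dist/) are my assumptions about what is usually generated, not something from the original setup.

```typescript
const REVIEWABLE_EXTENSIONS = ['.ts', '.tsx', '.js'];

// Assumed skip lists: adjust to whatever your project generates or vendors
const SKIPPED_SUFFIXES = ['.d.ts', '.min.js'];
const SKIPPED_DIRECTORIES = ['node_modules/', 'dist/'];

function isReviewable(filename: string): boolean {
  const name = filename.toLowerCase();
  if (SKIPPED_SUFFIXES.some((suffix) => name.endsWith(suffix))) return false;
  // Match the directory at the repo root or anywhere deeper in the path
  if (SKIPPED_DIRECTORIES.some((dir) => name.startsWith(dir) || name.includes(`/${dir}`))) {
    return false;
  }
  return REVIEWABLE_EXTENSIONS.some((ext) => name.endsWith(ext));
}
```

The handler’s filter then becomes files.filter((f) => isReviewable(f.filename)).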

Step 7: GitHub Actions Alternative

If running a dedicated app feels like overkill (and honestly, for smaller teams it probably is), you can do the whole thing as a GitHub Action instead:

# .github/workflows/ai-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
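  review:
    runs-on: ubuntu-latest
    # GITHUB_TOKEN needs write access to pull requests in order to post a review
    permissions:
      contents: read
      pull-requests: write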
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > diff.txt

      - name: Run AI review
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: npx ai-reviewer review --diff diff.txt --pr ${{ github.event.pull_request.number }}

Same idea, less infrastructure. The tradeoff is you lose the real-time webhook feel and it runs on Actions minutes instead. Pick your poison.
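One practical difference in the Actions route: the webhook version gets one patch per file from the API, while git diff hands you a single diff.txt covering everything. If you want to reuse the per-file logic, the diff has to be split first. Here is a rough sketch of that step. splitUnifiedDiff is a hypothetical helper, and it assumes plain git diff output with unquoted paths, so filenames with unusual characters would need more care.

```typescript
interface FilePatch {
  filename: string;
  patch: string;
}

function splitUnifiedDiff(diffText: string): FilePatch[] {
  const files: FilePatch[] = [];
  // Each file section in `git diff` output starts with a "diff --git" line
  const sections = diffText.split(/^diff --git /m).slice(1);

  for (const section of sections) {
    const lines = section.split('\n');
    // "+++ b/<path>" names the new file; deleted files show "+++ /dev/null"
    const target = lines.find((l) => l.startsWith('+++ '));
    if (!target || target === '+++ /dev/null') continue;
    const filename = target.replace(/^\+\+\+ (?:b\/)?/, '');

    // Binary or mode-only changes have no hunks, so there is nothing to review
    const firstHunk = lines.findIndex((l) => l.startsWith('@@'));
    if (firstHunk === -1) continue;

    files.push({ filename, patch: lines.slice(firstHunk).join('\n').trimEnd() });
  }
  return files;
}
```

The result has the same filename and patch fields the reviewer prompt is built from, so the rest of the pipeline can stay mostly unchanged.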

Step 8: Rate Limiting and Costs

This bit isn’t glamorous but it’ll save you from a nasty surprise on your OpenAI bill.

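// Illustrative per-1K-token prices in USD. Model pricing changes, so check the current rates for your model.
const COST_PER_1K_INPUT = 0.01;
const COST_PER_1K_OUTPUT = 0.03;
const MAX_DIFF_LINES = 500;

function estimateCost(diffText: string): number {
  // Rough heuristic: about 4 characters per token
  const inputTokens = diffText.length / 4;
  // Assumed typical response size; adjust after looking at real usage
  const outputTokens = 500;

  return (inputTokens / 1000 * COST_PER_1K_INPUT) +
         (outputTokens / 1000 * COST_PER_1K_OUTPUT);
}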

function shouldReview(files: FileDiff[]): boolean {
  const totalLines = files.reduce((sum, f) => sum + f.additions + f.deletions, 0);
  return totalLines <= MAX_DIFF_LINES;
}

That 500-line cap is generous but sensible. Massive PRs should be broken up anyway. If someone opens a 2,000-line PR, the bot politely declining to review it might be the best feedback it could give.
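The code above caps the size of a single review but does not limit how much the bot spends over time. A simple daily budget is one way to add that. The sketch below is an assumption about how you might do it, not part of the original bot: DailyBudget is a hypothetical class, it keeps its state in memory, and it resets at midnight UTC.

```typescript
// Hypothetical in-memory spend guard; state is lost on restart
class DailyBudget {
  private spent = 0;
  private day: string;
  private readonly limitUsd: number;
  private readonly now: () => Date;

  // `now` is injectable so the date rollover can be tested
  constructor(limitUsd: number, now: () => Date = () => new Date()) {
    this.limitUsd = limitUsd;
    this.now = now;
    this.day = this.today();
  }

  private today(): string {
    return this.now().toISOString().slice(0, 10); // UTC date, e.g. "2024-01-01"
  }

  /** Records the spend and returns true if it fits today's budget; otherwise returns false. */
  tryReserve(estimatedUsd: number): boolean {
    const today = this.today();
    if (today !== this.day) {
      this.day = today;
      this.spent = 0;
    }
    if (this.spent + estimatedUsd > this.limitUsd) return false;
    this.spent += estimatedUsd;
    return true;
  }
}
```

Paired with estimateCost, the handler could call budget.tryReserve(estimateCost(diffText)) and skip the review (or post a short note) once the day’s allowance is used up.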


The Result

What you end up with:

Automatic first-pass code review on every PR
Common issues caught before a human even looks
Consistent feedback across the whole team
Configurable severity levels so you can tune the noise
Scales with your team without adding reviewer workload, as long as you keep an eye on API costs
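The severity field already exists on every comment, so tuning the noise can be as small as a threshold applied before posting. A minimal sketch, assuming you rank the three severities from the reviewer prompt in the obvious order; filterBySeverity and SEVERITY_RANK are hypothetical names.

```typescript
type Severity = 'error' | 'warning' | 'suggestion';

interface ReviewComment {
  path: string;
  line: number;
  body: string;
  severity: Severity;
}

// Higher number means more serious
const SEVERITY_RANK: Record<Severity, number> = {
  suggestion: 0,
  warning: 1,
  error: 2,
};

/** Keeps only comments at or above the given severity. */
function filterBySeverity(comments: ReviewComment[], minSeverity: Severity): ReviewComment[] {
  return comments.filter((c) => SEVERITY_RANK[c.severity] >= SEVERITY_RANK[minSeverity]);
}
```

A team that finds suggestions too chatty could run filterBySeverity(comments, 'warning') before calling postReview.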


What I’d Do Differently

Add context about the codebase. Generic reviews miss project-specific patterns and conventions. Including your README, style guide, and architecture docs in the prompt makes a massive difference. Without that context, the bot is reviewing code in a vacuum, and its suggestions will feel generic because they are.
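As a rough idea of what that could look like, here is a sketch that appends project documents to the base system prompt while keeping the added context under a character budget. buildSystemPrompt and ContextDoc are hypothetical names, the 8000-character default is an arbitrary assumption, and loading the documents (from the repo or elsewhere) is left out.

```typescript
interface ContextDoc {
  title: string;
  text: string;
}

function buildSystemPrompt(
  basePrompt: string,
  docs: ContextDoc[],
  maxContextChars = 8000
): string {
  let remaining = maxContextChars;
  const sections: string[] = [];

  // Docs are added in order, so put the most important ones first
  for (const doc of docs) {
    if (remaining <= 0) break;
    const excerpt = doc.text.slice(0, remaining);
    sections.push(`## ${doc.title}\n${excerpt}`);
    remaining -= excerpt.length;
  }

  if (sections.length === 0) return basePrompt;
  return `${basePrompt}\n\nProject context (follow these conventions when reviewing):\n\n${sections.join('\n\n')}`;
}
```

The budget matters because every review pays for this context again as input tokens, so a short style guide is likely a better first candidate than the full architecture docs.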

The other thing I’d change is adding a feedback loop. Let reviewers react to bot comments with thumbs up or down, then use that signal to refine the prompt over time. A bot that learns what your team actually cares about is worth ten times more than one running a stock prompt.
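A first version of that feedback loop could be nothing more than a tally of thumbs up and thumbs down per severity, which already shows where the bot is annoying people. The sketch below assumes you have collected the reactions on each bot comment yourself; summarizeFeedback and FeedbackItem are hypothetical names, and '+1' and '-1' are the reaction content values GitHub uses for thumbs up and thumbs down.

```typescript
type Severity = 'error' | 'warning' | 'suggestion';

interface FeedbackItem {
  severity: Severity;
  reactions: string[]; // reaction content values, e.g. '+1', '-1', 'heart'
}

interface Tally {
  up: number;
  down: number;
}

function summarizeFeedback(items: FeedbackItem[]): Record<Severity, Tally> {
  const tally: Record<Severity, Tally> = {
    error: { up: 0, down: 0 },
    warning: { up: 0, down: 0 },
    suggestion: { up: 0, down: 0 },
  };

  for (const item of items) {
    for (const reaction of item.reactions) {
      if (reaction === '+1') tally[item.severity].up += 1;
      else if (reaction === '-1') tally[item.severity].down += 1;
      // Other reactions carry no clear signal and are ignored
    }
  }
  return tally;
}
```

If suggestions collect mostly thumbs down, for instance, that is a hint to tighten the prompt or raise the severity threshold before changing anything else.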

AI code review isn’t replacing humans. It’s handling the tedious bits so your reviewers can focus on design, architecture, and arguing about whether that variable should be called data or result.
