Reviewing AI-Assisted Code Changes
When AI changes a lot of code quickly, review is no longer about reading lines. It’s about making sure you still understand what will happen in production.
I’m not reviewing code. I’m reviewing a change in system behaviour.
A good rule to use:
The bigger the AI-generated change, the more effort I put into reviewing intent and edge cases, not individual lines of code.
AI helps me move fast.
Review is where I decide what’s safe to ship.
A good second rule:
Keep changes small and incremental whenever possible.
Below are some suggested strategies, grouped by change size.
Small change (minimal risk)
Typical size (guide, not a rule)
- ~1 to 5 files
- ~10 to 100 lines changed
Typical shape of change
- Small diff, few files
- Renames, formatting, tidy-ups
- No behaviour change intended
PR checklist
- What is the one thing this change is meant to do?
- Does the diff match that intent?
- Were any new branches, defaults or fallbacks added?
- Did auth, env vars, network calls or data writes change?
- Was any error handling added or loosened?
If it looks mechanical and behaves the same, approve it.
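The checklist item about new defaults or fallbacks is worth illustrating. Here is a minimal, hypothetical sketch (the function and key names are invented) of a "tidy-up" diff that quietly changes behaviour:

```python
# Hypothetical example: a one-line "cleanup" that changes behaviour.
# Before: a missing config key raised KeyError, surfacing the problem.
def get_timeout_before(config: dict) -> int:
    return config["timeout"]

# After the AI's edit: a silent fallback was added. The diff looks
# harmless, but a missing key no longer fails loudly.
def get_timeout_after(config: dict) -> int:
    return config.get("timeout", 30)
```

The diff is a single line and reads as mechanical, yet it moves the change out of the "no behaviour change intended" category.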
Medium change (some risk)
Typical size (guide, not a rule)
- ~5 to 20 files
- ~100 to 400 lines changed
Typical shape of change
- Larger diff, several files
- Helpers added, logic reshaped
- Behaviour should mostly stay the same
PR checklist
- What is supposed to change and what must not?
- Are permissions or validation wider than before?
- Are errors still visible and logged?
- Do defaults still make sense?
- What happens with empty, invalid or missing input?
- What happens if a dependency fails?
- If this breaks, how would I notice?
Always run negative tests here.
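A negative test deliberately feeds bad input and asserts that the code fails loudly. A minimal sketch, using a hypothetical `parse_age` validator invented for illustration:

```python
# Hypothetical validator, used only to illustrate negative testing.
def parse_age(value: str) -> int:
    age = int(value)  # raises ValueError on non-numeric input
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age

# Negative test: every bad input must be rejected, not coerced or defaulted.
def test_rejects_bad_input():
    for bad in ["", "abc", "-1", "999"]:
        try:
            parse_age(bad)
        except ValueError:
            continue  # expected failure
        raise AssertionError(f"accepted bad input: {bad!r}")
```

If an AI edit makes a test like this start passing silently (for example by adding a default), that is exactly the loosened error handling the checklist is asking about.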
Large change (high risk)
Typical size (guide, not a rule)
- ~20+ files
- ~400+ lines changed
Typical shape of change
- Very large diff
- Many files
- Bulk AI edits or new flows
- Direct production impact
PR checklist
- Which parts of this change are high-risk (auth, data, config, infra)?
- What must never change as a result of this PR?
- Did AI “helpfully” widen behaviour or hide failure?
- Are any errors being swallowed or turned into defaults?
- Are retries, timeouts or fallbacks hiding problems?
- Do I still trust the logs and alerts?
- What would a 2am failure look like?
- Is rollback or staged rollout in place?
If I don’t trust it, I slow down.
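The "errors swallowed and turned into defaults" item is the one I see most often in bulk AI edits. A hedged sketch of the two shapes to look for in review (the client and function names are hypothetical):

```python
import logging

logger = logging.getLogger(__name__)

# Shape to flag in review: the failure is swallowed and becomes a
# plausible-looking default, so nothing in logs or data reveals it.
def fetch_price_swallowed(client, sku):
    try:
        return client.get_price(sku)
    except Exception:
        return 0.0  # 0.0 looks like real data; the outage is invisible

# Safer shape: the failure stays visible in the logs and to the caller,
# who can then decide whether to retry, alert, or fall back explicitly.
def fetch_price_visible(client, sku):
    try:
        return client.get_price(sku)
    except Exception:
        logger.exception("price lookup failed for sku %s", sku)
        raise
```

The first shape is what makes a 2am failure hard to notice: dashboards stay green while the system quietly serves defaults.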
Notes on terms used
- PR (pull request): A proposed set of code changes reviewed before merging.
- Diff: A view of what changed in the code.
- System behaviour: What the software actually does in production, especially under failure.
- Defaults / fallbacks: Values or paths used when something is missing or fails.
- Negative tests: Tests that use bad input or failure scenarios on purpose.
- Rollback: Reverting to a previous version if a change causes problems.
- Staged rollout: Releasing changes gradually to reduce risk.
If you use a different rule or checklist when reviewing AI-generated changes, I’d be interested to hear it in the comments.