Build log - Pentesting

July 20, 2026

Teaching a black-box pentester to look inside the box

For its whole (short) life, my scanner has worked from the outside. It attacks the running application the way a real attacker would, with no access to the source, and reports what it can prove by observation alone. That's what black-box means, and it's the honest way to test something: you only claim what you could actually make happen from the outside.

Security

June 28, 2026

I added an AI agent whose only job is to prove a finding is wrong

The most useful thing I built in the last two weeks is a step that exists to disagree with the rest of the scanner. It's an LLM pass I call the skeptic. Its entire job is to take a finding the scanner is about to report and try to prove it's a false positive. If it succeeds, the finding gets dropped. If it can't, the finding stays. The interesting question isn't what it does. It's why it has to be a separate thing at all, instead of the scanner just checking its own work.

Security

June 8, 2026

Why I'm democratizing security testing

In the past two weeks I didn't ship a single new agent or detection capability. The scanner barely moved. Instead I built almost everything that wraps it: the product's website, intake form, Stripe checkout, domain authorization flows, and the AWS infrastructure to host all of it.

Security

June 1, 2026

What it takes to make "we found nothing" mean something

Last week I delivered a production scan to a customer. Authed scan against their live SaaS, full sweep across what the agents can do today.

Security

May 25, 2026

I reverted a week's work because it only worked on select targets

Last week I kept catching myself building the same mistake in different costumes. Each time the work looked like progress. Each time, the honest answer to "is this actually good?" was "no, it just looks good on the target I built it against."

Security

May 18, 2026

My AI agents were just burning tokens...

I ran a full scan against a customer last week. Real authed scan, real Django app, real money on the clock. The scan produced 6 validated findings, which is a fine result for a first pass on a new target. Then I broke down where the cost actually went.

Security

May 11, 2026

Most security tools skip the step real pentesters do first

If you watch a human pentester start an engagement, the first day usually doesn't involve any attacks. It involves a notebook, or a whiteboard, or a text file, and a lot of clicking around.

Security

May 4, 2026

The accidental continuous pentest (week 3)

Two weeks ago I posted week 1 of this build log and asked anyone running a small dev team to tell me how they were handling pentest pressure from their enterprise customers. Around 5 people DM'd.

Security

April 28, 2026

Zero findings on a known-vulnerable app (week 2)

This week I ran the agent against Juice Shop, a deliberately vulnerable Angular app the security industry uses as a test bed. It's so well-known it's borderline a meme. Dozens of known issues across every category in the OWASP Top 10. The agents found zero.

Security

April 25, 2026

Building a $500 pentest tool for small dev teams (week 1)

A client I had been doing cloud architecture for came to me a while ago with a challenge I keep seeing: their biggest customer was pressing them on security. Questionnaires, evidence, a pentest report. At the same time, their WAF was lighting up with attackers actively probing for a way in.