TL;DR

  • What: Open-source CLI that gives AI coding agents a real browser to verify their UI work
  • Creator: AmElmo on GitHub
  • How it works: Records video, captures screenshots, collects console/server errors, bundles everything into a self-contained HTML report
  • Works with: Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot — any agent that can run shell commands
  • Install: npm install -g proofshot && proofshot install
  • License: Open source on GitHub
  • HN traction: 60+ points and 44 comments on Show HN
  • Key insight: AI agents write code blind — they can’t see if the UI they built actually looks right. ProofShot closes that feedback loop.

Quick Reference Table

| Detail | Value |
| --- | --- |
| Repository | AmElmo/proofshot |
| Package | proofshot on npm |
| Language | TypeScript (ESM-only) |
| Browser engine | agent-browser (headless Chromium) |
| Install | npm install -g proofshot |
| Supported agents | Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot |
| License | Open source |
| Error detection | 10+ languages (Node.js, Python, Ruby, Go, Java, Rust, PHP, C#, Elixir, more) |

The Problem: AI Agents Build UI Blind

Here’s the uncomfortable truth about AI coding agents in 2026: they can write thousands of lines of frontend code, but they have no idea what the result actually looks like.

Claude Code can scaffold an entire React dashboard. Cursor can refactor your component library. Codex can implement a checkout flow. But none of them can open a browser and check whether the login button is actually visible, whether the layout broke on mobile, or whether the form throws a console error when you click submit.

The feedback loop is broken. The agent writes code, you review it manually, you find visual bugs, you go back to the agent, repeat. For UI-heavy work, this manual verification step can easily eat 30-40% of development time.

ProofShot fixes this by giving AI agents what they’ve been missing: a real browser they can drive, record, and report on.

How ProofShot Works

The workflow is deliberately simple — three commands that any AI agent can run:

Step 1: Start a Session

proofshot start --run "npm run dev" --port 3000 --description "Verify checkout flow"

This launches your dev server, opens a headless Chromium browser via agent-browser, and starts recording everything — video, console output, server logs.

Step 2: The Agent Drives the Browser

# See what's on the page
agent-browser snapshot -i

# Navigate to the target page
agent-browser open http://localhost:3000/login

# Fill in a form
agent-browser fill @e2 "[email protected]"

# Click a button
agent-browser click @e5

# Capture a screenshot as proof
agent-browser screenshot ./proofshot-artifacts/step-login.png

The agent interacts with real DOM elements using element references (@e2, @e5) that agent-browser discovers via snapshot. This isn’t simulated — it’s actual browser automation.
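To make the reference system concrete, here is a minimal sketch of how an agent could turn a snapshot listing into a lookup table of element references. The line format shown here (`@e2 textbox "Email"`) is an assumption for illustration; agent-browser's actual snapshot output may differ.

```typescript
// Parse a snapshot-style listing into a map of element refs to descriptions.
// Assumed line format: `@e2 textbox "Email"` -- this is illustrative only.
function parseSnapshot(snapshot: string): Map<string, string> {
  const refs = new Map<string, string>();
  for (const line of snapshot.split("\n")) {
    const match = line.trim().match(/^(@e\d+)\s+(.+)$/);
    if (match) refs.set(match[1], match[2]);
  }
  return refs;
}

const sample = `
@e2 textbox "Email"
@e5 button "Sign in"
`;
const refs = parseSnapshot(sample);
console.log(refs.get("@e5")); // button "Sign in"
```

The agent can then pick the right `@eN` handle for a `fill` or `click` command by matching on the human-readable description rather than on brittle CSS selectors.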

Step 3: Stop and Bundle Proof

proofshot stop

This stops recording and generates a timestamped folder in ./proofshot-artifacts/ containing:

| File | What it is |
| --- | --- |
| session.webm | Full video recording of the browser session |
| viewer.html | Standalone interactive viewer with scrub bar, timeline, and log tabs |
| SUMMARY.md | Markdown report with errors, screenshots, and video |
| step-*.png | Screenshots captured at key moments |
| session-log.json | Action timeline with timestamps and element data |
| server.log | Dev server stdout/stderr |
| console-output.log | Browser console output |

The viewer.html is the standout feature — it’s a self-contained HTML file you can open in any browser. You get the video recording with a scrub bar, action markers on the timeline, and tabs for console and server logs with error highlighting synced to the video timestamps.
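The core of that sync is simple arithmetic plus a nearest-marker lookup. Here is a hedged sketch of how a viewer could map a log entry's wall-clock timestamp onto the video timeline; the data shapes are assumptions, not ProofShot's actual schema.

```typescript
// Convert a log entry's wall-clock timestamp into a video offset (ms).
function videoOffsetMs(sessionStartMs: number, logTimestampMs: number): number {
  return Math.max(0, logTimestampMs - sessionStartMs);
}

// Find the action marker nearest to a given offset via binary search.
// `markers` is a non-empty, ascending list of marker offsets in ms.
function nearestMarker(markers: number[], offsetMs: number): number {
  let lo = 0, hi = markers.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (markers[mid] < offsetMs) lo = mid + 1; else hi = mid;
  }
  // lo is the first marker >= offsetMs; the one before may be closer.
  if (lo > 0 && offsetMs - markers[lo - 1] < markers[lo] - offsetMs) lo--;
  return markers[lo];
}

console.log(nearestMarker([0, 1500, 4200, 9000], 4000)); // 4200
```

Clicking an error in the log tab would then seek the video to the returned offset, which is why errors appear "synced" to the exact moment they occurred.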

The Skill System: Zero-Config Agent Integration

The clever part is how ProofShot teaches agents to use it. Running proofshot install auto-detects your AI coding tools and installs a skill file at the user level:

proofshot install              # Interactive — picks all detected tools
proofshot install --only claude # Just Claude Code
proofshot install --skip cursor # Everything except Cursor

Here’s where each skill gets installed:

| Agent | Skill Location |
| --- | --- |
| Claude Code | ~/.claude/skills/proofshot/SKILL.md |
| Cursor | ~/.cursor/rules/proofshot.mdc |
| Codex (OpenAI) | ~/.codex/skills/proofshot/SKILL.md |
| Gemini CLI | Appends to ~/.gemini/GEMINI.md |
| Windsurf | Appends to ~/.codeium/windsurf/memories/global_rules.md |

Once installed, the agent just knows how to use ProofShot. You say “verify this with proofshot” and the agent handles the entire start → test → stop workflow automatically. No per-project configuration needed.

GitHub PR Integration

This is where ProofShot becomes a real workflow tool, not just a demo:

proofshot pr        # Auto-detect PR from current branch
proofshot pr 42     # Target a specific PR number
proofshot pr --dry-run  # Preview markdown without posting

proofshot pr finds all verification sessions recorded on the current branch, uploads screenshots and video to GitHub, and posts a formatted comment on the PR with inline media. Your reviewer sees exactly what the agent built — video proof, screenshots, and error logs — right in the PR.

Requires GitHub CLI (gh) to be installed and authenticated. If ffmpeg is available, it converts .webm video to .mp4 for better compatibility.
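Conceptually, the posting step boils down to assembling a Markdown body from the recorded sessions and handing it to `gh pr comment`. Here is a hedged sketch of that assembly; the session fields (`description`, `screenshots`, `errors`) are hypothetical names for illustration, not ProofShot's actual artifact schema.

```typescript
// Hypothetical session shape: a description, uploaded screenshot URLs,
// and any errors detected during the run.
type Session = {
  description: string;
  screenshots: string[];
  errors: string[];
};

// Build a Markdown comment body summarizing each verification session.
function buildPrComment(sessions: Session[]): string {
  const parts = ["## ProofShot verification"];
  for (const s of sessions) {
    parts.push(`### ${s.description}`);
    for (const url of s.screenshots) parts.push(`![screenshot](${url})`);
    parts.push(
      s.errors.length === 0
        ? "No errors detected"
        : `${s.errors.length} error(s):\n` +
          s.errors.map((e) => `- \`${e}\``).join("\n")
    );
  }
  return parts.join("\n\n");
}
```

The resulting string could then be posted with something like `gh pr comment <number> --body-file -`, which is why an authenticated `gh` is a prerequisite.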

Visual Regression Testing

ProofShot also supports visual diff comparisons:

proofshot diff --baseline ./previous-artifacts

This compares current screenshots against a baseline set, catching visual regressions that code review alone would miss. For teams doing rapid AI-assisted iteration, this is a safety net against layout breakage.
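At its simplest, a screenshot comparison like this reduces to counting pixels that changed beyond a tolerance. The sketch below operates on decoded RGBA buffers and is an illustration of the general technique only; ProofShot's actual diff algorithm is not documented here.

```typescript
// Fraction of pixels whose RGBA channels differ by more than `tolerance`.
// `a` and `b` are decoded RGBA buffers (4 bytes per pixel) of equal size.
function diffRatio(a: Uint8Array, b: Uint8Array, tolerance = 8): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be equal-length RGBA data");
  }
  const pixels = a.length / 4;
  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) {
        changed++;
        break; // count each pixel at most once
      }
    }
  }
  return pixels === 0 ? 0 : changed / pixels;
}

// Two 2-pixel images: the second pixel's red channel jumps from 0 to 255.
const base = Uint8Array.from([255, 255, 255, 255, 0, 0, 0, 255]);
const next = Uint8Array.from([255, 255, 255, 255, 255, 0, 0, 255]);
console.log(diffRatio(base, next)); // 0.5
```

A `diff` command would flag a regression when this ratio crosses a threshold, which catches layout breakage that a code-only review would sail past.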

What the Community Is Saying

The Show HN thread (60+ points, 44 comments) shows strong developer interest:

The positive reactions:

  • Developers are excited about closing the AI coding feedback loop
  • Integration with existing tools (Claude Code, Cursor) means low adoption friction
  • The PR upload feature resonated — reviewers want visual proof, not just code diffs
  • Potential to reduce manual QA cycles by 30-40% for UI work

The concerns:

  • How well can it detect subtle visual issues vs. just functional correctness?
  • Performance overhead of running headless Chromium alongside dev server
  • Whether the element reference system (@e2, @e5) is stable enough for complex SPAs

The broader interest:

  • Several commenters highlighted potential integration with GitHub Copilot for end-to-end automation
  • Discussion around using ProofShot for rapid prototyping verification
  • Interest in extending it to mobile viewport testing

Error Detection Across 10+ Languages

ProofShot doesn’t just record video — it actively detects errors from server logs across multiple backend languages: JavaScript/Node.js, Python, Ruby/Rails, Go, Java/Kotlin, Rust, PHP, C#/.NET, Elixir/Phoenix, and more.

The error patterns are defined in src/utils/error-patterns.ts and are extensible. When a server error fires during verification, it shows up in the viewer timeline synced to the exact moment it occurred in the video.
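The general shape of such a pattern table is a set of per-language regexes run over the server log. The regexes below are assumptions chosen for illustration, not the actual patterns shipped in src/utils/error-patterns.ts.

```typescript
// Illustrative per-language error signatures (not ProofShot's real patterns).
const ERROR_PATTERNS: Record<string, RegExp[]> = {
  node: [/^(?:Uncaught\s+)?\w*Error:/m, /UnhandledPromiseRejection/],
  python: [/Traceback \(most recent call last\):/],
  ruby: [/\(.+Error\)/, /^\s+from .+:\d+:in/m],
  go: [/panic: /, /goroutine \d+ \[/],
};

// Return the languages whose error signature appears in the server log.
function scanServerLog(log: string): string[] {
  const hits: string[] = [];
  for (const [lang, patterns] of Object.entries(ERROR_PATTERNS)) {
    if (patterns.some((p) => p.test(log))) hits.push(lang);
  }
  return hits;
}

console.log(scanServerLog("panic: runtime error: index out of range")); // [ 'go' ]
```

Extending detection to a new stack is then just a matter of appending another entry to the table, which matches the article's claim that the pattern list is extensible.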

Architecture Under the Hood

ProofShot is built on:

  • agent-browser by Vercel Labs — headless Chromium with an element reference system designed for AI agents
  • TypeScript (ESM-only) with tsup for builds and vitest for tests
  • GCD-style session management — each session produces isolated artifacts in timestamped folders
  • Zero cloud dependency — everything runs locally, artifacts stay on your machine

The key architectural decision is using agent-browser instead of raw Playwright or Puppeteer. The element reference system (@e2, @e5) gives agents stable handles to interact with DOM elements, which is more reliable than CSS selectors or XPath for AI-driven automation.

Getting Started: Try It in 5 Minutes

# Install globally
npm install -g proofshot
proofshot install

# Clone the repo for sample apps
git clone https://github.com/AmElmo/proofshot.git
cd proofshot
npm install && npm run build && npm link

# Set up sample app
cd test/fixtures/sample-app
npm install

Then open your AI agent in the test/fixtures/sample-app/ directory and prompt:

Verify the sample app with proofshot. Start on the homepage, check the hero section, navigate to the Dashboard and check the metrics, then go to Settings and update the profile name. Screenshot each page.

Or run without an agent:

bash test-proofshot.sh

Check proofshot-artifacts/ for the generated video, screenshots, and report.

Who Should Use This (and Who Shouldn’t)

Use ProofShot if:

  • You’re using AI coding agents for UI/frontend work and tired of manually checking every change
  • Your team wants visual proof in PR reviews, not just code diffs
  • You’re iterating rapidly with AI agents and need a fast verification loop
  • You want automated error detection across multiple backend languages

Skip it if:

  • You’re not doing UI work (backend-only codebases won’t benefit much)
  • You need pixel-perfect visual testing (dedicated tools like Percy or Chromatic are more mature)
  • You’re working on mobile-native apps (browser-only for now)

Comparison with Alternatives

| Tool | Focus | AI Agent Integration | Local/Cloud | PR Integration |
| --- | --- | --- | --- | --- |
| ProofShot | AI agent UI verification | Built-in skill system | Local only | GitHub via proofshot pr |
| Playwright | E2E testing | Manual scripting | Local | Via CI |
| Percy | Visual regression | No AI integration | Cloud | GitHub/GitLab |
| Chromatic | Storybook visual testing | No AI integration | Cloud | GitHub |
| Meticulous | AI-generated E2E tests | Generates tests, doesn't verify agent work | Cloud | GitHub |

ProofShot occupies a unique niche: it's not a testing framework but a verification layer built specifically for AI coding agents. The skill system and self-contained artifacts make it practical for the "agent builds, human reviews" workflow that most teams actually use.

FAQ

Does ProofShot work with any AI coding agent?

Yes — any agent that can execute shell commands can use ProofShot. The proofshot install command has built-in support for Claude Code, Cursor, Codex, Gemini CLI, Windsurf, and GitHub Copilot, but the CLI itself is agent-agnostic.

Does it require a cloud service or API key?

No. ProofShot is fully local. No cloud dependency, no vendor lock-in. Artifacts stay on your machine. The only external integration is optional GitHub PR uploads via the gh CLI.

How much overhead does it add?

ProofShot runs a headless Chromium instance via agent-browser alongside your dev server. On a modern machine with 16GB+ RAM, the overhead is minimal. The video recording and screenshot capture run asynchronously.

Can I use it for visual regression testing in CI?

The proofshot diff command supports baseline comparisons, but ProofShot is primarily designed for developer-time verification, not CI pipelines. For production visual regression testing, tools like Percy or Chromatic are more battle-tested.

What about mobile viewport testing?

Currently browser-only with desktop viewports. Mobile viewport simulation could be added via Chromium’s device emulation, but it’s not built-in yet.

Is the video recording resource-intensive?

The .webm recording uses efficient browser-native capture. Session videos are typically 5-20MB depending on length. The self-contained viewer.html file adds minimal overhead since it embeds the video reference rather than the video itself.