TL;DR
- What: Open-source CLI that gives AI coding agents a real browser to verify their UI work
- Creator: AmElmo on GitHub
- How it works: Records video, captures screenshots, collects console/server errors, bundles everything into a self-contained HTML report
- Works with: Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot — any agent that can run shell commands
- Install: npm install -g proofshot && proofshot install
- License: Open source on GitHub
- HN traction: 60+ points and 44 comments on Show HN
- Key insight: AI agents write code blind — they can’t see if the UI they built actually looks right. ProofShot closes that feedback loop.
Quick Reference Table
| Detail | Value |
|---|---|
| Repository | AmElmo/proofshot |
| Package | proofshot on npm |
| Language | TypeScript (ESM-only) |
| Browser engine | agent-browser (headless Chromium) |
| Install | npm install -g proofshot |
| Supported agents | Claude Code, Cursor, Codex, Gemini CLI, Windsurf, GitHub Copilot |
| License | Open source |
| Error detection | 10+ languages (Node.js, Python, Ruby, Go, Java, Rust, PHP, C#, Elixir, more) |
The Problem: AI Agents Build UI Blind
Here’s the uncomfortable truth about AI coding agents in 2026: they can write thousands of lines of frontend code, but they have no idea what the result actually looks like.
Claude Code can scaffold an entire React dashboard. Cursor can refactor your component library. Codex can implement a checkout flow. But none of them can open a browser and check whether the login button is actually visible, whether the layout broke on mobile, or whether the form throws a console error when you click submit.
The feedback loop is broken. The agent writes code, you review it manually, you find visual bugs, you go back to the agent, repeat. For UI-heavy work, this manual verification step eats 30-40% of development time.
ProofShot fixes this by giving AI agents what they’ve been missing: a real browser they can drive, record, and report on.
How ProofShot Works
The workflow is deliberately simple — three commands that any AI agent can run:
Step 1: Start a Session
```bash
proofshot start --run "npm run dev" --port 3000 --description "Verify checkout flow"
```
This launches your dev server, opens a headless Chromium browser via agent-browser, and starts recording everything — video, console output, server logs.
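Before recording can begin, the CLI has to wait for the dev server's port to accept connections. A minimal sketch of that readiness check in TypeScript — a hypothetical helper for illustration, not ProofShot's actual implementation:

```typescript
import net from "node:net";

// Poll a TCP port until it accepts a connection or the deadline passes.
// Hypothetical helper illustrating the readiness check a tool like
// `proofshot start` must perform before it begins recording.
export function waitForPort(
  port: number,
  host = "127.0.0.1",
  timeoutMs = 30_000
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const attempt = () => {
      const socket = net.connect({ port, host });
      socket.once("connect", () => {
        socket.end();
        resolve();
      });
      socket.once("error", () => {
        socket.destroy();
        if (Date.now() > deadline) {
          reject(new Error(`port ${port} not ready after ${timeoutMs}ms`));
        } else {
          setTimeout(attempt, 250); // retry until the server is up
        }
      });
    };
    attempt();
  });
}
```

The retry loop matters because dev servers like Vite or Next.js can take several seconds to bind their port after the process starts.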
Step 2: The Agent Drives the Browser
```bash
# See what's on the page
agent-browser snapshot -i
# Navigate to the target page
agent-browser open http://localhost:3000/login
# Fill in a form
agent-browser fill @e2 "[email protected]"
# Click a button
agent-browser click @e5
# Capture a screenshot as proof
agent-browser screenshot ./proofshot-artifacts/step-login.png
```
The agent interacts with real DOM elements using element references (@e2, @e5) that agent-browser discovers via snapshot. This isn’t simulated — it’s actual browser automation.
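Conceptually, the snapshot pairs each interactive element with a stable reference the agent can act on. A rough sketch of how an agent-side script might extract those references — note that the line format shown here (`@e2 textbox "Email"`) is a guessed illustration, not agent-browser's documented output:

```typescript
// Parse element references like "@e2" out of a snapshot listing.
// The assumed line shape ('@e2 textbox "Email"') is hypothetical;
// agent-browser's real output format may differ.
interface ElementRef {
  ref: string;   // e.g. "@e2"
  role: string;  // e.g. "textbox", "button"
  label: string; // accessible label
}

export function parseSnapshot(snapshot: string): ElementRef[] {
  const refs: ElementRef[] = [];
  for (const line of snapshot.split("\n")) {
    const m = line.trim().match(/^(@e\d+)\s+(\S+)\s+"([^"]*)"/);
    if (m) refs.push({ ref: m[1], role: m[2], label: m[3] });
  }
  return refs;
}
```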
Step 3: Stop and Bundle Proof
```bash
proofshot stop
```
This stops recording and generates a timestamped folder in ./proofshot-artifacts/ containing:
| File | What it is |
|---|---|
| session.webm | Full video recording of the browser session |
| viewer.html | Standalone interactive viewer with scrub bar, timeline, and log tabs |
| SUMMARY.md | Markdown report with errors, screenshots, and video |
| step-*.png | Screenshots captured at key moments |
| session-log.json | Action timeline with timestamps and element data |
| server.log | Dev server stdout/stderr |
| console-output.log | Browser console output |
The viewer.html is the standout feature — it’s a self-contained HTML file you can open in any browser. You get the video recording with a scrub bar, action markers on the timeline, and tabs for console and server logs with error highlighting synced to the video timestamps.
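Syncing log entries to the scrub bar comes down to converting each entry's wall-clock timestamp into an offset from the recording start. A simplified sketch of that mapping — the entry shape here is hypothetical, not ProofShot's internal type:

```typescript
// Hypothetical shape of a parsed log entry; ProofShot's real
// session-log.json schema may differ.
interface LogEntry {
  timestampMs: number; // wall-clock epoch ms
  message: string;
}

// Convert wall-clock timestamps into video offsets (seconds) so a
// viewer can jump the scrub bar to the moment an error fired.
// Entries from before recording started clamp to offset 0.
export function toVideoOffsets(sessionStartMs: number, entries: LogEntry[]) {
  return entries.map((e) => ({
    offsetSec: Math.max(0, (e.timestampMs - sessionStartMs) / 1000),
    message: e.message,
  }));
}
```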
The Skill System: Zero-Config Agent Integration
The clever part is how ProofShot teaches agents to use it. Running proofshot install auto-detects your AI coding tools and installs a skill file at the user level:
```bash
proofshot install                # Interactive — picks all detected tools
proofshot install --only claude  # Just Claude Code
proofshot install --skip cursor  # Everything except Cursor
```
Here’s where each skill gets installed:
| Agent | Skill Location |
|---|---|
| Claude Code | ~/.claude/skills/proofshot/SKILL.md |
| Cursor | ~/.cursor/rules/proofshot.mdc |
| Codex (OpenAI) | ~/.codex/skills/proofshot/SKILL.md |
| Gemini CLI | Appends to ~/.gemini/GEMINI.md |
| Windsurf | Appends to ~/.codeium/windsurf/memories/global_rules.md |
Once installed, the agent just knows how to use ProofShot. You say “verify this with proofshot” and the agent handles the entire start → test → stop workflow automatically. No per-project configuration needed.
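Auto-detection most likely amounts to checking for each tool's config directory under the home folder. A hedged sketch of that idea — the directories mirror the skill-location table above, but the detection logic itself is an assumption about how `proofshot install` works:

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Map each supported agent to the directory whose presence suggests
// it is installed. Paths mirror the skill-location table; the
// detection strategy itself is a guess, not ProofShot's actual code.
const TOOL_DIRS: Record<string, string> = {
  claude: ".claude",
  cursor: ".cursor",
  codex: ".codex",
  gemini: ".gemini",
  windsurf: join(".codeium", "windsurf"),
};

export function detectTools(home = homedir()): string[] {
  return Object.entries(TOOL_DIRS)
    .filter(([, dir]) => existsSync(join(home, dir)))
    .map(([name]) => name);
}
```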
GitHub PR Integration
This is where ProofShot becomes a real workflow tool, not just a demo:
```bash
proofshot pr            # Auto-detect PR from current branch
proofshot pr 42         # Target a specific PR number
proofshot pr --dry-run  # Preview markdown without posting
```
proofshot pr finds all verification sessions recorded on the current branch, uploads screenshots and video to GitHub, and posts a formatted comment on the PR with inline media. Your reviewer sees exactly what the agent built — video proof, screenshots, and error logs — right in the PR.
Requires GitHub CLI (gh) to be installed and authenticated. If ffmpeg is available, it converts .webm video to .mp4 for better compatibility.
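The posted comment is ultimately just markdown with inline media links. A rough sketch of assembling such a body from a session's artifacts — the function name, input shape, and exact layout are hypothetical, not ProofShot's actual output:

```typescript
// Hypothetical artifact summary; ProofShot's real data model may differ.
interface SessionArtifacts {
  description: string;
  videoUrl?: string;        // uploaded video URL, if any
  screenshotUrls: string[]; // uploaded step-*.png URLs
  errors: string[];         // error lines pulled from the logs
}

// Build a PR comment body of the kind `proofshot pr` might post via `gh`.
export function buildPrComment(s: SessionArtifacts): string {
  const lines = [`### ProofShot verification: ${s.description}`];
  if (s.videoUrl) lines.push(`[Session video](${s.videoUrl})`);
  for (const url of s.screenshotUrls) lines.push(`![screenshot](${url})`);
  if (s.errors.length > 0) {
    lines.push(
      `**${s.errors.length} error(s) detected:**\n` +
        s.errors.map((e) => "- `" + e + "`").join("\n")
    );
  } else {
    lines.push("No errors detected.");
  }
  return lines.join("\n\n");
}
```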
Visual Regression Testing
ProofShot also supports visual diff comparisons:
```bash
proofshot diff --baseline ./previous-artifacts
```
This compares current screenshots against a baseline set, catching visual regressions that code review alone would miss. For teams doing rapid AI-assisted iteration, this is a safety net against layout breakage.
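A byte-level comparison is the simplest form of such a check; a real visual diff decodes pixels and allows a tolerance. This deliberately simplified sketch just flags screenshots whose contents changed or that are missing from the baseline — it is not ProofShot's actual algorithm:

```typescript
import { readdirSync, readFileSync, existsSync } from "node:fs";
import { join } from "node:path";
import { createHash } from "node:crypto";

// Compare screenshots in `current` against `baseline` by content hash.
// A real visual diff would decode pixels and tolerate minor rendering
// noise; this byte-exact version only illustrates the workflow.
export function diffScreenshots(baseline: string, current: string) {
  const changed: string[] = [];
  const added: string[] = [];
  const hash = (p: string) =>
    createHash("sha256").update(readFileSync(p)).digest("hex");
  for (const file of readdirSync(current).filter((f) => f.endsWith(".png"))) {
    const basePath = join(baseline, file);
    if (!existsSync(basePath)) {
      added.push(file); // screenshot with no baseline counterpart
    } else if (hash(basePath) !== hash(join(current, file))) {
      changed.push(file); // bytes differ from the baseline
    }
  }
  return { changed, added };
}
```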
What the Community Is Saying
The Show HN thread (60+ points, 44 comments) shows strong developer interest:
The positive reactions:
- Developers are excited about closing the AI coding feedback loop
- Integration with existing tools (Claude Code, Cursor) means low adoption friction
- The PR upload feature resonated — reviewers want visual proof, not just code diffs
- Potential to reduce manual QA cycles by 30-40% for UI work
The concerns:
- How well can it detect subtle visual issues vs. just functional correctness?
- Performance overhead of running headless Chromium alongside dev server
- Whether the element reference system (@e2, @e5) is stable enough for complex SPAs
The broader interest:
- Several commenters highlighted potential integration with GitHub Copilot for end-to-end automation
- Discussion around using ProofShot for rapid prototyping verification
- Interest in extending it to mobile viewport testing
Error Detection Across 10+ Languages
ProofShot doesn’t just record video — it actively detects errors from server logs across multiple backend languages: JavaScript/Node.js, Python, Ruby/Rails, Go, Java/Kotlin, Rust, PHP, C#/.NET, Elixir/Phoenix, and more.
The error patterns are defined in src/utils/error-patterns.ts and are extensible. When a server error fires during verification, it shows up in the viewer timeline synced to the exact moment it occurred in the video.
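The pattern table is essentially a set of per-language regexes run over each server log line. A hedged sketch in the spirit of src/utils/error-patterns.ts — these specific patterns are illustrative, not copied from ProofShot:

```typescript
// Per-language regexes matched against server log lines, in the spirit
// of ProofShot's src/utils/error-patterns.ts. The exact patterns here
// are illustrative assumptions, not the project's real table.
const ERROR_PATTERNS: Record<string, RegExp> = {
  node: /(?:UnhandledPromiseRejection|TypeError|ReferenceError):/,
  python: /Traceback \(most recent call last\)/,
  ruby: /(?:RuntimeError|NoMethodError)/,
  go: /panic: /,
  java: /Exception in thread|\.\w+Exception:/,
  rust: /thread '.*' panicked at/,
};

// Return the languages whose error signature matches this log line.
export function detectErrors(logLine: string): string[] {
  return Object.entries(ERROR_PATTERNS)
    .filter(([, re]) => re.test(logLine))
    .map(([lang]) => lang);
}
```

Keeping the patterns in one extensible table is what lets a single recorder flag failures from whatever backend happens to be serving the page.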
Architecture Under the Hood
ProofShot is built on:
- agent-browser by Vercel Labs — headless Chromium with an element reference system designed for AI agents
- TypeScript (ESM-only) with tsup for builds and vitest for tests
- GCD-style session management — each session produces isolated artifacts in timestamped folders
- Zero cloud dependency — everything runs locally, artifacts stay on your machine
The key architectural decision is using agent-browser instead of raw Playwright or Puppeteer. The element reference system (@e2, @e5) gives agents stable handles to interact with DOM elements, which is more reliable than CSS selectors or XPath for AI-driven automation.
Getting Started: Try It in 5 Minutes
```bash
# Install globally
npm install -g proofshot
proofshot install

# Clone the repo for sample apps
git clone https://github.com/AmElmo/proofshot.git
cd proofshot
npm install && npm run build && npm link

# Set up sample app
cd test/fixtures/sample-app
npm install
```
Then open your AI agent in the test/fixtures/sample-app/ directory and prompt:
Verify the sample app with proofshot. Start on the homepage, check the hero section, navigate to the Dashboard and check the metrics, then go to Settings and update the profile name. Screenshot each page.
Or run without an agent:
```bash
bash test-proofshot.sh
```
Check proofshot-artifacts/ for the generated video, screenshots, and report.
Who Should Use This (and Who Shouldn’t)
Use ProofShot if:
- You’re using AI coding agents for UI/frontend work and tired of manually checking every change
- Your team wants visual proof in PR reviews, not just code diffs
- You’re iterating rapidly with AI agents and need a fast verification loop
- You want automated error detection across multiple backend languages
Skip it if:
- You’re not doing UI work (backend-only codebases won’t benefit much)
- You need pixel-perfect visual testing (dedicated tools like Percy or Chromatic are more mature)
- You’re working on mobile-native apps (browser-only for now)
Comparison with Alternatives
| Tool | Focus | AI Agent Integration | Local/Cloud | PR Integration |
|---|---|---|---|---|
| ProofShot | AI agent UI verification | Built-in skill system | Local only | GitHub via proofshot pr |
| Playwright | E2E testing | Manual scripting | Local | Via CI |
| Percy | Visual regression | No AI integration | Cloud | GitHub/GitLab |
| Chromatic | Storybook visual testing | No AI integration | Cloud | GitHub |
| Meticulous | AI-generated E2E tests | Generates tests, doesn’t verify agent work | Cloud | GitHub |
ProofShot occupies a unique niche: it’s not a testing framework, it’s a verification layer specifically for AI coding agents. The skill system and self-contained artifacts make it practical for the “agent builds, human reviews” workflow that most teams actually use.
FAQ
Does ProofShot work with any AI coding agent?
Yes — any agent that can execute shell commands can use ProofShot. The proofshot install command has built-in support for Claude Code, Cursor, Codex, Gemini CLI, Windsurf, and GitHub Copilot, but the CLI itself is agent-agnostic.
Does it require a cloud service or API key?
No. ProofShot is fully local. No cloud dependency, no vendor lock-in. Artifacts stay on your machine. The only external integration is optional GitHub PR uploads via the gh CLI.
How much overhead does it add?
ProofShot runs a headless Chromium instance via agent-browser alongside your dev server. On a modern machine with 16GB+ RAM, the overhead is minimal. The video recording and screenshot capture run asynchronously.
Can I use it for visual regression testing in CI?
The proofshot diff command supports baseline comparisons, but ProofShot is primarily designed for developer-time verification, not CI pipelines. For production visual regression testing, tools like Percy or Chromatic are more battle-tested.
What about mobile viewport testing?
Currently browser-only with desktop viewports. Mobile viewport simulation could be added via Chromium’s device emulation, but it’s not built-in yet.
Is the video recording resource-intensive?
The .webm recording uses efficient browser-native capture. Session videos are typically 5-20MB depending on length. The self-contained viewer.html file adds minimal overhead since it embeds the video reference rather than the video itself.