What Is Raindrop Workshop? (May 2026)
Raindrop Workshop is an open-source MIT-licensed local debugger and evaluation environment for AI agents, launched on May 14, 2026. It streams agent telemetry to a localhost dashboard, stores all activity in a single SQLite file, and adds a self-healing eval loop that lets coding agents read traces and autonomously fix broken behavior.
Last verified: May 15, 2026
TL;DR
| Field | Detail |
|---|---|
| Launched | May 14, 2026 |
| License | MIT (open source) |
| Repo | github.com/raindrop-ai/workshop |
| Install | One-line shell command |
| Dashboard | localhost:5899 |
| Data store | Single SQLite .db file |
| Languages | TypeScript, Python, Rust, Go |
| Frameworks | Vercel AI SDK, OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI |
| Coding agents | Claude Code, Cursor, Codex, OpenCode, Devin |
What problem does it solve?
Multi-step AI agents fail silently. An agent calls a tool, the tool returns wrong data, the agent continues confidently, and you find out three turns later when the output is garbage. Traditional logs don’t capture the LLM’s reasoning, tool inputs/outputs, or the decision tree. Existing observability platforms ship that data to the cloud, which is a non-starter for some workloads.
Workshop solves this with local-first tracing + replay + self-healing.
What’s in Workshop
1. Local-first tracing
- All telemetry streams to a localhost:5899 dashboard in real time.
- Every token, tool call, and decision is captured.
- Data lives in a single SQLite file — no separate server, no Docker stack.
- No outbound network calls unless you opt into the hosted dashboard (wiring sketched below).
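In code, the wire-up could be as small as one init call plus a span wrapper. A minimal TypeScript sketch, assuming a hypothetical `@raindrop/workshop` package with `init` and `span` helpers; the real adapter surface lives in the repo:

```typescript
// Hypothetical API sketch: the package name, init options, and span helper
// are assumptions for illustration, not Workshop's documented surface.
import { init, span } from "@raindrop/workshop";

// Stand-in tool so the example is self-contained.
async function weatherTool(input: { city: string }): Promise<{ tempC: number }> {
  return { tempC: 12 };
}

// Point the tracer at a local SQLite file; nothing leaves the machine.
init({ db: "./workshop.db" });

// Wrap an agent step in a span so the tool's input and output land in the
// trace and stream live to the localhost:5899 dashboard.
const data = await span("fetch-weather", () => weatherTool({ city: "Oslo" }));
console.log(data.tempC);
```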
2. Replay with edits
- Pick any past run.
- Edit the prompt, model, temperature, tool definitions.
- Re-run from any span — not just the start.
- Compare runs side-by-side (see the sketch below).
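Programmatically, replay might look like the following. Same caveat as above: `loadRun`, `replay`, and the run id are illustrative assumptions, not Workshop's documented interface:

```typescript
// Hypothetical replay API: function names, options, and the run id are
// assumptions for illustration.
import { loadRun, replay } from "@raindrop/workshop";

const original = await loadRun("run_0042"); // any past run

// Re-run from a chosen span, not just the start, with edits applied.
const variant = await replay(original, {
  fromSpan: "fetch-weather",
  edits: { model: "gpt-4o-mini", temperature: 0.2 },
});

// Side-by-side comparison of the two runs.
console.table([original.summary(), variant.summary()]);
```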
3. The self-healing eval loop
The flagship feature. A coding agent (Claude Code is the reference integration) can:
- Read Workshop’s traces for a failing run.
- Write evaluations that capture the failure as a test.
- Edit code in your repo to fix the bug.
- Re-run until the eval passes.
This turns manual log-scrubbing into an automated debug→fix loop.
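For a concrete picture, here is what "capturing the failure as a test" might look like. This is a sketch only: the import path, `evalCase` helper, and trace id are assumptions, not Workshop's documented eval format:

```typescript
// Hypothetical eval a coding agent could write after reading a failing trace.
// The helper name, options shape, and trace id are illustrative assumptions.
import { evalCase } from "@raindrop/workshop/evals";

export default evalCase("weather agent reports metric units", {
  // Seed the run from the recorded trace that exposed the bug.
  fromTrace: "run_0042",
  input: { city: "Oslo" },
  // The behavior that broke in production, pinned as an assertion.
  assert: (output: { summary: string }) => output.summary.includes("°C"),
});
```

The coding agent then edits the repo and re-runs this eval until it passes, which is the loop described above.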
4. Broad SDK and framework support
- Languages: TypeScript, Python, Rust, Go.
- SDKs: Vercel AI SDK, OpenAI, Anthropic, LangChain, LlamaIndex, CrewAI.
- Coding agents: Claude Code, Cursor, Codex, OpenCode, Devin.
- One-line install or build from source.
How it compares
| | Raindrop Workshop | LangSmith | Braintrust | Helicone |
|---|---|---|---|---|
| License | MIT (OSS) | Proprietary | Proprietary | OSS core |
| Data location | Local SQLite | Cloud | Cloud | Self-host or cloud |
| Free tier | Fully free | Limited | Limited | Free tier |
| Self-healing eval | ✅ Native | ❌ | ❌ | ❌ |
| Replay with edits | ✅ | ✅ | ✅ | 🟡 |
| Team/SSO | Hosted add-on | ✅ | ✅ | ✅ |
| Best for | Local debug, privacy | Cross-team prod observability | Eval-heavy workflows | Cost-tracking + analytics |
Who Workshop is for
✅ Solo devs and small teams building agent products — local debug, free, fast.
✅ Privacy-sensitive workloads — healthcare, legal, finance — where ship-to-cloud observability is a blocker.
✅ Coding-agent integrators — Workshop’s self-healing loop pairs with Claude Code’s autonomous mode.
🟡 Large orgs with cross-team agent fleets — Workshop’s local model doesn’t scale to 50 engineers without the hosted version. Use LangSmith/Braintrust for that.
❌ Pure prompt-engineering teams — Workshop’s strength is agent trace replay, not prompt evaluation. PromptLayer or Braintrust fits better.
Install in 60 seconds
```bash
# One-line install (per Raindrop docs)
curl -fsSL https://workshop.raindrop.ai/install.sh | sh

# Or build from source
git clone https://github.com/raindrop-ai/workshop
cd workshop && npm install && npm run dev

# Open the dashboard
open http://localhost:5899
```
Then wire one of the official SDK adapters into your agent code. Examples for Vercel AI SDK, Claude Code hook, and LangChain are in the repo.
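As a sketch of what "wiring an adapter" means in practice with the Vercel AI SDK: `generateText` and `@ai-sdk/openai` below are the real SDK, while the `withWorkshop` wrapper and its import path are assumptions standing in for the shipped adapter:

```typescript
// `ai` and `@ai-sdk/openai` are the real Vercel AI SDK packages; the
// Workshop adapter import and `withWorkshop` wrapper are assumptions.
// Check the repo's examples for the actual adapter.
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { withWorkshop } from "@raindrop/workshop/vercel";

// Wrap the model so each call's prompt, tokens, and tool use are traced
// to the local SQLite store and the localhost:5899 dashboard.
const model = withWorkshop(openai("gpt-4o-mini"));

const { text } = await generateText({
  model,
  prompt: "Summarize the last failing run in one sentence.",
});
console.log(text);
```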
Why this matters now
Three developments converged within a few weeks of each other:
- Workshop launched (May 14) with self-healing.
- Anthropic launched “dreaming” earlier in May — agents reviewing their own past sessions.
- Anthropic’s Agent SDK credit shift (June 15) means more developers will run agents headless against subscriptions.
Together, this is the “agents debug themselves now” moment. Workshop is the first OSS tool that makes that loop concrete and free.
Risks and watch-outs
- Local-first means no cross-team sharing by default. Plan for an upgrade path to the hosted version if your team grows.
- SQLite scale: single-file storage works well up to roughly 10 GB of traces; plan an archive policy beyond that (see the sketch after this list).
- Self-healing autonomy — letting a coding agent edit your repo from a trace is powerful and dangerous. Use sandbox branches and require PR review.
- Brand-new project — the codebase is days old; expect rapid breaking changes.
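On the SQLite point, an archive policy can be a few lines of SQL against the trace file. A sketch using better-sqlite3; the `spans` table and `started_at` column are guesses at Workshop's schema, so inspect the actual .db before adapting this:

```typescript
// Archive sketch. better-sqlite3 is a real library, but the `spans` table
// and `started_at` epoch-millisecond column are assumptions about
// Workshop's schema.
import Database from "better-sqlite3";

const db = new Database("./workshop.db");
const cutoffMs = Date.now() - 30 * 24 * 60 * 60 * 1000; // keep 30 days

// Delete old spans, then reclaim file space so the single .db file
// stays comfortably under the ~10 GB practical ceiling.
db.prepare("DELETE FROM spans WHERE started_at < ?").run(cutoffMs);
db.exec("VACUUM");
db.close();
```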
What to watch next
- Workshop 0.2 with multi-agent trace correlation.
- Hosted Raindrop dashboard for teams.
- Claude Code first-class integration (currently community adapter).
- Workshop + dreaming combo — Anthropic’s dreaming feature could feed Workshop traces back to Claude for self-improvement.
Related reading
- Anthropic Dreaming vs LangGraph Memory vs OpenAI Memory (May 2026)
- What is Anthropic Dreaming (May 2026)
- Cursor 3.4 Cloud vs Claude Code Cloud vs Codex Cloud (May 2026)
Sources: VentureBeat, github.com/raindrop-ai/workshop, Product Hunt, Reddit r/AI_Agents, startuphub.ai — May 14, 2026.