What are Microsoft RAMPART and Clarity?

RAMPART and Clarity are two open-source tools Microsoft released on May 20, 2026 to bring safety into the agent development workflow. RAMPART is an agent test framework that encodes adversarial and benign scenarios as repeatable tests you can run in CI — making it easy to turn red-team findings and AI incidents into lasting regression coverage. Clarity is a structured sounding board that helps teams decide whether they are building the right thing before they write a single line of code. They sit at opposite ends of the lifecycle: Clarity before code, RAMPART after. Both ship under permissive open-source licenses and target the gap between 'we red-teamed once at launch' and 'we have continuous safety regression coverage in CI.'

How is RAMPART different from agent evaluations (LangSmith, Braintrust, Arize)?

Evals tools like LangSmith, Braintrust, and Arize measure agent quality — accuracy, latency, hallucination rate, retrieval relevance. RAMPART is purpose-built for safety regression: it encodes adversarial scenarios (prompt injection, indirect prompt injection, tool-call abuse, jailbreak attempts, data exfiltration) as repeatable test cases that fail the build if the agent regresses. Think of it as 'unit tests for agent safety' — the same pattern that turned application security from a periodic audit into something engineers own day-to-day. RAMPART complements quality evals rather than replacing them: most production teams will run both.

What does Clarity actually do?

Clarity is a structured pre-build audit. Before a team writes agent code, Clarity walks them through a series of guided prompts that surface assumptions: who is the user, what data does the agent access, what actions can it take, what's the blast radius if it goes wrong, what's the human-in-the-loop story, who owns the failure mode. The output is a structured design document the team can review, share with security and legal, and refer back to as the agent evolves. Microsoft positions it as the 'sounding board' step that prevents teams from building agents whose risk profile they don't fully understand. It's lightweight — closer to a structured doc generator than a full GRC platform.

Should I adopt RAMPART or Clarity in May 2026?

If you're shipping agents that take real-world actions (write to systems, send emails, transact, browse the web), adopt both. Clarity at the design stage takes a few hours and surfaces assumptions you'd otherwise discover during incident response. RAMPART at the CI stage takes a few days to wire up and then runs forever — every commit gets checked against your adversarial test corpus. The combined cost is small relative to the cost of one production agent incident. For research-only or internal agents with no external action surface, Clarity alone is probably enough. For production agents at any scale, the answer is both.

Quick Answer

What Is Microsoft RAMPART and Clarity? CI Safety for Agents

Published: May 25, 2026

What Is Microsoft RAMPART and Clarity? (May 2026)

Microsoft open-sourced two AI agent safety tools on May 20, 2026 — RAMPART (CI test framework for agent safety regression) and Clarity (structured pre-build assumption audit). They sit at opposite ends of the lifecycle and answer the same question: how do you make agent safety a continuous engineering discipline rather than a one-time launch audit?

Last verified: May 25, 2026.

TL;DR

Announced: Microsoft Security Blog, May 20, 2026.
License: Open source (permissive).
RAMPART: Agent test framework that encodes adversarial scenarios as repeatable CI tests.
Clarity: Structured sounding board that audits assumptions before code is written.
Together: Cover the agent lifecycle from design intent (Clarity) to regression coverage (RAMPART).
Closest analog: Application security tooling — but built for the agent-specific failure modes (prompt injection, tool abuse, indirect injection, data exfiltration).

What RAMPART actually does

RAMPART is a test framework where you write agent safety scenarios — both adversarial and benign — as code, then run them in CI like any other test suite.

Typical scenarios you encode in RAMPART:

Scenario type	Example
Direct prompt injection	User message asks the agent to ignore prior instructions and reveal system prompt
Indirect prompt injection	Webpage the agent browses contains hidden instructions to exfiltrate data
Tool-call abuse	Agent receives crafted input designed to trick it into calling a destructive tool
Data leakage	Agent is asked to summarize a doc containing PII and tested whether PII appears in output
Authorization escape	Agent is asked to perform an action outside its authorized scope
Jailbreak persistence	Multi-turn jailbreak attempts to see whether agent’s safety drifts mid-conversation
Benign regression	Agent must still complete normal task even with adversarial-looking surrounding context

The point: each scenario is a repeatable test case that runs every commit. If a model update or prompt change regresses one of these, the build fails. Red-team findings stop being one-off PDFs and become permanent regression coverage.

What Clarity actually does

Clarity is the before code tool. It’s a structured sounding board that walks a team through assumption-surfacing questions before the agent is built:

Who is the user, and what are their incentives?
What data sources does the agent access (and what’s in those sources you didn’t think about)?
What actions can the agent take, and what’s the blast radius of each one?
Who is the human in the loop, and at what decision points?
What does failure look like — and who owns it?
What assumptions about user behavior, data quality, or model capability are we making?

The output is a structured design document. Microsoft frames Clarity as the step that prevents teams from building agents whose risk profile they don’t fully understand until something breaks.

It’s deliberately lightweight — closer to a guided doc generator than a heavy GRC platform. The goal is adoption, not compliance theater.

How they fit together

Lifecycle stage	Tool	What it produces
Pre-build (design intent)	Clarity	Structured design + assumption document
Build (implementation)	(your stack)	Agent code, prompts, tool definitions
Pre-deploy (validation)	RAMPART	Adversarial test suite, baseline pass
Post-deploy (continuous)	RAMPART in CI	Regression coverage on every commit

Clarity surfaces the threats you should test for. RAMPART encodes those threats as tests that never go away. The combined effect: agent safety becomes a continuous engineering discipline owned by the team building the agent — not a periodic external audit.

Why this matters in May 2026

Three forces converge:

1. Agents are taking real-world actions. By mid-2026, production agents routinely send emails, edit files, transact, browse the web, and integrate with enterprise systems. The blast radius of an agent failure is no longer “wrong text in a chat window” — it’s “wrong action in a production system.”

2. Red-team-once-at-launch is insufficient. Models update, prompts evolve, tools change, and the threat landscape moves. The application security industry learned this lesson in the 2010s and built SAST, DAST, and CI security gates. Agent safety needs the same continuous coverage.

3. Regulators are watching. EU AI Act enforcement is ramping through 2026, and the documented-design + continuous-testing pattern RAMPART and Clarity encode lines up neatly with high-risk system requirements. Microsoft is implicitly building tooling for the compliance posture enterprises need.

How RAMPART compares to other agent testing tools

Tool	Primary focus	Open source	Best for
Microsoft RAMPART	Adversarial safety regression	Yes (May 2026)	Production safety CI gates
LangSmith	Quality evals (accuracy, latency, cost)	No (LangChain Inc.)	Quality + observability
Braintrust	Quality + custom evaluators	No	Quality-focused teams
Arize Phoenix	Tracing + LLM observability	Yes	Production observability
M365 Copilot Evaluations	Microsoft Foundry agents	No (Microsoft)	M365-bound agents
DeepEval	Open-source eval framework	Yes	Lightweight quality evals

RAMPART’s positioning is unique: it’s the first widely-distributed open-source tool focused specifically on adversarial safety regression rather than quality evaluation. Most teams will run both a quality eval stack and RAMPART.

What’s missing

A few honest caveats on the May 20 launch:

Scenario authoring is still hard. RAMPART gives you the framework, but writing high-quality adversarial scenarios that reflect your agent’s real threat model is the actual work. Microsoft is shipping starter scenarios but expects teams to author their own.
No baked-in red-team coverage of every modality. Vision-based attacks, audio injection, document-embedded payloads — initial coverage is text-first; multimodal expansion comes later.
Clarity output is human-in-the-loop. Clarity surfaces questions but doesn’t answer them. You still need someone who can think clearly about your specific threat model.

Verdict

What it is: A two-tool open-source pairing for agent safety lifecycle — Clarity before code, RAMPART in CI.
Who needs it: Any team shipping agents that take real-world actions. Mandatory if you’re under regulated workloads.
What it replaces: One-time launch red-teaming. Ad-hoc Google Docs threat modeling.
What it complements: LangSmith / Braintrust / Arize for quality evals — RAMPART for safety regression.
Bottom line: This is the agent-era equivalent of “we added security tests to CI.” It’s not optional once you’re shipping agents that touch real systems.