ChatGPT Deep Research vs Gemini vs Perplexity: WARP Safety 2026
After Cornell Tech’s WARP attack disclosure in June 2026, AI search safety can be measured. The headline metric is the user-generated content (UGC) citation rate: the share of a tool’s citations that point to pages any user can edit. ChatGPT Deep Research sits at 0.4% and Gemini at ~12%; Perplexity and Brave Leo haven’t been independently measured. Here is the full safety comparison.
Last verified: June 20, 2026. UGC citation rates from Cornell Tech WARP paper measurements.
TL;DR
- ChatGPT Deep Research: 0.4% UGC citation rate. Lowest WARP exposure.
- Gemini Deep Research: ~12% UGC citation rate. Higher WARP exposure.
- Perplexity: Not independently measured. Anecdotally moderate-to-high UGC reliance.
- Brave Leo: Not independently measured. Brave Search weights authoritative domains.
- Open-source agents (STORM, Co-STORM, OmniThink): Highest susceptibility in the Cornell paper.
- Practical advice: For high-stakes queries, use ChatGPT Deep Research as primary and cross-check.
Direct comparison
| Dimension | ChatGPT Deep Research | Gemini Deep Research | Perplexity Pro | Brave Leo |
|---|---|---|---|---|
| UGC citation rate (measured) | 0.4% | ~12% | Not published | Not published |
| Source authority weighting | Strong (inferred from rate) | Weaker (inferred from rate) | Unknown | Strong (Brave Search) |
| Provenance UI | Yes (citations + Show sources) | Yes (citations) | Yes (citations) | Yes (citations) |
| Recommendation-class safeguards | Stricter | Looser | Loose | Loose |
| Cost | ChatGPT Plus $20/mo, Pro $200/mo | Gemini Advanced $20/mo | Perplexity Pro $20/mo | Free with Brave browser |
| Best for | High-stakes research | General research with caveats | Exploration | Privacy-focused search |
| WARP exposure (relative) | Low | Moderate | Moderate (assumed) | Low-moderate (assumed) |
Why the UGC citation rate matters
The Cornell Tech researchers’ insight: WARP works because 17-23% of pages retrieved by AI deep-research agents come from UGC sites that any user can edit. The lower an AI tool’s UGC citation rate, the smaller its attack surface.
OpenAI’s 0.4% is a striking number. It suggests ChatGPT Deep Research applies aggressive source authority weighting, with most of its citations coming from primary sources and from government, academic, and major publisher domains. A rate that low is unlikely to be an accident; it points to a deliberate engineering choice.
Gemini’s ~12% suggests Google has prioritized recall (find more sources, surface community knowledge) over precision (only surface trusted sources). This is consistent with Google’s historical search philosophy — “the open web is the source” — but it’s a real WARP exposure in 2026.
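To make the metric concrete, here is a minimal Python sketch of how a UGC citation rate could be computed from a list of cited URLs. The domain list, the function names, and the matching rule are assumptions for illustration; the Cornell Tech paper's actual classification method is not described in this article and may differ.

```python
from urllib.parse import urlparse

# Hypothetical UGC domain list, for illustration only.
# It is NOT the list used in the WARP paper.
UGC_DOMAINS = {"reddit.com", "quora.com", "wikipedia.org"}


def is_ugc(url: str, ugc_domains=UGC_DOMAINS) -> bool:
    """Return True if the URL's host is a listed UGC domain or a subdomain of one."""
    # URLs without a scheme have no parsed hostname and count as non-UGC here.
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ugc_domains)


def ugc_citation_rate(cited_urls, ugc_domains=UGC_DOMAINS) -> float:
    """Fraction of cited URLs that point at UGC domains (0.0 for an empty list)."""
    if not cited_urls:
        return 0.0
    ugc_count = sum(1 for url in cited_urls if is_ugc(url, ugc_domains))
    return ugc_count / len(cited_urls)
```

Exporting the citations from a few of your own research sessions and running them through a function like this gives a rough personal estimate for tools that have not been independently measured. The result depends heavily on which domains you count as UGC.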
ChatGPT Deep Research safety profile
- UGC citation rate: 0.4% (Cornell Tech measurement).
- Defenses inferred: Strong source authority weighting; likely UGC site rate-limiting; probably domain deduplication.
- Remaining exposure: A determined attacker who plants poison in a major publisher’s comment section or a compromised Wikipedia paragraph could still influence ChatGPT Deep Research output. The 0.4% is not zero.
- Best for: High-stakes queries where source authority matters more than community insights.
- Weakness: May miss legitimate community knowledge (Reddit threads that genuinely solve niche problems).
Gemini Deep Research safety profile
- UGC citation rate: ~12% (Cornell Tech measurement).
- Defenses inferred: Less aggressive source authority weighting; broader retrieval philosophy.
- Remaining exposure: ~12% of citations come from UGC. The WARP success rate in the Cornell paper was 38-51% on a single page and 62% in the multi-thread case. Combined with Gemini’s UGC rate, this is the higher exposure of the two commercial tools that have been measured.
- Best for: Exploration, community-knowledge queries, niche topics where Reddit/Quora have legitimate value.
- Weakness: Higher WARP exposure for recommendation-class queries.
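One rough way to see what "combined with the UGC rate" means is a back-of-envelope product of the two figures. The Python sketch below assumes, purely for illustration, that exposure scales with UGC citation rate multiplied by attack success rate, and that the same success range applies to both tools. That model and the names `rough_exposure` and `exposure_range` are assumptions, not results reported by the Cornell paper.

```python
# Figures quoted in this article.
UGC_RATES = {
    "ChatGPT Deep Research": 0.004,  # 0.4%
    "Gemini Deep Research": 0.12,    # ~12%
}
# Lowest single-page and the multi-thread success rates quoted above.
SUCCESS_RANGE = (0.38, 0.62)


def rough_exposure(ugc_rate: float, attack_success: float) -> float:
    """Illustrative model only: exposure taken as the product of the two rates."""
    return ugc_rate * attack_success


def exposure_range(ugc_rate: float) -> tuple:
    """Low and high illustrative exposure for one tool."""
    low, high = SUCCESS_RANGE
    return rough_exposure(ugc_rate, low), rough_exposure(ugc_rate, high)


for tool, rate in UGC_RATES.items():
    low, high = exposure_range(rate)
    print(f"{tool}: {low:.2%} to {high:.2%}")
```

Under this simplified model the absolute numbers matter less than the ratio: Gemini’s illustrative exposure is about 30 times that of ChatGPT Deep Research, which is just the ratio of the two UGC citation rates. Treat it as a way to reason about relative risk, not as a measurement.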
Perplexity Pro safety profile
- UGC citation rate: Not independently published.
- Anecdotal observation: Perplexity cites Reddit and Wikipedia frequently, suggesting moderate-to-high UGC reliance similar to Gemini, possibly higher in some categories.
- Defenses: Perplexity has provenance UI (visible citations), source-type indicators, and academic source surfacing in Pro mode.
- Remaining exposure: Until Perplexity publishes a UGC citation rate, assume moderate WARP exposure.
- Best for: Citations-first research where you’ll click into sources anyway.
- Weakness: No published source-authority engineering disclosures.
Brave Leo safety profile
- UGC citation rate: Not independently published.
- Underlying retrieval: Brave Search, which has historically weighted authoritative domains and de-emphasized SEO spam.
- Inferred defenses: Source-authority weighting via Brave Search; private retrieval pipeline.
- Remaining exposure: Moderate. Brave Search’s authority weighting helps but Brave Leo is newer and less tested than ChatGPT Deep Research or Gemini.
- Best for: Privacy-focused research, users already in the Brave ecosystem.
- Weakness: Smaller user base means fewer real-world attacks tested.
How to choose for different query types
High-stakes queries (medical, financial, legal, major purchases)
- Use ChatGPT Deep Research as primary.
- Manually cross-check any unfamiliar name (drug, fund, lawyer, contractor, product) against a second independent source.
- For medical queries, prefer MedlinePlus, NIH, peer-reviewed sources.
- For financial queries, prefer SEC filings, primary financial sources.
- For legal queries, prefer Cornell Legal Information Institute, primary case law.
General research and exploration
- Any major tool is fine, but treat recommendations as leads.
- Gemini and Perplexity may surface niche community knowledge that ChatGPT Deep Research filters out — sometimes useful, sometimes a WARP vector.
- For “best X for Y” queries, run the same query across two tools and look for consensus.
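The two-tool consensus check can be sketched in a few lines of Python. The function names, the input shape (a dict mapping tool name to the names it recommended), and the simple lowercase normalization are assumptions for illustration; pulling recommendation names out of real tool output is left out.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so minor formatting differences still match."""
    return " ".join(name.lower().split())


def consensus(recs_by_tool: dict, min_tools: int = 2) -> dict:
    """Split recommended names into those named by at least min_tools tools and the rest."""
    seen = {}  # normalized name -> set of tools that recommended it
    for tool, names in recs_by_tool.items():
        for name in names:
            seen.setdefault(normalize(name), set()).add(tool)
    agreed = sorted(n for n, tools in seen.items() if len(tools) >= min_tools)
    single = sorted(n for n, tools in seen.items() if len(tools) < min_tools)
    return {"consensus": agreed, "verify_first": single}


# Example with made-up product names.
result = consensus({
    "tool_a": ["Acme Vacuum", "BrandX"],
    "tool_b": ["acme  vacuum", "Other Co"],
})
print(result)
```

Names that only one tool surfaces land in `verify_first`. Agreement is a useful signal but not proof, since two tools can retrieve the same poisoned thread, so consensus names still deserve a quick check for anything high-stakes.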
Developer / coding research
- Prefer tools that surface primary documentation (vendor docs, GitHub, RFCs).
- ChatGPT Deep Research and Claude (when used for research) both tend to prioritize primary sources for technical queries.
- Cross-check version-specific information against official changelogs.
Niche community knowledge (cooking, hobbies, regional specifics)
- Gemini and Perplexity’s higher UGC rate is sometimes a feature here.
- For low-stakes queries, the WARP exposure is acceptable.
- Still cross-check business names, product recommendations.
The competitive dynamic going forward
Expect three movements in H2 2026:
- Vendors publish UGC citation rates. OpenAI’s 0.4% is already a marketing advantage. Gemini will likely respond either by reducing its rate or by adding provenance UIs that contextualize UGC citations.
- Source-authority APIs emerge. Third-party services will offer “trusted source” allowlists for RAG developers.
- Regulators get specific. EU AI Act compliance may explicitly require recommendation systems to disclose source provenance. The WARP class is now public; it’s hard for regulators to ignore.
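For RAG developers, a "trusted source" allowlist could look something like the Python sketch below. The allowlist entries are taken from the sources this article recommends for high-stakes queries, and the function names and the document shape (dicts with a `url` key) are hypothetical. No specific third-party allowlist service or API is implied.

```python
from urllib.parse import urlparse

# Hypothetical allowlist built from domains mentioned in this article.
TRUSTED_DOMAINS = ("medlineplus.gov", "nih.gov", "sec.gov", "law.cornell.edu")


def is_trusted(url: str, allowlist=TRUSTED_DOMAINS) -> bool:
    """True if the host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowlist)


def filter_retrieved(docs, allowlist=TRUSTED_DOMAINS):
    """Split retrieved docs into (kept, dropped) before they reach the model."""
    kept, dropped = [], []
    for doc in docs:
        (kept if is_trusted(doc["url"], allowlist) else dropped).append(doc)
    return kept, dropped
```

Matching on the exact host or a dot-prefixed suffix keeps lookalike hosts such as `evil-sec.gov.example.com` out. A strict allowlist trades recall for safety, the same trade-off this article describes between ChatGPT Deep Research and Gemini.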
For users in 2026, the practical answer is: pick the right tool for the query class, cross-check unfamiliar names, and don’t trust AI recommendations on high-stakes decisions without independent verification.
Sources
- Cornell Tech: Tingwei Zhang, Harold Triedman, Vitaly Shmatikov — WARP paper (June 2026)
- OpenAI Deep Research UGC citation rate measurement (0.4%) — Cornell Tech
- Gemini Deep Research UGC citation rate measurement (~12%) — Cornell Tech
- Tom’s Guide: “A 13-word Reddit comment can trick AI search into recommending scams”
- NeuralBuddies: AI News Recap, June 19, 2026
- Perplexity, Brave Leo product pages
Published June 20, 2026 by andrew.ooo. See deep dives on What is the WARP attack, How to protect AI agents from WARP, and SearchLeak vs WARP vs prompt injection.