TL;DR
Hermes Agent is Nous Research’s open-source bid to build the agent that actually learns from you — not in a metaphorical “RAG over chat history” way, but with a built-in loop that mints new skills after complex tasks, polishes them while you use them, and keeps a deepening user model across sessions. It runs on a $5 VPS, talks to you over Telegram / Discord / Slack / WhatsApp / Signal, and you switch LLMs with one command — no code changes, no lock-in.
The repo went 0 → 126K GitHub stars in nine months — 126K stars, 18.8K forks, and 7.3K open issues as of April 30, 2026. It’s MIT-licensed, written in Python, and one of the most-watched releases of the year alongside OpenClaw.
The 30-second sales pitch:
- A real terminal UI — multiline editing, slash-command autocomplete, conversation history, interrupt-and-redirect, streaming tool output
- Lives where you do — Telegram, Discord, Slack, WhatsApp, Signal, Email, CLI from a single gateway process
- Closed learning loop — agent-curated memory, autonomous skill creation, skill self-improvement during use, FTS5 cross-session search, Honcho dialectic user modeling
- Bring any model — Nous Portal, OpenRouter (200+ models), NVIDIA NIM, Xiaomi MiMo, z.ai/GLM, Kimi, MiniMax, Hugging Face, OpenAI, or your own endpoint via hermes model
- Six terminal backends — local, Docker, SSH, Daytona, Singularity, Modal (Daytona/Modal hibernate when idle)
- Built-in cron — natural-language scheduled tasks delivered to any platform
- Compatible with agentskills.io — the open skill standard the broader ecosystem is converging on
- Research-ready — batch trajectory generation, Atropos RL environments, trajectory compression for training the next generation of tool-calling models
If you’ve used OpenClaw and felt it was 90% there but missing the learning piece, this fills the gap. It even ships hermes claw migrate to import your soul, memories, skills, and API keys.
Why “Self-Improving” Actually Means Something Here
Most projects branded “self-improving” mean one of two things: a vector store of past chats (RAG with extra branding) or reflection prompts appended to the next system message. Hermes does both, but the part that earns the label is the skill autogeneration loop:
- After a complex task succeeds, the agent is nudged to write a new skill — a small Markdown file with procedural knowledge.
- Skills live in ~/.hermes/skills/. Plain text. Read, edit, version-control, share.
- Mid-task, the agent gets nudges to revise existing skills when it discovers a better approach.
- Skills auto-load as slash commands (/skill-name) and surface in the picker when their description matches the current task.
An optional companion repo, hermes-agent-self-evolution, uses DSPy + GEPA (ICLR 2026 Oral) to read execution traces, figure out why skills failed, and propose targeted improvements.
This is closer in spirit to Voyager (the Minecraft agent that wrote its own skill library) than to a typical LangChain pipeline. Third-party benchmarks claim self-created skills cut research-task time by ~40% versus a fresh agent instance — a number I’d treat with healthy skepticism but which matches anecdotal reports on r/LocalLLaMA.
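The persistence half of this loop is deliberately simple. As an illustration (a sketch of the idea, not Hermes’s actual code; the helper name is hypothetical, and the frontmatter layout mirrors the skill format shown below), minting a skill amounts to writing a small Markdown file:

```python
from pathlib import Path

def save_skill(name: str, description: str, steps: list[str],
               skills_root: Path = Path.home() / ".hermes" / "skills") -> Path:
    """Write a newly minted skill as plain Markdown with YAML frontmatter."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    text = (
        f"---\nname: {name}\ndescription: {description}\n---\n"
        f"# {name}\n{numbered}\n"
    )
    skill_dir = skills_root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(text, encoding="utf-8")
    return path
```

Because the artifact is just text on disk, everything downstream (version control, sharing, the agent editing its own skills) comes for free.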
What’s Actually in the Box
Hermes is a single Python package that does a lot. The headline pieces:
1. The CLI
hermes # Interactive CLI
hermes model # Pick your LLM provider and model
hermes gateway # Start the messaging gateway
hermes setup # Full setup wizard
hermes claw migrate # Migrate from OpenClaw
hermes doctor # Diagnose issues
The TUI is a serious piece of work. Multiline editing with proper paste handling, slash-command autocomplete, scrollback history, Ctrl+C to interrupt the model mid-stream and redirect the turn (not kill it), and streaming tool output so you watch bash stdout in real time.
2. The Messaging Gateway
This is where Hermes diverges from every other CLI agent. One command:
hermes gateway setup # interactive: pick platforms, paste tokens
hermes gateway start # daemon — runs Telegram, Discord, Slack, etc. concurrently
Now you can text your agent from your phone. It speaks Telegram, Discord, Slack, WhatsApp, Signal, and Email through one process, with cross-platform conversation continuity — start in Telegram, continue in Slack. Voice memos are auto-transcribed (Whisper or your provider of choice). In practice I can be on a train, send “remind me to ship that PR before 3pm and check the prod error rate at 2:30,” and the agent does both.
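Cross-platform continuity falls out of one design choice: conversation state is keyed by user, not by (user, platform). A toy sketch of that idea (hypothetical, not the gateway’s actual data model):

```python
# One history per user; the platform is just metadata on each message.
sessions: dict[str, list[dict]] = {}

def record(user_id: str, platform: str, text: str) -> list[dict]:
    """Append a message to the user's single cross-platform history."""
    history = sessions.setdefault(user_id, [])
    history.append({"platform": platform, "text": text})
    return history
```

Start a thread from Telegram, reply from Slack, and the agent sees one continuous conversation.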
3. The Skills System
A skill is a Markdown file with optional Python or shell hooks. Roughly:
---
name: deploy-andrew-ooo
description: Build the Astro site, rsync to Hetzner, run the post-deploy health check.
---
# Deploy andrew.ooo
1. Run `npm run build` and verify exit code 0.
2. `rsync -az ./dist/ [email protected]:/var/www/andrew-ooo/`
3. `node scripts/health-check.js` — fail if any 5xx.
4. If anything fails, open an issue in the GitHub repo with the log tail.
The agent loads this when a task matches the skill’s description field, follows the steps, and — critically — can edit the skill itself if it learns a new wrinkle (e.g., “the deploy occasionally fails because the build cache is stale; clear it first”). Skills are interoperable with the agentskills.io open standard, which means anything you write here can run in OpenClaw, NemoClaw, and a growing list of other compatible runtimes.
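How does a skill “surface in the picker” when its description matches? One plausible mechanism (a guess at the shape, not Hermes’s implementation) is a cheap lexical overlap between the task text and each skill’s description:

```python
def match_skills(task: str, skills: dict[str, str], threshold: float = 0.3) -> list[str]:
    """Rank skills whose description shares enough words with the task."""
    task_words = set(task.lower().split())
    scored = []
    for name, description in skills.items():
        desc_words = set(description.lower().split())
        overlap = len(task_words & desc_words) / max(len(desc_words), 1)
        if overlap >= threshold:
            scored.append((overlap, name))
    return [name for _, name in sorted(scored, reverse=True)]
```

In practice an agent would likely use embeddings rather than raw word overlap, but the contract is the same: task in, ranked candidate skills out.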
4. Sub-agents and RPC Tools
Hermes can spawn isolated sub-agents with their own context. The clever bit is the RPC layer: write a Python script that calls the parent agent’s tools (bash, web fetch, file I/O), and the script runs as a single agent turn — collapsing a 12-step pipeline into zero context cost.
# scripts/triage_inbox.py — runs as one tool call
from hermes.rpc import tool

emails = tool.gmail.search("is:unread newer_than:1d")
for e in emails:
    if "invoice" in e.subject.lower():
        tool.bash.run(f"echo {e.id} >> /tmp/invoices.txt")
    elif "production alert" in e.subject.lower():
        tool.telegram.send(f"🚨 {e.subject}")
print(f"Triaged {len(emails)} emails")
5. Six Terminal Backends
You can run the actual tool execution in:
- Local — your machine (fast, dangerous).
- Docker — sandboxed container, default for non-trusted tasks.
- SSH — execute on a remote box.
- Daytona — serverless dev environments that hibernate when idle and wake on demand. Costs nearly nothing between sessions.
- Singularity — for HPC clusters and academic environments.
- Modal — serverless Python containers, also hibernating.
The Daytona/Modal backends are the killer feature for hobbyist cost-control. Your agent sleeps for free, wakes when you message it, runs for 8 seconds, sleeps again. I’ve been running Hermes on Modal for two weeks at ~$1.40 in compute for ~120 conversations.
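The economics are easy to sanity-check: with a hibernating backend you pay only for active seconds. The per-second rate below is an illustrative placeholder, not Modal’s or Daytona’s actual pricing:

```python
def serverless_cost(conversations: int, active_seconds_each: float,
                    usd_per_second: float) -> float:
    """Compute spend when the sandbox bills only while awake."""
    return conversations * active_seconds_each * usd_per_second

# ~120 conversations x ~8 active seconds each, at a made-up rate
estimate = serverless_cost(120, 8, 0.0015)
```

At these placeholder numbers that is about $1.44 for the month; an always-on container would bill all ~2.6M seconds of the month instead.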
Quick Install
The official one-liner works on Linux, macOS, WSL2, and Termux:
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.zshrc # or ~/.bashrc
hermes # start chatting
The installer handles uv setup, virtualenv creation, the .[all] extras (or .[termux] on Android), and symlinks the hermes binary into ~/.local/bin.
If you’d rather see what it’s doing first, the manual path is four lines (the extras need quoting in zsh, and the venv must be activated before uv pip will target it):
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv venv --python 3.11
source venv/bin/activate
uv pip install "hermes-agent[all]"
Native Windows is not supported — use WSL2.
A Real Conversation
A real session from earlier this week, lightly trimmed:
$ hermes
Hermes Agent ☤ — connected to openrouter:anthropic/claude-sonnet-5
> read the latest 5 posts in src/content/posts/, summarize topics, and
draft a Reddit comment that links 3 of them without being spammy.
[tool: bash] ls -t src/content/posts/ | head -5
[tool: read x5] reading 5 files...
Topics: computer-use agents, autonomous ML engineering, code-first agent libs.
All three connect through one thread: agents that *do* things instead of
just chatting. Here's a Reddit-friendly comment...
[💡 nudge] Want me to save this triage pattern as a skill?
> yes
[skill saved] ~/.hermes/skills/blog-comment-triage/SKILL.md
That last line is the skill loop in action. Next time I ask for something similar, the skill auto-loads and I skip the planning step.
Honest Limitations
After three weeks of daily driving, here are the rough edges:
- 7,300+ open issues. Mostly feature requests and minor bugs, but the volume tells you both how active the project is and how many things still don’t quite work.
- Skill quality is hit-or-miss. Overly specific skills (“deploy at 9:32am Tuesdays”) sometimes need manual cleanup.
- Memory bloat. After ~200 conversations my SQLite store was 280MB. There’s /compress, but no automatic compaction yet.
- Provider rate limits bite. Switching from openai/gpt-5 to a free OpenRouter model mid-conversation can silently truncate context. Run /usage first.
- Native Windows still not supported. WSL2 works.
- Voice mode is half-baked. Whisper transcription works; outbound TTS is flaky on Telegram voice notes.
- The gateway needs a long-lived host. Hibernation helps the terminal backend, but the gateway itself must stay online. A $5 VPS does it.
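On the memory-bloat point: until automatic compaction lands, a SQLite store can be shrunk by hand with VACUUM. A generic sketch (the database path varies by install, so it’s a parameter here; back up the file first):

```python
import sqlite3

def compact(db_path: str) -> None:
    """VACUUM rebuilds the database file, reclaiming pages freed by deletes."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
```

This only reclaims space from already-deleted rows; it doesn’t summarize or prune memories the way /compress does.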
None are dealbreakers, but expect “promising tool with a fast-moving roadmap” rather than “polished product.”
How It Compares
| Project | Stars | Style | Killer Feature | Where Hermes Wins |
|---|---|---|---|---|
| Hermes Agent | 126K | Conversational + Messaging | Self-improving skills, multi-platform gateway | This whole stack in one package |
| OpenClaw | 188K | Conversational | Sandboxed coding agent, MCP-first | OpenClaw is more polished; Hermes adds the learning loop |
| smolagents | 24K | Code-as-action | ~1000-line agent loop, library-first | Hermes is an end-user agent; smolagents is a building block |
| Claude Code | (Anthropic) | Coding | Best-in-class coding | Hermes is multi-purpose, model-agnostic |
| AutoGen | 38K | Multi-agent framework | Group chat patterns | Hermes is for end users, not framework authors |
Hermes is closest in shape to OpenClaw — both are “personal agent that lives on your phone via Telegram and on your laptop via CLI.” OpenClaw is more polished and has a larger plugin ecosystem; Hermes adds the autonomous skill-creation loop and the serverless backends. They are interoperable — OpenClaw skills run in Hermes via the agentskills.io standard, and hermes claw migrate will pull in your existing OpenClaw setup wholesale.
Community Reactions
- One of the fastest 0-to-100K star runs of 2026, ranking just behind OpenClaw on weekly trending lists.
- An r/SideProject post showed someone porting the entire Hermes runtime into a native WinUI 3 desktop app — same tool signatures, permission model, and skills format. That kind of fork-and-build response only happens when the API is well-designed.
- Independent reviews (TokenMix Blog, Medium) consistently flag the 40% task-time reduction from self-created skills as the most reproducible benefit — biggest on repetitive tasks (research, triage, deploys), smallest on novel one-shots.
- Most common Discord complaint matches mine: skill quality is uneven and the optimizer (hermes-agent-self-evolution) should be in-tree.
Where I’d Reach For It
- A personal ops bot. Cron summaries, automated triage, “ping me when X happens.” The Telegram gateway makes this real — you talk to it from your phone instead of forgetting it exists.
- A research assistant that compounds. It gets better at your workflow over weeks because the skills accrue.
- A team-shared deployment agent. Run it on a VPS, give your team Slack access. The skill files become a shared playbook.
- An OpenClaw replacement if you specifically want the learning loop.
I wouldn’t pick Hermes if you need a coding agent for hands-off PRs (Claude Code and OpenClaw are still ahead), a GUI/computer-use agent (trycua/cua owns that lane), or a multi-agent framework to build on (use AutoGen or smolagents).
FAQ
Is Hermes Agent free to use?
The Hermes runtime is MIT-licensed and free. You pay for LLM inference — but Hermes works with free and self-hosted models (Ollama, LocalAI, Hugging Face Inference, Xiaomi MiMo’s free tier), so the whole stack can run at $0/month with a GPU or VPS. Most users land on OpenRouter at $5–10/month.
Does it work without an internet connection?
Partially. The runtime, skills, terminal backends (local/Docker), and most tools work offline. You need internet for: cloud LLMs, web fetches, the messaging gateway, and any cloud-based terminal backends (Daytona, Modal, SSH). Pair it with Ollama and a local model and you can run the whole agent loop on a plane.
How is this different from OpenClaw?
OpenClaw is the more mature personal-agent project — bigger plugin ecosystem, more polished CLI, larger user base. Hermes Agent’s differentiation is the autonomous skill-creation loop and serverless terminal backends. They share the agentskills.io standard, and hermes claw migrate lets you import your OpenClaw setup, so you can run both side-by-side. If you want stability, stick with OpenClaw; if you want the learning experiments, run Hermes.
Can I use it with local models like Ollama?
Yes. hermes model lists OpenAI, Anthropic, OpenRouter, Ollama, Hugging Face Inference, NVIDIA NIM (Nemotron), Xiaomi MiMo, z.ai/GLM, Kimi/Moonshot, MiniMax, and “custom endpoint” — point it at any OpenAI-compatible URL. With a 32B local model on a 24GB GPU, the skills loop works fine; with a 7B model, expect the skill quality to drop noticeably.
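“Any OpenAI-compatible URL” just means the endpoint accepts a standard chat-completions request. Ollama, for example, serves one at http://localhost:11434/v1 by default. The sketch below only builds the request so you can see the shape (the model tag is an example; no actual network call is made):

```python
import json

BASE_URL = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible API

def chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat-completions call."""
    body = json.dumps({
        "model": model,  # e.g. a local 32B model tag
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return f"{BASE_URL}/chat/completions", body
```

Point hermes model’s custom-endpoint option at the same base URL and it speaks this dialect.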
How safe is letting it run bash commands?
By default Hermes uses an approval prompt for any command not on the allowlist, and the Docker terminal backend is the recommended sandbox for unfamiliar tasks. There’s also a permission system (hermes config set permissions.bash) where you can scope command patterns. Don’t run Hermes with --yolo-style auto-approval against your home directory unless the terminal backend is a Docker container or a hibernating Modal sandbox. Treat it like any other LLM-driven shell tool.
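To make the approval flow concrete, here is a toy version of a command allowlist using shell-style glob patterns; this is a guess at the shape, not Hermes’s actual permission code:

```python
from fnmatch import fnmatch

ALLOWLIST = ["git status", "git diff*", "ls *", "cat *"]  # illustrative patterns

def needs_approval(command: str, allowlist: list[str] = ALLOWLIST) -> bool:
    """True when the command matches no allowlisted pattern and must be confirmed."""
    return not any(fnmatch(command, pattern) for pattern in allowlist)
```

Under this scheme `git diff HEAD~1` sails through while `rm -rf /` stops and waits for a human.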
Is there a hosted version?
Not officially. Nous Research operates Nous Portal for inference, but the agent runtime itself is BYO-host. The closest thing to “hosted Hermes” is running it on Modal — that’s effectively a managed serverless deployment with hibernate-on-idle pricing.
What’s the migration path from OpenClaw?
hermes claw migrate is built-in. It auto-detects ~/.openclaw/ and imports your SOUL.md, memories (MEMORY.md, USER.md), skills (into ~/.hermes/skills/openclaw-imports/), command allowlist, messaging settings, allowlisted API keys, TTS assets, and workspace AGENTS.md. Run with --dry-run first to preview, then --overwrite if you want to replace existing files. There’s an interactive openclaw-migration skill that walks through it conversationally.
Bottom Line
Hermes Agent is the most interesting personal-agent release of 2026 — not because it does something nobody else does, but because it bundles the learning loop, the messaging gateway, and the serverless backends in one package that genuinely works. It is rough at the edges: skill quality varies, 7,300 issues are open, and the docs have gaps.
But it’s the first time I’ve actually seen “self-improving agent” produce a measurable, reproducible benefit on real workflows, and it costs nothing to try. If you’re an OpenClaw user, give the migration script ten minutes. If you’ve been waiting for a personal AI you can text from your phone that gets better at your particular weirdness over time, start here.
Repo: github.com/NousResearch/hermes-agent
Docs: hermes-agent.nousresearch.com/docs
License: MIT