Aeon Review: Autonomous AI Agent on GitHub Actions

TL;DR

aaronjmars/aeon is a new open-source agent framework with a simple bet: most of the AI work you actually want done — morning briefs, PR reviews, market monitoring, security scans, research digests — doesn’t need you in the loop. It just needs to happen. Aeon launched as a Show HN on May 15, 2026 and uses GitHub Actions as its runtime, so there’s nothing to host, no daemons, no Docker, no VPS.

The headline:

117 pre-built skills across research, dev tooling, crypto/markets, social, productivity, and meta-agent self-management
Runs on GitHub Actions cron — fork the repo, add an API key, toggle skills in aeon.yml, push
Self-healing loop: every skill output is scored 1–5 by Claude Haiku; 3 consecutive failures auto-fire skill-repair
Persistent memory in memory/*.json and memory/MEMORY.md, committed back to the repo on every run
Reactive triggers in addition to cron — skills can fire on conditions
Spawn fleets of specialized instances via spawn-instance and fleet-control
Works with Claude Pro/Max OAuth (included in plan) or ANTHROPIC_API_KEY (pay per token) — Bankr LLM gateway cuts Opus ~67%

The pitch is “configure once, forget forever.” After two days running it on a fork, that’s mostly accurate — with caveats about token spend and notification volume that we’ll get to.

What Aeon actually is

Most agent frameworks today — Claude Code, Codex, Cursor, OpenClaw, Hermes — are interactive tools. You open a TUI or IDE, type a task, approve tool calls, review diffs. Aeon flips that: it’s an unattended scheduler for the class of tasks (morning briefs, market monitoring, PR reviews, research digests, security scans) where you just want the work done while you’re not there.

The architecture is delightfully boring:

You fork the repo
  ↓
GitHub Actions runs messages.yml every 5 minutes
  ↓
Scheduler checks aeon.yml for skills due to run
  ↓
For each matching skill:
    - Spin up Claude Code in a runner
    - Read SKILL.md prompt
    - Execute (with internet, git, gh CLI, MCP servers)
    - Score output 1–5 via Haiku
    - Commit memory + outputs back to repo
    - Notify Telegram/Discord/Slack if relevant

That’s it. There’s no Aeon server, no SaaS dashboard, no “agent cloud.” The dashboard is a local Next.js app you run with ./aeon to edit YAML and push to GitHub. All compute happens in GitHub Actions runners, which means public forks get unlimited free minutes.
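
The “scheduler checks aeon.yml for skills due to run” step in the flow above can be pictured roughly as follows. This is a minimal sketch, not Aeon’s actual scheduler: the per-skill config shape (`enabled`, `schedule`) and all function names are assumptions made for illustration, and the cron matcher only understands `*`, `*/n`, single values, and comma lists.

```python
from datetime import datetime, timezone


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', '*/n', 'a', and 'a,b,c' only."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Step values: exact for minute/hour fields, which start at 0.
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def is_due(cron: str, now: datetime) -> bool:
    """True if `now` (to the minute) matches a 5-field cron string."""
    minute, hour, day, month, weekday = cron.split()
    cron_weekday = (now.weekday() + 1) % 7  # cron counts Sunday as 0
    # Simplification: all five fields must match (real cron ORs the day and
    # weekday fields when both are restricted).
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day, now.day)
        and field_matches(month, now.month)
        and field_matches(weekday, cron_weekday)
    )


def due_skills(config: dict, now: datetime) -> list:
    """Names of enabled skills whose schedule matches `now`."""
    return [
        name
        for name, skill in config.items()
        if skill.get("enabled") and is_due(skill["schedule"], now)
    ]


# Hypothetical config; the real aeon.yml schema may differ.
config = {
    "morning-brief": {"enabled": True, "schedule": "0 8 * * *"},
    "pr-review": {"enabled": True, "schedule": "0 */6 * * *"},
    "deep-research": {"enabled": False, "schedule": "0 8 * * *"},
}
print(due_skills(config, datetime(2026, 5, 16, 8, 0, tzinfo=timezone.utc)))
```

A scheduler driven by a 5-minute tick would presumably also need to tolerate late ticks, since GitHub can delay scheduled workflows and a check against only the current minute would then skip a job.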

The 117 skills, grouped

This is where Aeon earns the “framework” label. Each skill is a SKILL.md prompt file in skills/<name>/, independently schedulable, chainable, and (importantly) disabled by default. Categories:

| Category | Count | Example skills |
| --- | --- | --- |
| Research & Content | 19 | `deep-research`, `hacker-news-digest`, `paper-digest`, `reddit-digest`, `huggingface-trending`, `rss-digest`, `technical-explainer` |
| Dev & Code | 32 | `pr-review`, `code-health`, `github-trending`, `issue-triage`, `vuln-scanner`, `workflow-security-audit`, `auto-merge`, `repo-pulse` |
| Crypto & Markets | 19 | `defi-monitor`, `monitor-polymarket`, `monitor-kalshi`, `token-alert`, `narrative-tracker`, `unlock-monitor`, `price-threshold-alert` |
| Social & Writing | 12 | `write-tweet`, `thread-formatter`, `farcaster-digest`, `show-hn-draft`, `syndicate-article`, `tweet-roundup` |
| Productivity | 14 | `morning-brief`, `evening-recap`, `daily-routine`, `goal-tracker`, `weekly-review`, `idea-capture`, `startup-idea` |
| Meta / Agent | 21 | `heartbeat`, `skill-health`, `skill-repair`, `self-improve`, `skill-evals`, `cost-report`, `skill-leaderboard` |

The crypto/markets category is heavier than I expected — there’s a clear bias toward “monitor things and tell me when they move.” But the dev tooling section (PR review, vuln scan, workflow security audit, GitHub trending) is genuinely useful for any maintainer, and the productivity skills are model-agnostic.

The meta skills are the interesting bit. skill-health audits quality scores, skill-evals runs assertion-based tests against skill outputs to catch regressions, skill-repair diagnoses and patches failing skills automatically, and self-improve evolves prompts and configs based on past performance. This is the closest thing I’ve seen in open-source to an agent that maintains itself.

Setup: from fork to first run

The flow is genuinely tight. Here’s what I did:

# 1. Fork on GitHub, then clone your fork and start the local dashboard
#    (the dashboard's Push button needs a repo you can write to)
git clone https://github.com/<you>/aeon
cd aeon && ./aeon
# Opens http://localhost:5555

In the dashboard:

Authenticate — paste an ANTHROPIC_API_KEY or run claude setup-token to get a 1-year OAuth token from your Claude Pro/Max subscription
Connect a channel — I added Telegram with TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID
Pick skills — toggled hacker-news-digest, github-trending, morning-brief, pr-review, plus the default heartbeat
Set schedules — daily 8am UTC for content, every 6h for pr-review
Push — one button commits aeon.yml and triggers git push

Then on the command line:

./onboard --remote
# Validates secrets, workflows, memory dir, and notification channel
# Posts a checklist to Telegram when it's done

./onboard is a small touch but it matters — it catches the half-configured forks where you forgot to add a secret, and it does the check inside Actions, so it’s verifying the same environment your skills will run in.

Within ~10 minutes of pushing, heartbeat fired and Telegram got its first message: HEARTBEAT_OK. The next morning at 08:00 UTC, the morning-brief skill landed in my chat, pulling from the prior 24h of hacker-news-digest and github-trending outputs because I’d chained them.

The skill chain feature

This is where Aeon gets interesting beyond “scheduled prompts.” Skills can be chained so their outputs flow into downstream skills. The relevant chunk of aeon.yml:

chains:
  morning-pipeline:
    schedule: "0 7 * * *"
    on_error: fail-fast    # or: continue
    steps:
      - parallel: [token-movers, hacker-news-digest]   # run concurrently
      - skill: morning-brief
        consume: [token-movers, hacker-news-digest]    # outputs injected

Behind the scenes, each step is a separate workflow dispatch. Outputs land in .outputs/{skill}.md and downstream steps with consume: get them injected into Claude’s context. on_error: fail-fast aborts on any failure; continue keeps going.
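
To make the chain semantics concrete, here is a minimal in-process sketch of how `parallel`, `consume`, and `on_error` could interact. It is not how Aeon does it: the article says each step is a separate workflow dispatch, whereas this sketch swaps in a thread pool, and `run_skill` is a hypothetical callable standing in for “run this skill and return its markdown output.” Treating `fail-fast` as the default when `on_error` is missing is also an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# Hypothetical runner: (skill name, consumed outputs) -> markdown; raises on failure.
SkillRunner = Callable[[str, dict], str]


def run_chain(chain: dict, run_skill: SkillRunner) -> dict:
    """Run chain steps in order and return {skill name: output}."""
    outputs: dict = {}
    fail_fast = chain.get("on_error", "fail-fast") == "fail-fast"

    def attempt(name: str, context: dict) -> Optional[str]:
        try:
            return run_skill(name, context)
        except Exception:
            if fail_fast:
                raise  # abort the whole chain
            return None  # "continue": record nothing and move on

    for step in chain["steps"]:
        if "parallel" in step:
            names = list(step["parallel"])
            # Skills in one parallel step run concurrently; a failure surfaces
            # once the step finishes, so later steps never start.
            with ThreadPoolExecutor() as pool:
                results = list(pool.map(lambda n: attempt(n, {}), names))
        else:
            names = [step["skill"]]
            wanted = step.get("consume", [])
            # Only outputs that actually exist are injected downstream.
            context = {n: outputs[n] for n in wanted if n in outputs}
            results = [attempt(names[0], context)]
        for name, result in zip(names, results):
            if result is not None:
                outputs[name] = result
    return outputs


# Same shape as the morning-pipeline chain shown above.
chain = {
    "on_error": "continue",
    "steps": [
        {"parallel": ["token-movers", "hacker-news-digest"]},
        {"skill": "morning-brief",
         "consume": ["token-movers", "hacker-news-digest"]},
    ],
}


def fake_runner(name: str, context: dict) -> str:
    """Stand-in runner: token-movers always fails, everything else succeeds."""
    if name == "token-movers":
        raise RuntimeError("upstream API down")
    return f"{name} saw {sorted(context)}"


print(run_chain(chain, fake_runner))
```

With `on_error: continue`, morning-brief still runs and simply receives one input instead of two. Switching the chain to `fail-fast` makes `run_chain` raise before morning-brief starts.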

This isn’t novel architecturally — n8n, LangGraph, Temporal all do something similar — but the implementation is shockingly simple. There’s no orchestrator service. Just GitHub Actions workflows dispatching each other and writing markdown files to the repo. If a chain breaks, you git log to debug it.

The self-healing loop

This is the feature I expected to be marketing fluff. It mostly isn’t.

Every skill output is scored 1–5 by Claude Haiku after each run. The rubric is in the repo — failed/empty outputs → 1, excellent → 5. Scores accumulate in memory/skill-health/<skill>.json with a rolling 30-run history, plus flags like api_error, stale_data, rate_limited.
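
The rolling history is easy to picture as a capped list on disk. The sketch below is an illustration only: the JSON layout (`runs`, `score`, `flags`) and the `record_run` helper are assumptions, since the article gives the file location and the 30-run window but not the schema.

```python
import json
from pathlib import Path
from typing import Optional

MAX_RUNS = 30  # rolling window size described above


def record_run(path: Path, score: int, flags: Optional[list] = None) -> dict:
    """Append one scored run to a skill's health file, keeping the last 30."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be 1-5, got {score}")
    # Start a fresh history if the skill has never been scored before.
    health = json.loads(path.read_text()) if path.exists() else {"runs": []}
    health["runs"].append({"score": score, "flags": flags or []})
    health["runs"] = health["runs"][-MAX_RUNS:]  # drop runs older than the window
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(health, indent=2))
    return health
```

Calling `record_run(Path("memory/skill-health/hacker-news-digest.json"), 1, ["api_error"])` would log a failed run with its flag. Because the file lives in the repo, every change to it shows up in git history alongside the skill’s outputs.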

When a skill fails 3 times in a row, a “reactive trigger” auto-fires skill-repair:

reactive:
  skill-repair:
    trigger:
      - { on: "*", when: "consecutive_failures >= 3" }

skill-repair reads the skill’s SKILL.md, the failure logs, the rolling score history, then edits the SKILL.md and commits a fix. I deliberately broke hacker-news-digest by giving it a malformed RSS URL — after 3 failures it patched the URL back to https://hnrss.org/frontpage and the next run succeeded. The patch commit is in the repo’s git history with a [skill-repair] prefix.
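
The `consecutive_failures >= 3` condition boils down to counting the failure streak at the end of the score history. Here is a minimal sketch under one loud assumption: a run counts as a failure when its score is 1 (the rubric’s “failed/empty” bucket). The article does not say exactly how Aeon defines a failure, and the function names are hypothetical.

```python
def consecutive_failures(scores: list, fail_at_or_below: int = 1) -> int:
    """Length of the failure streak at the end of a score history."""
    streak = 0
    for score in reversed(scores):  # walk back from the most recent run
        if score > fail_at_or_below:
            break  # a passing run ends the streak
        streak += 1
    return streak


def should_fire_repair(scores: list, threshold: int = 3) -> bool:
    """Mirror of the reactive condition `consecutive_failures >= 3`."""
    return consecutive_failures(scores) >= threshold


history = [4, 5, 1, 1, 1]  # oldest first; the last three runs failed
print(consecutive_failures(history), should_fire_repair(history))
```

Counting only the trailing streak means a single good run resets the trigger, which matches the “3 times in a row” wording.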

This is genuinely the most interesting thing Aeon does. It’s not perfect — skill-repair can’t fix logic bugs in its own scoring or runaway infinite loops, and it consumes Opus tokens to do the patching — but for transient API breakages and rate-limit flaps, it really does keep skills alive without intervention.

How Aeon compares to other agent frameworks

The README pits Aeon against Claude Code, Hermes, and OpenClaw on unattended scheduling, self-healing, output quality monitoring, persistent memory, and reactive triggers. That’s fair but apples-to-oranges — Claude Code and OpenClaw are interactive coding agents built around approval loops, not schedulers. The right comparisons:

vs. n8n: Aeon is more agentic (Claude reasons over each task) but less drag-and-drop. n8n wins for non-LLM workflows; Aeon wins when every step benefits from an LLM in the loop.
vs. LangGraph / CrewAI: Aeon doesn’t have a typed state-machine model. It’s prompts + cron + markdown files. Simpler to start, less safety net for complex multi-agent flows.
vs. self-hosted cron + scripts: Aeon adds quality scoring, self-repair, fleet management, and a dashboard. If you’ve ever cobbled together GitHub Actions + a shell script + Anthropic SDK to get a daily digest, Aeon is the polished version of that idea.

Community reaction

The Show HN thread went live May 15, 2026 with a mixed-but-mostly-positive response. Patterns from comments and dev.to coverage:

The wins. GitHub-Actions-as-runtime got the most love — “no infra to maintain” is real, since most agent frameworks die in production because nobody wants to babysit another Python process on a VPS. The 117-skill catalog also drew praise as a working-examples library.

The pushback. Biggest critique: token spend. With Opus 4.7 as default and Haiku for scoring, a moderately active fork (20 skills, mostly daily) burns $5–15/day, dropping to $2–5/day via the Bankr gateway. That is fine for power users and prohibitive if you wanted a “$10/month scheduled assistant.” Second concern: Actions minutes on private repos. Public repos get unlimited free minutes, but most people want a private copy because memory/ contains personal notes, which puts them on the standard Actions quota. (GitHub does not let you make a fork of a public repo private, so in practice this means a duplicated or imported private repository rather than a literal fork.) Third: a few security folks flagged that ./add-skill can pull SKILL.md files from arbitrary GitHub repos. There’s a skill-security-scan meta-skill and ./add-skill runs a check, but the usual rule applies: read every SKILL.md before enabling it.
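
The token-spend figures are easier to judge as monthly numbers. The arithmetic below is a back-of-the-envelope sketch that assumes a 30-day month and that the gateway’s ~67% cut applies to the whole bill; the quoted cut is for Opus specifically, so treat the gateway figures as a best case. The function names are made up for this example.

```python
def monthly_cost(daily_usd: float, days: int = 30) -> float:
    """Scale a daily spend to a month (assumes a flat 30-day month)."""
    return daily_usd * days


def after_gateway(daily_usd: float, cut: float = 0.67) -> float:
    """Daily spend after a proportional discount (assumed to cover all tokens)."""
    return daily_usd * (1 - cut)


# Low and high ends of the $5-15/day range reported for a 20-skill fork.
for label, daily in (("low", 5.0), ("high", 15.0)):
    routed = after_gateway(daily)
    print(
        f"{label}: ${daily:.2f}/day is ${monthly_cost(daily):.2f}/month direct, "
        f"${routed:.2f}/day or ${monthly_cost(routed):.2f}/month via gateway"
    )
```

Direct billing works out to $150–450/month. The gateway’s low end lands just under $50/month, which lines up with the budget floor given under “Bad fit” later in this review.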

Honest limitations

After two days of actual use:

It’s loud by default. Every skill wants to notify you. Quieting the channel down to high-signal-only takes deliberate effort — there’s no global “only ping on quality < 3 or change-in-state” toggle yet.
Cold starts are slow. Every run is a fresh Actions checkout + Claude Code install, so runs typically take 2–10 minutes. Fine for daily digests, useless for “tell me about a price spike right now.”
Reactive triggers fire on the next 5-minute tick — not real-time by any stretch.
The dashboard is local-only. Editing aeon.yml from your phone means editing YAML in the GitHub mobile app.
No first-class non-Anthropic support yet. Bankr gateway adds GPT/Gemini/Kimi/Qwen, but core skills assume Claude’s tool-use conventions and degrade with swaps.
Memory grows unbounded. memory/ accumulates forever. A weekly-review skill summarizes it, but there is no automatic pruning.
Skill quality varies. Headline skills (morning-brief, hacker-news-digest, pr-review, deep-research) are polished. Some crypto/markets deep cuts are less battle-tested and score 2/5 out of the box.
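
On the first limitation in the list above: the missing “only ping on quality < 3 or change-in-state” toggle is a small piece of logic. The sketch below is hypothetical and not part of Aeon. The `should_notify` name, its parameters, and the idea of each skill reporting a short state string are all assumptions.

```python
from typing import Optional


def should_notify(
    score: int,
    state: str,
    previous_state: Optional[str],
    min_quiet_score: int = 3,
) -> bool:
    """Ping only on a low-quality run or when the reported state changed."""
    low_quality = score < min_quiet_score
    # With no previous state (first run) this counts as a change, so the
    # first result always gets through.
    changed = state != previous_state
    return low_quality or changed


print(should_notify(5, "ok", "ok"))     # healthy and unchanged
print(should_notify(2, "ok", "ok"))     # low quality score
print(should_notify(5, "alert", "ok"))  # state changed
```

A gate like this would sit just before the notify step, with the previous state read from wherever the skill already keeps its memory.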

Who should use Aeon

Good fit:

You already have a Claude Pro/Max subscription and want OAuth-billed agent work
You want a daily/weekly LLM-powered briefing on a topic, repo, market, or domain
You’re a solo dev or small team that wants PR reviews and security scans automated
You’re comfortable reading YAML and SKILL.md files
You want to experiment with self-healing agent loops without building one from scratch

Bad fit:

You need real-time response (sub-minute latency)
You have hard budget caps below ~$50/month for API spend on a moderately used fork
You want a managed SaaS — Aeon is decidedly DIY
You don’t trust Claude Code to commit to your repo (it does, via GitHub Actions tokens)
You need fine-grained typed state machines for multi-agent workflows — use LangGraph

Quick start, recapped

# 1. Fork on GitHub, then:
git clone https://github.com/<you>/aeon
cd aeon

# 2. Local dashboard
./aeon
# → http://localhost:5555

# 3. In the dashboard:
#    - Paste CLAUDE_CODE_OAUTH_TOKEN (or ANTHROPIC_API_KEY)
#    - Connect Telegram/Discord/Slack
#    - Toggle skills, set schedules
#    - Push

# 4. Validate the setup ran end-to-end
./onboard --remote

# 5. Watch your first run land
gh run watch

Total time from git clone to first skill output in Telegram: about 12 minutes the first time, mostly waiting for the GitHub Actions queue.

FAQ

Is Aeon free? The framework is MIT-licensed. The compute is free on public GitHub repos (unlimited Actions minutes). The LLM costs are not free — you’ll spend $2–15/day on Claude API tokens depending on how many skills you enable, dropping if you route through the Bankr LLM gateway or use Sonnet/Haiku for non-critical skills.

Can I use models other than Claude? Partially. The Bankr LLM gateway adds GPT, Gemini, Kimi, and Qwen access, set via gateway: { provider: bankr } in aeon.yml. But core skills are written against Claude’s tool-use conventions, so non-Claude models will degrade some skills. The default and most-tested model is claude-opus-4-7, with claude-sonnet-4-6 and claude-haiku-4-5 as cheaper alternates.

How does Aeon compare to Claude Code or OpenClaw? They solve different problems. Claude Code and OpenClaw are interactive — you sit at a TUI/IDE and approve actions. Aeon is unattended — you configure it once and it runs on a schedule. You’d typically use Claude Code or OpenClaw to write the skills, and Aeon to run them on cron without you watching.

Can Aeon write its own new skills? Yes, via the create-skill meta-skill. You give it a description (“watch for new packages in npm registry matching this pattern and DM me”) and it generates a SKILL.md, a manifest entry, and an aeon.yml config block as a PR you can merge. Quality is mixed: generated skills usually need a round of human edits before they’re production-ready.

Is it safe to give Aeon write access to my repo? Aeon commits to its own fork using GitHub Actions’ built-in GITHUB_TOKEN, scoped to that repository. It can’t reach other repos unless you give it a personal access token explicitly. The bigger risk is third-party skills imported via ./add-skill — read every SKILL.md before enabling it, and don’t put production secrets in the same fork as experimental skills.

What happens if a skill goes haywire and burns through tokens? The cost-report skill generates a weekly breakdown by skill and model, and memory/token-usage.csv tracks every run. The skill-health skill flags rate_limited and api_error patterns. There’s no hard budget enforcement at the framework level yet. The only backstop is on the provider side: if you set a spend limit at Anthropic, requests will start 429ing once it is reached, and skill-repair will (eventually) catch the pattern and pause the skill.
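
Until the framework enforces budgets itself, a daily cap check over the usage log is simple to write yourself. The sketch below is illustrative only: the column names (`date`, `skill`, `model`, `cost_usd`) are guesses, since the article does not show the layout of memory/token-usage.csv, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def spend_by_skill(csv_text: str, day: str) -> dict:
    """Sum cost per skill for one day from CSV text (hypothetical columns)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["date"] == day:
            totals[row["skill"]] += float(row["cost_usd"])
    return dict(totals)


def over_cap(totals: dict, daily_cap_usd: float) -> bool:
    """True when the day's total spend exceeds the cap."""
    return sum(totals.values()) > daily_cap_usd


# Made-up rows for illustration; not real Aeon output.
SAMPLE = """date,skill,model,cost_usd
2026-05-16,morning-brief,claude-opus-4-7,1.25
2026-05-16,pr-review,claude-opus-4-7,0.50
2026-05-16,morning-brief,claude-haiku-4-5,0.25
2026-05-17,pr-review,claude-opus-4-7,0.75
"""

totals = spend_by_skill(SAMPLE, "2026-05-16")
print(totals, over_cap(totals, daily_cap_usd=1.5))
```

A check like this could run as its own scheduled step and fail loudly when the cap is crossed, so you hear about it the same day rather than in the weekly report.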

Does Aeon work with GitLab or self-hosted Git? Not yet. The framework is hardcoded to GitHub Actions for runtime and GitHub for git operations. A GitLab port would be possible but isn’t on the roadmap.

The verdict

Aeon is the first agent framework I’ve tried in 2026 where “set and forget” actually rings true after a few days of use. The GitHub-Actions-as-runtime decision is the right call — it eliminates the operational tax that kills most homegrown agent setups before they reach week two. The self-healing loop is novel and works for the failure modes it’s designed for.

The catches are real: token costs add up fast, default notification volume is excessive, cold start latency is 2–10 minutes, and the codebase is opinionated around Claude. If those constraints fit your use case, Aeon is the polished version of every “I should automate this with Claude on cron” idea you’ve had this year.

If you want to play, start small: fork, enable just heartbeat and one content skill, run for 48 hours, then decide what to add. The framework is good. The temptation to enable all 117 skills at once is the trap.

🔗 Repo: github.com/aaronjmars/aeon
🔗 Show HN article: dev.to/aaronjmars/aeon-the-background-ai-agent-that-runs-on-github-actions
🔗 License: MIT