Introduction
ChatHermes is an autonomous-agent SaaS built on Nous Research's Hermes 4 and Moonshot AI's Kimi K2 Thinking — the official sponsors of the Hermes Agent Creative Hackathon. It pairs streaming multi-model chat with persistent memory, real tools, and agent infrastructure that lets a paid user spin up their own dedicated server. The same code that runs chathermes.com is in this repo.
This documentation covers everything you need to: install ChatHermes, configure it, use the public REST API, build skills, integrate Telegram, deploy a private agent, and self-host the whole thing.
Quickstart
Three install paths. Pick one.
Option 1 — Docker Compose
The recommended path. One command, works on macOS, Linux, Windows / WSL.
```bash
git clone https://github.com/ai-co-id/chathermes.git
cd chathermes
./bin/setup.sh          # interactive .env wizard (60s)
docker compose up -d
# → open http://localhost:7000
```
Option 2 — Hetzner one-click
Use this path if you already have a Hetzner Cloud token. Cloud-init installs Bun and Docker, clones the repo, writes .env, and runs docker compose up.
```bash
curl -X POST https://your-self-host.com/api/deploy/hetzner \
  -H "Content-Type: application/json" \
  -d '{
    "token": "hcloud_...",
    "server_type": "cx22",
    "location": "nbg1",
    "llm_keys": { "nous": "..." }
  }'
# Returns: { ok: true, ip, url, ssh_command }
```
Option 3 — Bun runtime
No Docker. Requires Bun ≥ 1.3 and Node ≥ 22.
```bash
# Orchestrator
cd orchestrator
bun install
bun run src/index.ts &

# Web (in another terminal)
cd web
bun install
bun run build
PORT=7000 ORCH_URL=http://127.0.0.1:7010 bun run start
```
Configuration
ChatHermes reads everything from environment variables. The minimum viable .env:
```bash
PUBLIC_BASE_URL=http://localhost:7000
SESSION_SECRET=$(openssl rand -hex 32)
NOUS_API_KEY=hf_...   # OR another LLM provider key — pick one
```
Everything else degrades gracefully. See the full env var reference below.
Provider keys
You need at least one LLM provider configured. ChatHermes supports:
| Provider | Env var | What you get |
|---|---|---|
| Moonshot AI ★ | KIMI_API_KEY | Kimi K2, K2 Thinking — strong reasoning, hackathon co-sponsor |
| Nous Research ★ | NOUS_API_KEY | Hermes 4 (405B + 70B), Hermes 3 — open weights, hackathon host |
| Anthropic | ANTHROPIC_API_KEY | Claude Sonnet 4.6 — best for code |
| OpenAI | OPENAI_API_KEY | GPT-5 — best general |
| Google | GEMINI_API_KEY | Gemini 3.1 Pro — best vision |
| StepFun | STEPFUN_API_KEY | Step 3.5 Flash — fast/cheap |
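The rule "two required variables plus at least one provider key" can be checked at startup. The following is a minimal sketch, not the orchestrator's actual code: `checkEnv` and its message strings are hypothetical names introduced here, while the variable names come from the tables in this document.

```typescript
// Hypothetical startup check; not part of the ChatHermes codebase.
type Env = Record<string, string | undefined>;

const REQUIRED = ["PUBLIC_BASE_URL", "SESSION_SECRET"] as const;
const PROVIDER_KEYS = [
  "NOUS_API_KEY",
  "KIMI_API_KEY",
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
  "GEMINI_API_KEY",
  "STEPFUN_API_KEY",
] as const;

// Returns human-readable problems; an empty list means the env is usable.
function checkEnv(env: Env): string[] {
  const problems: string[] = [];
  for (const name of REQUIRED) {
    if (!env[name]) problems.push(`${name} is required`);
  }
  // Any single provider key is enough.
  if (!PROVIDER_KEYS.some((name) => Boolean(env[name]))) {
    problems.push(`set at least one of: ${PROVIDER_KEYS.join(", ")}`);
  }
  return problems;
}
```

In a Bun or Node process this could be called as `checkEnv(process.env)` before the server starts listening, printing the problems and exiting if the list is non-empty.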
Tools
The agent has 14 tools available during chat. All are real APIs — no mockups. Tools are called via <tool_call> XML blocks in the assistant response and parsed by the orchestrator.
| Name | What it does | Backend |
|---|---|---|
| web_search(query) | 5-tier web search fallback | Tavily → Brave → DuckDuckGo HTML → Wikipedia → DDG instant |
| browse(url) | Visit URL + extract main content | fetch + article/main extraction |
| fetch_url(url) | Raw HTTP fetch | lower-level than browse() |
| github_repo("owner/name") | Repo metadata | GitHub REST API |
| news_search(query) | Recent news headlines | Google News RSS |
| weather(location) | Live weather + forecast | open-meteo + geocoding |
| wikipedia(topic) | Encyclopedia summary | Wikipedia REST API |
| save_memory(topic, body) | Persist a fact across sessions | SQLite, per-user |
| recall_memory(query) | Search saved memories | SQLite full-text |
| telegram_send(message) | Push to user's Telegram bot | via /app/connectors token |
| run_js(code) | Execute JavaScript expression | in-process VM |
| generate_image(prompt) | Create image from text | Replicate Flux Schnell |
| analyze_image(url, question) | Vision analysis | Gemini 2.0 Flash → GPT-4o fallback |
| dispatch_subagent(task, model) | Delegate to a different model | Claude / GPT-5 / Kimi via providers DB |
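To make the `<tool_call>` mechanism concrete, here is a small parsing sketch. It assumes each block wraps one JSON object of the form `{"name": ..., "arguments": {...}}`; this document does not specify the payload format, so treat that shape, and the names `ToolCall` and `parseToolCalls`, as assumptions rather than the orchestrator's real parser.

```typescript
// Assumed payload shape: <tool_call>{"name": "...", "arguments": {...}}</tool_call>
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Extracts every well-formed tool call from an assistant response.
function parseToolCalls(text: string): ToolCall[] {
  const calls: ToolCall[] = [];
  const pattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed: unknown = JSON.parse(match[1] ?? "");
      if (!isRecord(parsed)) continue;
      const name = parsed["name"];
      const args = parsed["arguments"];
      if (typeof name === "string") {
        calls.push({ name, arguments: isRecord(args) ? args : {} });
      }
    } catch {
      // Malformed JSON: skip this block instead of failing the whole turn.
    }
  }
  return calls;
}
```

Skipping malformed blocks is one possible design choice; a stricter parser might instead return the error to the model so it can retry the call.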
To add a tool: 1) add an entry to the TOOLS array in orchestrator/src/tools.ts, 2) add a case in executeTool() for the dispatch logic, 3) update the system prompt in orchestrator/src/index.ts with the rule for when to call it. PRs welcome.
Skills
Skills are user-toggleable capabilities listed at /app/skills. The system prompt mentions which are active for the user; the agent uses that context to decide when to invoke tools.
Twelve skills ship enabled by default:
- Research — web_search + browse + cite
- Code & Build — vibe coding at /app/projects, publish to /p/<slug>
- Persistent memory — save_memory / recall_memory
- Scheduler — natural-language cron at /app/schedules
- Telegram push — telegram_send via connector
- Subagent dispatch — Claude / GPT-5 / Kimi parallel reasoning
- Image generation — Flux via Replicate
- Vision analysis — Gemini 2.0 Flash + GPT-4o fallback
- Browser — browse() + fetch_url()
- GitHub recon — github_repo()
- Weather + News — open-meteo + Google News RSS
- Run JS — calculations, regex, JSON parsing
Memory
Memory is per-user, scoped via WHERE user_id = ? on every query. The schema:
```sql
CREATE TABLE memories (
  id TEXT PRIMARY KEY,
  user_id TEXT NOT NULL,
  topic TEXT NOT NULL,
  body TEXT NOT NULL,
  created_at INTEGER NOT NULL
);
CREATE INDEX idx_memories_user ON memories(user_id, created_at DESC);
```
Memories are surfaced to the agent two ways:
- System prompt context — the latest 20 memories are injected into every chat turn's system message.
- recall_memory tool — the agent calls this explicitly when it needs to look something up before answering personal questions.
Users can read, edit, and delete memories at /app/memory. No black box.
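The "latest 20 memories" injection can be illustrated with an in-memory sketch. It mirrors what a query such as `SELECT topic, body FROM memories WHERE user_id = ? ORDER BY created_at DESC LIMIT 20` would return; the function name `buildMemoryContext` and the bullet formatting of the output are assumptions, not the orchestrator's actual prompt layout.

```typescript
// Row shape taken from the memories schema above.
interface Memory {
  id: string;
  user_id: string;
  topic: string;
  body: string;
  created_at: number; // assumption: a Unix timestamp; the schema only says INTEGER
}

// Hypothetical helper: picks one user's newest memories and renders them
// as a block of text that could be appended to the system message.
function buildMemoryContext(all: Memory[], userId: string, limit = 20): string {
  const latest = all
    .filter((m) => m.user_id === userId) // per-user scoping
    .sort((a, b) => b.created_at - a.created_at) // newest first
    .slice(0, limit);
  return latest.map((m) => `- ${m.topic}: ${m.body}`).join("\n");
}
```

Because `filter` returns a new array, the sort does not reorder the caller's data.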
Vibe coding
"Build me a landing page for X" → agent generates HTML/CSS/JS that renders live in a sandboxed iframe. Three modes:
| Mode | What it generates | When to use |
|---|---|---|
| static | single index.html (Tailwind CDN allowed) | marketing pages, simple UIs |
| spa | React via esm.sh CDN, single component tree | interactive prototypes |
| fullstack | frontend + mock API backend in iframe | demo apps with state |
Per-plan project quotas enforce free-tier limits. POST /api/me/projects returns 402 if the user is over their monthly cap. GET /api/me/projects/quota returns:
```json
{
  "plan": "free",
  "used_this_month": 3,
  "limit_per_month": 5,
  "is_unlimited": false,
  "remaining": 2,
  "pct": 60,
  "lifetime_total": 12,
  "lifetime_published": 4
}
```
Projects can be published at /p/<slug> for public preview. The orchestrator injects a "Made with ChatHermes" floating badge into published HTML (this is required by the license).
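The derived fields in the quota response (`remaining`, `pct`, `is_unlimited`) follow from the monthly usage and the plan limit. The sketch below shows one way to compute them and to decide when a create request should be rejected with 402. How an unlimited plan is represented is not documented, so the use of `null` here is an assumption, as are the function names; the lifetime counters are omitted.

```typescript
interface ProjectQuota {
  plan: string;
  used_this_month: number;
  limit_per_month: number | null; // assumption: null means "no cap"
  is_unlimited: boolean;
  remaining: number | null;
  pct: number;
}

// Hypothetical helper deriving the computed quota fields.
function computeQuota(plan: string, used: number, limit: number | null): ProjectQuota {
  if (limit === null) {
    return { plan, used_this_month: used, limit_per_month: null, is_unlimited: true, remaining: null, pct: 0 };
  }
  const remaining = Math.max(0, limit - used);
  const pct = limit > 0 ? Math.min(100, Math.round((used / limit) * 100)) : 100;
  return { plan, used_this_month: used, limit_per_month: limit, is_unlimited: false, remaining, pct };
}

// A create request would be answered with 402 when this returns false.
function canCreateProject(quota: ProjectQuota): boolean {
  return quota.is_unlimited || (quota.remaining ?? 0) > 0;
}
```

With the sample values above (3 used out of 5), this yields `remaining: 2` and `pct: 60`.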
REST API
The public REST API is Bearer-authenticated with API keys created at /app/api-keys. OpenAPI 3.0.3 spec available at /api/openapi.json.
Authentication
```bash
curl https://your-self-host.com/api/v1/me \
  -H "Authorization: Bearer ck_..."
```
Core endpoints
| Method | Path | What it does |
|---|---|---|
| GET | /api/v1/me | User profile + plan |
| GET | /api/v1/sessions | List chat sessions |
| POST | /api/v1/sessions | Create a new chat session |
| POST | /api/v1/sessions/:id/chat | Stream a chat completion (SSE) |
| GET | /api/v1/memories | List your memories |
| POST | /api/v1/memories | Save a memory |
| GET | /api/v1/projects | List your vibe-coding projects |
| GET | /api/v1/projects/:id | Project detail + history |
| GET | /api/v1/usage | Current period usage stats |
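For callers using TypeScript instead of curl, the endpoints above can be wrapped in a small request builder. This is a sketch under assumptions: `apiCall` is a hypothetical helper, and the JSON body shape for POST endpoints is not specified in this document, so the `body` argument is passed through untyped.

```typescript
interface ApiCall {
  url: string;
  init: { method: "GET" | "POST"; headers: Record<string, string>; body?: string };
}

// Hypothetical helper: builds the URL and fetch options for a v1 endpoint.
function apiCall(
  baseUrl: string,
  apiKey: string,
  method: "GET" | "POST",
  path: string,
  body?: unknown,
): ApiCall {
  const headers: Record<string, string> = { Authorization: `Bearer ${apiKey}` };
  const init: ApiCall["init"] = { method, headers };
  if (body !== undefined) {
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  // Trim trailing slashes so "https://host/" + "/api/v1/me" has one slash.
  return { url: baseUrl.replace(/\/+$/, "") + path, init };
}
```

The result can be passed straight to `fetch`, for example `const { url, init } = apiCall(base, key, "GET", "/api/v1/me"); const res = await fetch(url, init);`.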
Webhooks
ChatHermes can POST to your endpoints when events fire. Configured at /app/webhooks. Each delivery is HMAC-signed using your webhook secret.
Verifying signatures
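```js
import crypto from "node:crypto";

// Runs inside your request handler. Sign the raw body, not parsed JSON.
const sig = String(req.headers["x-chathermes-signature"] ?? "");
const body = await readRawBody(req); // your framework's raw-body helper
const expected = crypto
  .createHmac("sha256", WEBHOOK_SECRET)
  .update(body)
  .digest("hex");

// Compare in constant time; timingSafeEqual throws on unequal lengths.
const a = Buffer.from(sig);
const b = Buffer.from(expected);
if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) {
  res.status(401).send("invalid signature");
  return; // stop processing an unverified delivery
}
```
Events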
| Event | Fires when |
|---|---|
| session.message.created | User or assistant adds a message |
| project.published | Vibe-coding project goes live |
| memory.created | save_memory tool runs |
| subscription.changed | Stripe subscription state changes |
| agent.ready | Private Hermes Agent provisioning succeeds |
Failed deliveries retry with exponential backoff (3 attempts: ~1s, ~5s, ~30s). After that, the delivery is logged at /admin/email for manual replay.
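The retry schedule can be expressed as a small lookup. This sketch reads "3 attempts: ~1s, ~5s, ~30s" as three retries after the initial delivery, which is an interpretation; the name `retryDelayMs` is hypothetical, and any jitter implied by the "~" is left out.

```typescript
// Delays before retry 1, 2, and 3, taken from the schedule described above.
const RETRY_DELAYS_MS: readonly number[] = [1000, 5000, 30000];

// Returns how long to wait before the given retry (1-based),
// or null when retries are exhausted and the delivery should be logged.
function retryDelayMs(retryNumber: number): number | null {
  if (!Number.isInteger(retryNumber) || retryNumber < 1) {
    throw new RangeError("retryNumber must be a positive integer");
  }
  const delay = RETRY_DELAYS_MS[retryNumber - 1];
  return delay === undefined ? null : delay;
}
```

A delivery loop would sleep for the returned delay before each retry and stop when it receives `null`.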
Connectors
Connectors at /app/connectors let users wire ChatHermes into external services. Currently shipping: Telegram.
Telegram
- Talk to @BotFather on Telegram → /newbot → save the token.
- Open ChatHermes → Connectors → Telegram → paste token → Save.
- In Telegram, find your new bot → /start. This binds your Telegram user to your ChatHermes account.
- The agent can now call the telegram_send(message) tool to push you messages.
Schedules
Natural-language cron at /app/schedules. Examples:
- "Daily briefing at 9am — top 5 stories from Hacker News"
- "Every 6 hours — check competitor.com/pricing, notify if changed"
- "Sundays at 8am — summarize what I worked on this week, email it"
The orchestrator parses the natural-language schedule into a cron expression and a target tool chain. When it fires, the agent runs as if the schedule's prompt were sent in a fresh chat.
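As an illustration of the parsed result, the three examples above could map to the following records. The `ParsedSchedule` shape is an assumption (the stored format is not documented here), and the expressions use standard five-field cron syntax (minute, hour, day of month, month, day of week) evaluated in whatever timezone the server uses.

```typescript
// Hypothetical shape of a parsed schedule.
interface ParsedSchedule {
  cron: string; // five-field cron expression
  prompt: string; // what the agent is asked when the schedule fires
}

const exampleSchedules: ParsedSchedule[] = [
  // Every day at 09:00
  { cron: "0 9 * * *", prompt: "Top 5 stories from Hacker News" },
  // At minute 0 of every 6th hour
  { cron: "0 */6 * * *", prompt: "Check competitor.com/pricing, notify if changed" },
  // Sundays (day-of-week 0) at 08:00
  { cron: "0 8 * * 0", prompt: "Summarize what I worked on this week, email it" },
];

// Cheap sanity check before storing an expression produced by the parser.
function isFiveFieldCron(expr: string): boolean {
  return expr.trim().split(/\s+/).length === 5;
}
```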
Private Agents
Free-tier users share a single Hermes Agent proxy on the orchestrator's :19002. Paid-tier users can have a dedicated Hetzner server provisioned for them, with isolated CPU, rate limits, and tool tokens.
Architecture
```ts
// resolveHermesEndpoint() in private_agent.ts
// Free user → shared :19002 (this server)
// Paid user + status=ready → user's dedicated server
// Paid user + status=pending/provisioning → falls back to shared
```
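The routing rules above fit in a few lines. The following is an illustrative reimplementation, not the code in private_agent.ts: the `AgentUser` shape, the `privateAgentUrl` field, and the shared endpoint URL are assumptions made for the example.

```typescript
type AgentStatus = "none" | "pending" | "provisioning" | "ready";

// Hypothetical user record; field names are illustrative.
interface AgentUser {
  plan: "free" | "paid";
  privateAgentStatus: AgentStatus;
  privateAgentUrl?: string;
}

// Assumed address of the shared proxy on the orchestrator host.
const SHARED_ENDPOINT = "http://127.0.0.1:19002";

function resolveEndpointSketch(user: AgentUser): string {
  const hasDedicated =
    user.plan === "paid" &&
    user.privateAgentStatus === "ready" &&
    Boolean(user.privateAgentUrl);
  // Anything short of a ready dedicated server falls back to the shared proxy.
  return hasDedicated && user.privateAgentUrl ? user.privateAgentUrl : SHARED_ENDPOINT;
}
```

Falling back to the shared proxy keeps paid users served while their server is still being provisioned.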
Provisioning flow
- Stripe webhook fires customer.subscription.created with active paid status.
- Orchestrator marks the user as private_agent_status = pending.
- Gated mode (default): an admin opens /admin/private-agents, clicks Provision per user.
- Auto mode: set AUTO_PROVISION_PRIVATE_AGENT=true in .env; webhook spawns immediately.
- Hetzner Cloud API spawns a fresh server with cloud-init that installs Bun + a per-user proxy + auth token.
- Orchestrator polls readiness; when :19002 answers, status flips to ready.
- Future hermes-agent requests for that user route to their endpoint.
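The flow above can be read as a small state machine. This sketch models it with hypothetical event names (`subscription_active`, `provision_started`, `proxy_answered`); only the status values pending, provisioning, and ready and the AUTO_PROVISION_PRIVATE_AGENT flag come from this document.

```typescript
type ProvisionStatus = "none" | "pending" | "provisioning" | "ready";
// Hypothetical event names for the steps in the flow above.
type ProvisionEvent = "subscription_active" | "provision_started" | "proxy_answered";

// Returns the next status; events that do not apply leave the status unchanged.
function nextStatus(current: ProvisionStatus, event: ProvisionEvent): ProvisionStatus {
  if (current === "none" && event === "subscription_active") return "pending";
  if (current === "pending" && event === "provision_started") return "provisioning";
  if (current === "provisioning" && event === "proxy_answered") return "ready";
  return current;
}

// Gated mode is the default: provisioning starts automatically only when
// AUTO_PROVISION_PRIVATE_AGENT is set to the string "true".
function shouldAutoProvision(env: Record<string, string | undefined>): boolean {
  return env["AUTO_PROVISION_PRIVATE_AGENT"] === "true";
}
```

Ignoring out-of-order events makes repeated webhook deliveries harmless, which is one reasonable way to handle retries from Stripe.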
Hetzner deploy
The deploy logic lives in orchestrator/src/deploy.ts. Two surfaces:
Public one-click — /deploy/hetzner
Anyone with a Hetzner token can paste it, pick a server type and region, paste at least one LLM API key, and click Deploy ChatHermes. Cloud-init bootstraps the new server in around 90 seconds.
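The form's inputs correspond to the JSON body shown in the Quickstart curl example. A sketch of building and validating that body follows; the field names (`token`, `server_type`, `location`, `llm_keys`) are taken from that example, while `buildDeployRequest` and its validation rules are assumptions about what the endpoint expects.

```typescript
// Body shape from the Quickstart curl example for POST /api/deploy/hetzner.
interface DeployRequest {
  token: string;
  server_type: string;
  location: string;
  llm_keys: Record<string, string>;
}

// Hypothetical client-side builder that enforces "at least one LLM API key".
function buildDeployRequest(
  token: string,
  serverType: string,
  location: string,
  llmKeys: Record<string, string>,
): DeployRequest {
  if (token.trim() === "") throw new Error("Hetzner token is required");
  // Drop empty entries so a blank form field does not count as a key.
  const keys: Record<string, string> = {};
  for (const [provider, key] of Object.entries(llmKeys)) {
    if (key.trim() !== "") keys[provider] = key.trim();
  }
  if (Object.keys(keys).length === 0) {
    throw new Error("at least one LLM API key is required");
  }
  return { token: token.trim(), server_type: serverType, location, llm_keys: keys };
}
```

For example, `JSON.stringify(buildDeployRequest("hcloud_...", "cx22", "nbg1", { nous: "..." }))` produces a body equivalent to the one in the curl command.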
Admin fleet management — /admin/hetzner
Admins set the Hetzner token once via the UI (stored in system_settings), then manage all servers in their account: list, status, power on/off/reboot, delete. Servers labeled app=chathermes are highlighted as managed.
Self-hosting
Before going live with self-hosted ChatHermes:
- HTTPS — set up nginx + Certbot. See INSTALL.md Path 3.
- SESSION_SECRET — generate with openssl rand -hex 32. Don't reuse the example value.
- ADMIN_EMAILS — set in .env to grant your email admin role on first signup.
- Backups — set up cron to sqlite3 .backup the SQLite file daily, and rsync off-server.
- Resend — verify your domain at resend.com/domains for magic links.
- Stripe — set up products + webhook endpoint. See INSTALL.md "Optional integrations".
- Hermes Agent native — for the hermes-agent model option, install nous-hermes-agent separately. The orchestrator's shared proxy works without it (proxies to Nous API directly).
- Required Attribution — don't strip _attribution.ts. The runtime guard refuses to start without it. This is the license.
Env vars
Every environment variable. Copy orchestrator/.env.example to orchestrator/.env and fill what you need.
| Variable | Required? | Purpose |
|---|---|---|
| PUBLIC_BASE_URL | yes | Your install's public URL (used in magic links, OG, webhooks) |
| SESSION_SECRET | yes | Cookie signing key — openssl rand -hex 32 |
| NODE_ENV | no | production or development |
| ADMIN_EMAILS | no | Comma-separated emails granted admin on signup |
| DATA_ROOT | no | Where SQLite + tenant volumes live (default ./data) |
| NOUS_API_KEY | 1+ needed | Hermes 4, Hermes 3, Kimi K2 (Nous proxies multiple) |
| KIMI_API_KEY | 1+ needed | Direct Moonshot Kimi |
| ANTHROPIC_API_KEY | 1+ needed | Claude Sonnet 4.6 |
| OPENAI_API_KEY | 1+ needed | GPT-5 |
| GEMINI_API_KEY | 1+ needed | Gemini 3.1 Pro |
| STEPFUN_API_KEY | 1+ needed | Step 3.5 Flash |
| DEFAULT_MODEL | no | Default model_id when user has no preference |
| CHATHERMES_MODEL_RATES | no | JSON: {model_id: credits_per_1k_tokens}. Default empty. |
| CHATHERMES_DEFAULT_RATE | no | Fallback rate for models not in the rates JSON |
| RESEND_API_KEY | no | If absent, magic links print to log instead of email |
| RESEND_FROM | no | From address — must be verified in Resend |
| RESEND_REPLY_TO | no | Reply-to address |
| RESEND_WEBHOOK_SECRET | no | For verifying inbound delivery events |
| STRIPE_SECRET_KEY | no | If absent, billing UI shows demo banner |
| STRIPE_PUBLISHABLE_KEY | no | Frontend Stripe.js |
| STRIPE_WEBHOOK_SECRET | no | For verifying inbound subscription events |
| STRIPE_PRICE_PRO | no | Stripe price_xxx for Pro plan |
| STRIPE_PRICE_TEAM | no | Stripe price_xxx for Team plan |
| AUTO_PROVISION_PRIVATE_AGENT | no | false (default, admin gates) or true (full auto) |
| HETZNER_API_TOKEN | no | Alternative to setting in /admin/hetzner UI |
| HERMES_PROXY_PORT | no | Default 19002 |
| HERMES_UPSTREAM_MODEL | no | What the shared proxy forwards to (default Hermes-4-405B) |
| HERMES_RATE_LIMIT | no | Requests/min per IP on the shared proxy (default 60) |
| REPLICATE_API_TOKEN | no | Image generation tool |
| TAVILY_API_KEY | no | Web search tier 1 (best quality) |
| BRAVE_API_KEY | no | Web search tier 2 |
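To show how CHATHERMES_MODEL_RATES and CHATHERMES_DEFAULT_RATE could combine, here is a sketch of parsing the rates JSON and pricing a request. The function names, the model id in the usage note, and the absence of rounding are assumptions; the table only states that rates are credits per 1k tokens with a fallback rate for unlisted models.

```typescript
// Parses CHATHERMES_MODEL_RATES: JSON of {model_id: credits_per_1k_tokens}.
// Throws on invalid JSON so a misconfiguration surfaces at startup.
function parseRates(raw: string | undefined): Record<string, number> {
  const rates: Record<string, number> = {};
  if (!raw) return rates; // default is empty
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return rates;
  }
  for (const [model, rate] of Object.entries(parsed)) {
    // Keep only finite numeric rates; ignore anything else.
    if (typeof rate === "number" && Number.isFinite(rate)) rates[model] = rate;
  }
  return rates;
}

// Credits for a request: tokens / 1000 times the model's rate,
// falling back to the default rate for models not in the JSON.
function creditsFor(
  modelId: string,
  tokens: number,
  rates: Record<string, number>,
  defaultRate: number,
): number {
  const rate = rates[modelId] ?? defaultRate;
  return (tokens / 1000) * rate;
}
```

For instance, with `CHATHERMES_MODEL_RATES='{"example-model": 2}'` and a default rate of 1, a 1,500-token request to `example-model` costs 3 credits under this sketch, and the same request to an unlisted model costs 1.5.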
License
ChatHermes is licensed under the ChatHermes Open Source License v1.0 — AGPL-3.0 with a Required Attribution Addendum and Trademark Reservation.
_attribution.ts identifying constants. The runtime guard in orchestrator/src/_attribution.ts refuses to start the orchestrator if the attribution module is tampered with. See /opensource for full terms.