
6. Gateway (Orchestrator)

The gateway is the central nervous system of mobyclaw. It's a long-lived process that:

  1. Receives messages from all connected channels (Telegram, WhatsApp, CLI, webhooks)
  2. Manages sessions — maps channels/users to conversation threads
  3. Routes to the agent — sends prompts to cagent's HTTP API
  4. Runs the scheduler — heartbeats and cron jobs trigger agent turns
  5. Delivers responses — routes agent replies back to the right channel

6.1 Message Flow

User sends "What's my schedule?" via Telegram
gateway's Telegram adapter receives message
  ├─ Look up session for telegram:dm:12345
  ├─ Load session history
  ├─ Enqueue in command queue (serialize per session)
gateway sends agent turn
  ├─ POST http://moby:8080/v1/run
  │   { prompt: "What's my schedule?", session_id: "..." }
cagent runs agent loop
  ├─ Assembles system prompt (soul.yaml instruction + context)
  ├─ Model inference
  ├─ Tool calls (reads calendar, writes memory, etc.)
  ├─ Final response: "You have a standup at 10am and..."
gateway receives response
  ├─ Store in session history
  ├─ Route back to originating channel
gateway delivers response via Telegram adapter

6.2 Heartbeat Flow

Scheduler timer fires (every 15 minutes by default)
  ├─ Is it within active hours? (e.g., 07:00-23:00)
gateway sends heartbeat prompt to agent
  ├─ POST http://moby:8080/v1/run
  │   { prompt: "Read HEARTBEAT.md. Follow it strictly.
  │              If nothing needs attention, reply HEARTBEAT_OK.",
  │     session_id: "heartbeat:main" }
cagent runs agent loop
  ├─ Reads HEARTBEAT.md
  ├─ Checks pending tasks, reviews memory
  ├─ Either: "HEARTBEAT_OK" (nothing to do)
  │   Or: "Reminder: you have a meeting in 30 minutes"
gateway processes response
  ├─ If HEARTBEAT_OK → suppress, don't deliver
  └─ If actual content → deliver to user's last active channel

See §6.7 for the full heartbeat design.

6.3 Cron Flow

Cron job fires: "Morning brief" (every day at 7am)
gateway creates isolated session
  ├─ POST http://moby:8080/v1/run
  │   { prompt: "Summarize overnight updates. Check emails and calendar.",
  │     session_id: "cron:morning-brief" }
cagent runs agent loop
  ├─ Reviews overnight activity, memory, etc.
  ├─ Composes summary
gateway delivers to configured channel
  └─ Sends summary to user's WhatsApp/Telegram/Slack

6.4 Message Serialization

cagent can only process one request per session at a time. If the gateway sends a second message to the same session while the first is still running, the second request hangs until the first completes (or times out).

The gateway serializes messages per channel:

  • Each channel has a queue of pending messages
  • While a message is being processed (session is "busy"), new messages are queued
  • When processing completes, the next queued message is sent
  • If a session error occurs, the session is reset and the message retried once

This prevents concurrent requests to the same cagent session and ensures messages are processed in order.
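The per-session queue can be sketched as a promise chain keyed by session ID (an illustrative sketch, not the actual gateway source; the class and method names are invented here):

```javascript
// Serialize work per session: tasks for the same session run strictly in
// order, while tasks for different sessions run concurrently.
class SessionQueue {
  constructor() {
    this.tails = new Map(); // sessionId -> tail promise of that session's chain
  }
  // Enqueue an async task; returns a promise for this task's completion.
  enqueue(sessionId, task) {
    const tail = this.tails.get(sessionId) || Promise.resolve();
    // Swallow the previous task's error so one failure doesn't poison the chain
    // (the real gateway resets the session and retries once instead).
    const next = tail.catch(() => {}).then(task);
    this.tails.set(sessionId, next);
    return next;
  }
}
```

Each call chains onto the previous tail for that session, so a slow agent turn never interleaves with the next message on the same session.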

6.5 Streaming Architecture

cagent's SSE stream emits agent_choice tokens as the model generates them. The gateway streams these tokens through to all consumers in real-time, making the agent feel fast even for long responses.

Streaming pipeline:

cagent SSE stream
  │  agent_choice tokens (1-2s after request)
agent-client.js (promptStream)
  │  onToken(text) callback
gateway routing (sendToAgentStream)
  ├─→ POST /prompt/stream (SSE)  → CLI prints tokens to terminal
  ├─→ Telegram adapter           → edits message every ~1s
  └─→ POST /prompt (buffered)    → waits for full response (legacy)

Gateway SSE endpoint (POST /prompt/stream):

  • Returns text/event-stream with events: token, tool, done, error
  • Uses a PassThrough stream piped to the HTTP response
  • Critical: disconnect detection uses res.on('close'), NOT req.on('close'). The request's close event fires as soon as the POST body is consumed, not when the client disconnects; this was a subtle bug.

Telegram streaming: Instead of waiting for the full response, the adapter:

  1. Sends a placeholder message as soon as the first token arrives (~1-2s)
  2. Edits that message every ~1.2s with accumulated text
  3. Shows tool status ("⏳ Writing to memory...") during tool calls
  4. Does a final edit when the stream completes
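The edit-throttling logic can be sketched like this (a sketch, not the adapter's actual source: `sendMessage`/`editMessage` are placeholders for the bot API calls, and the clock is injectable for testing):

```javascript
// Accumulate tokens; send a placeholder on the first token, then edit the
// same message at most once per interval, plus a final edit on completion.
function createStreamEditor({ sendMessage, editMessage, intervalMs = 1200, now = Date.now }) {
  let buffer = '';
  let messageId = null;
  let lastEdit = 0;
  return {
    async onToken(text) {
      buffer += text;
      if (messageId === null) {
        messageId = await sendMessage(buffer); // placeholder as soon as tokens arrive
        lastEdit = now();
      } else if (now() - lastEdit >= intervalMs) {
        await editMessage(messageId, buffer);  // periodic edit with accumulated text
        lastEdit = now();
      }
    },
    async onDone() {
      if (messageId !== null) await editMessage(messageId, buffer); // final edit
    },
  };
}
```

Throttling the edits matters because Telegram rate-limits message edits; ~1.2s keeps the message feeling live without tripping the limits.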

CLI streaming: mobyclaw run and mobyclaw chat connect to the SSE endpoint and print tokens directly to stdout as they arrive. Tool call status is shown on stderr so it doesn't pollute piped output.

6.6 Scheduler — Timed Reminders & Recurring Schedules

The scheduler is a gateway-side timer loop that delivers pre-composed messages at exact times. It does NOT involve the agent at delivery time — the agent composes the message upfront when creating the schedule.

Schedule API

The gateway exposes REST endpoints for schedule management. The agent calls these via curl (shell tool). The CLI and external tools can also use them.

| Method | Endpoint | Purpose |
|---|---|---|
| GET | /api/schedules | List pending schedules |
| POST | /api/schedules | Create a new schedule |
| DELETE | /api/schedules/:id | Cancel a pending schedule |

Create request body:

{
  "due": "2026-02-24T09:00:00Z",
  "message": "🔔 Hey! Reminder: **Buy groceries!**",
  "channel": "telegram:123456",
  "repeat": null
}

Either message or prompt is required (or both):

| Field | When to use | At fire time |
|---|---|---|
| message | Simple reminders (content known upfront) | Delivered directly (free, instant) |
| prompt | Needs live data/reasoning (news, weather, summaries) | Sent to agent; agent's response delivered |
| Both | Prompt-based with fallback | Agent runs; if it fails, message is delivered |

Prompt-based example (agent runs at fire time):

{
  "due": "2026-02-24T09:00:00Z",
  "prompt": "Fetch the latest tech news and write a brief morning briefing.",
  "channel": "telegram:123456",
  "repeat": "weekdays"
}

Schedule object (stored):

{
  "id": "sch_a1b2c3",
  "due": "2026-02-24T09:00:00Z",
  "message": "🔔 Hey! Reminder: **Buy groceries!**",
  "channel": "telegram:123456",
  "status": "pending",
  "repeat": null,
  "created_at": "2026-02-23T20:15:00Z",
  "delivered_at": null
}

Status values: pending | delivered | cancelled

Persistence: ~/.mobyclaw/schedules.json — bind-mounted, survives restarts, user-visible. Gateway reads/writes this file.

Repeat / Recurring Schedules

The repeat field controls recurrence:

| Value | Meaning | Example |
|---|---|---|
| null | One-shot (default) | "Remind me tomorrow at 9am" |
| "daily" | Every day at the same time | "Remind me every day at 9am" |
| "weekdays" | Mon–Fri at the same time | "Every weekday morning" |
| "weekly" | Same day+time each week | "Every Monday at 9am" |
| "monthly" | Same day+time each month | "First of every month" |
| "0 7 * * 1-5" | Cron expression | Full cron flexibility |

When a recurring schedule fires:

  1. Gateway delivers the message
  2. Marks the current entry as delivered
  3. Computes the next occurrence from the repeat rule
  4. Creates a new pending entry with the next due time

The original entry's repeat value is copied to the new entry, creating an ongoing chain. Cancelling the latest pending entry stops the chain.
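The next-occurrence computation for the non-cron values can be sketched as follows (a sketch, assuming UTC arithmetic; cron expressions would need a cron parser and are not handled here):

```javascript
// Compute the next due time (ISO string) from a repeat rule.
function nextDue(dueIso, repeat) {
  const d = new Date(dueIso);
  switch (repeat) {
    case 'daily':
      d.setUTCDate(d.getUTCDate() + 1);
      break;
    case 'weekdays':
      // Advance at least one day, skipping Saturday (6) and Sunday (0).
      do { d.setUTCDate(d.getUTCDate() + 1); } while (d.getUTCDay() === 0 || d.getUTCDay() === 6);
      break;
    case 'weekly':
      d.setUTCDate(d.getUTCDate() + 7);
      break;
    case 'monthly':
      d.setUTCMonth(d.getUTCMonth() + 1);
      break;
    default:
      return null; // one-shot (null); cron strings need a cron library
  }
  return d.toISOString();
}
```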

Scheduler Loop

Runs every 30 seconds inside the gateway:

Every 30 seconds:
  ├─ Read schedules.json
  ├─ Find entries where due <= now AND status == "pending"
  ├─ For each due schedule:
  │   ├─ Parse channel (e.g., "telegram:123456")
  │   ├─ Call adapter's send function via delivery API
  │   ├─ Mark status = "delivered", set delivered_at
  │   ├─ If repeat: create next pending entry
  │   └─ Save schedules.json
  └─ Done (< 1ms for most runs)
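One tick of that loop can be sketched over the parsed schedules.json contents (an illustrative sketch, not the actual gateway source: `deliver` and `nextDue` are assumed helpers for the delivery API call and recurrence computation, and prompt-based schedules, which trigger an agent turn instead of a direct delivery, are omitted):

```javascript
// Process all due pending schedules; mutates the array, caller persists it.
function tick(schedules, now, deliver, nextDue) {
  const created = [];
  for (const s of schedules) {
    if (s.status !== 'pending' || new Date(s.due) > now) continue;
    deliver(s.channel, s.message);           // direct push via the delivery API
    s.status = 'delivered';
    s.delivered_at = now.toISOString();
    if (s.repeat) {
      // Recurrence: clone into a fresh pending entry at the next due time.
      // Real id generation is up to the gateway; a suffix is used here.
      created.push({ ...s, id: s.id + '-next', due: nextDue(s.due, s.repeat), status: 'pending', delivered_at: null });
    }
  }
  schedules.push(...created);                // caller writes this back to schedules.json
  return schedules;
}
```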

Delivery API

Internal gateway endpoint for sending proactive messages to any channel:

POST /api/deliver
{
  "channel": "telegram:123456",
  "message": "🔔 Reminder text"
}
  • Parses the channel prefix (telegram, discord, slack, etc.)
  • Routes to the appropriate adapter's proactive send function
  • Returns success/failure
  • Bypasses session management — this is a direct push, not an agent turn

Adapter registry: Gateway maintains a map of platform → send function. Each adapter registers itself on startup:

const adapters = {
  telegram: { send: (chatId, message) => bot.telegram.sendMessage(chatId, message) },
  // discord: { send: ... },
  // slack: { send: ... },
};
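The routing behind POST /api/deliver can then be sketched on top of that registry (handler shape is illustrative, not the actual gateway source):

```javascript
// Parse "platform:id", look up the adapter, and push the message directly.
function deliver(adapters, channel, message) {
  const sep = channel.indexOf(':');           // "telegram:123456"
  const platform = channel.slice(0, sep);     // -> "telegram"
  const id = channel.slice(sep + 1);          // -> "123456"
  const adapter = adapters[platform];
  if (!adapter) return { ok: false, error: `no adapter for ${platform}` };
  adapter.send(id, message);                  // direct push, no agent turn
  return { ok: true };
}
```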

How the Agent Creates a Schedule

When the user says "remind me tomorrow at 9am to buy groceries":

User (Telegram): "Remind me tomorrow at 9am to buy groceries"
  ├─ Gateway prepends channel context (see §6.8)
Agent processes message
  ├─ 1. Create schedule via gateway API:
  │     curl -s -X POST http://gateway:3000/api/schedules \
  │       -H "Content-Type: application/json" \
  │       -d '{"due":"2026-02-24T09:00:00Z",
  │            "message":"🔔 Hey! Reminder: Buy groceries!",
  │            "channel":"telegram:123456"}'
  ├─ 2. Write to TASKS.md for tracking:
  │     "- [ ] 2026-02-24 09:00 — Buy groceries [scheduled]"
  └─ 3. Respond: "Got it! I'll remind you tomorrow at 9am. ✅"

6.7 Heartbeat — Periodic Agent Wake-Up

The heartbeat is an intelligent periodic check where the agent wakes up, reviews its state, and acts if needed. Unlike the scheduler (dumb timer, pre-composed message), the heartbeat involves full LLM reasoning.

Trigger: Gateway timer, every MOBYCLAW_HEARTBEAT_INTERVAL (default: 15m)

Active hours: Only fires between MOBYCLAW_ACTIVE_HOURS (default: 07:00-23:00). Silent outside these hours. Scheduled reminders always fire regardless of active hours.
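The active-hours gate can be sketched like this (a sketch: the "HH:MM-HH:MM" format matches the MOBYCLAW_ACTIVE_HOURS examples, local time is assumed, and overnight ranges such as 22:00-06:00 are not handled):

```javascript
// True if the given time falls inside the configured active window.
function withinActiveHours(spec, date = new Date()) {
  const toMinutes = (hhmm) => {
    const [h, m] = hhmm.split(':').map(Number);
    return h * 60 + m;
  };
  const [start, end] = spec.split('-').map(toMinutes);
  const now = date.getHours() * 60 + date.getMinutes();
  return now >= start && now < end;
}
```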

Heartbeat prompt (sent by gateway to agent):

[HEARTBEAT | time=2026-02-24T09:03:00Z]
You are being woken by a scheduled heartbeat.

1. Read TASKS.md — review your task list, note anything relevant
2. Read HEARTBEAT.md — follow the checklist
3. If you need to notify the user about something, use:
   curl -s -X POST http://gateway:3000/api/deliver \
     -H "Content-Type: application/json" \
     -d '{"channel": "CHANNEL_ID", "message": "YOUR MESSAGE"}'
4. If nothing needs attention, reply exactly: HEARTBEAT_OK

Heartbeat flow:

Gateway timer fires (every 15 minutes)
  ├─ Check active hours (07:00-23:00) → skip if outside
  ├─ Send heartbeat prompt to agent (session: "heartbeat:main")
Agent processes heartbeat
  ├─ Reads TASKS.md
  │   ├─ Reviews open tasks
  │   ├─ Marks completed items
  │   └─ Cleans up old entries
  ├─ Reads HEARTBEAT.md
  │   ├─ Follows checklist items
  │   └─ Daily tasks (once per day)
  ├─ If something needs user attention:
  │   └─ curl POST http://gateway:3000/api/deliver ...
  └─ Response:
      ├─ "HEARTBEAT_OK" → gateway suppresses, logs quietly
      └─ Summary text → gateway logs it

Why the agent uses /api/deliver instead of just responding: The heartbeat runs on a system session (heartbeat:main), not a user channel. The agent's response goes nowhere useful. For the agent to reach the user, it explicitly calls the delivery API with the target channel. This gives the agent control over WHERE to send (different tasks may target different channels).

6.8 Channel Context Injection

For the agent to know which channel a message came from (needed when creating schedules), the gateway prepends a context line to every user message:

[context: channel=telegram:123456, time=2026-02-23T20:15:00Z]
Remind me tomorrow at 9am to buy groceries
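A minimal sketch of producing that prefix (the gateway's actual formatting may differ, e.g. in timestamp precision):

```javascript
// Prepend the bracketed context line the agent is instructed to parse.
function withContext(channel, text, now = new Date()) {
  return `[context: channel=${channel}, time=${now.toISOString()}]\n${text}`;
}
```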

The agent's instruction tells it to:

  • Extract the channel ID when creating schedules or timed tasks
  • Include the channel in schedule API calls and TASKS.md entries
  • Never display the context line to the user
  • Ask the user which channel to use if they request a reminder from a non-messaging channel (e.g., CLI) and multiple channels are available

For heartbeat prompts, no channel context is included (it's a system session, not a user message).

Why in the message, not metadata? cagent's API doesn't support per-message metadata fields. The user message content is the only field we control. A bracketed prefix is simple, reliable, and the LLM easily parses it.

6.9 TASKS.md — Agent's Task Store

TASKS.md lives at ~/.mobyclaw/TASKS.md. It's a Markdown file the agent uses to track reminders, todos, and recurring tasks.

# Tasks

> Moby's task and reminder list. Moby manages this file.
> You can also edit it directly.

## Reminders

- [ ] 2026-02-24 09:00 — Buy groceries (channel:telegram:123456) [scheduled]
- [ ] 2026-02-24 14:00 — Call the dentist (channel:telegram:123456) [scheduled]
- [x] ~~2026-02-23 15:00 — Send report to Alice~~ (delivered)

## Recurring

- [ ] weekdays 07:00 — Morning briefing (channel:telegram:123456) [scheduled]

## Todo

- [ ] Review PR #1234 on myapp
- [ ] Research vector databases for memory search
- [x] ~~Set up workspace mounts~~

Design:

  • Flexible Markdown: the agent uses LLM intelligence to interpret it
  • [scheduled] marker: indicates a gateway schedule was created (prevents double-scheduling on heartbeat)
  • Channel stored per task: reminders go back to the originating channel
  • Todos without times: just tracked; the agent mentions them during a heartbeat if relevant
  • The agent marks [x] when done and may clean up old entries

6.10 Known Channels (Persistent)

The gateway persists known messaging channels to ~/.mobyclaw/channels.json. When the first message arrives from any messaging platform, the gateway saves that channel. This means:

  • Schedules can omit the channel field — the gateway defaults to the known channel for that platform
  • Heartbeat includes known channels and the default channel in its prompt, so the agent knows where to deliver notifications
  • Survives restarts — the file is on the bind-mounted host filesystem
  • Agent can read it directly at /home/agent/.mobyclaw/channels.json or query GET /api/channels

File format (~/.mobyclaw/channels.json):

{
  "telegram": "telegram:1436415037",
  "discord": "discord:9876543210"
}

One entry per platform. For a personal agent, there's typically one chat per platform (your DM with the bot). If the user messages from a different chat on the same platform, the channel is updated.

API endpoint:

GET /api/channels
→ { "channels": { "telegram": "telegram:123" }, "default": "telegram:123" }

Default channel resolution (used by the schedule API and heartbeat):

  1. Last active channel in the current session (in-memory)
  2. First known channel from channels.json
  3. null (schedule API returns 400, heartbeat skips delivery)
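The three-step resolution is a simple fallback chain; a sketch (argument shapes are assumptions, not the gateway's actual signatures):

```javascript
// Resolve a delivery channel: session memory first, then channels.json, else null.
function defaultChannel(lastActive, knownChannels) {
  if (lastActive) return lastActive;                  // 1. in-memory last active channel
  const platforms = Object.keys(knownChannels || {});
  if (platforms.length > 0) return knownChannels[platforms[0]]; // 2. first known channel
  return null;                                        // 3. nothing known; caller returns 400 / skips
}
```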


7. Agent Container

7.1 Agent Image

Debian slim + cagent binary + common dev tools (git, curl, jq, etc.)

Design decisions:

  • Debian slim over Alpine: better compatibility with cagent and dev tools
  • cagent installed at build time: pinned version for reproducibility
  • Common tools included: git, curl, jq, ripgrep; agents need these for shell tool execution
  • Non-root user: the agent runs as the agent user (uid 1000) for security
  • Workspace at /workspace: standard mount point for all agents

7.2 Agent Entrypoint

cagent serve api /agent/soul.yaml --working-dir /workspace

The container:

  1. Starts cagent in API server mode
  2. Loads the agent config from /agent/soul.yaml
  3. Sets the working directory to /workspace (mounted from host)
  4. Listens on port 8080
  5. Serves the agent API (send prompts, get responses, manage sessions)

Tool approval: cagent serve api requires explicit tool approval per session. When creating a session via POST /api/sessions, the gateway MUST set {"tools_approved": true} in the request body. Without this, the SSE stream will pause at tool_call_confirmation events and wait indefinitely for client-side approval that never comes. This was a critical bug discovered during development — the agent would respond to simple messages (no tools) but hang forever on any message that triggered a tool call (e.g., writing to memory). The fix is a single field on session creation.
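The fix looks like this from the gateway side (a sketch using Node 18+ global fetch; the endpoint and body follow §7.3, error handling is deliberately minimal):

```javascript
// Create a cagent session with tools pre-approved.
async function createSession(base) {
  const res = await fetch(`${base}/api/sessions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // Without tools_approved, the SSE stream later stalls forever on
    // tool_call_confirmation events.
    body: JSON.stringify({ tools_approved: true }),
  });
  const session = await res.json();
  return session.id;
}
```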

7.3 cagent HTTP API Reference

Discovered through testing. This is the API surface of cagent serve api:

| Endpoint | Method | Purpose |
|---|---|---|
| /api/ping | GET | Health check. Returns {"status":"ok"} |
| /api/agents | GET | List available agents. Returns [{"name":"soul",...}] |
| /api/sessions | GET | List all sessions |
| /api/sessions | POST | Create session. Body: {"tools_approved": true}. Returns session object with id. |
| /api/sessions/{id} | GET | Get session details and message history |
| /api/sessions/{id}/agent/{name} | POST | Send messages to agent. Body: [{"role":"user","content":"..."}]. Returns SSE stream. |

Agent name resolution: The {name} in the agent endpoint comes from the config filename (e.g., soul.yaml → agent name is soul), NOT from the name: field in the YAML or the agents map key. This is a cagent convention.

SSE stream event types:

| Event Type | When | Contains |
|---|---|---|
| agent_info | Start of stream | Agent name, model, welcome message |
| team_info | Start of stream | Available agents list |
| toolset_info | Start of stream | Number of available tools |
| stream_started | Agent begins processing | Session ID |
| agent_choice_reasoning | During inference (thinking) | Reasoning text (extended thinking) |
| agent_choice | During inference | Response text tokens (the actual reply) |
| partial_tool_call | Tool being called | Tool name and partial arguments (streaming) |
| tool_call_confirmation | Tool awaiting approval | Only if tools_approved: false; blocks the stream |
| tool_result | After tool execution | Tool output |
| message_added | Message persisted | Session ID |
| token_usage | After each model turn | Input/output tokens, cost |
| session_title | Auto-generated | Session title from content |
| stream_stopped | End of stream | Session ID |
| error | On failure | Error message |

Multi-turn tool streams: A single SSE stream may contain multiple model turns. When the model calls a tool, the stream continues through: agent_choice_reasoning → partial_tool_call → (tool executes) → tool_result → agent_choice (final response). The gateway must read the entire stream to collect all agent_choice content.
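The drain logic can be sketched over parsed events (a sketch: the events array stands in for decoded SSE events, and the payload field names `type`/`content` are assumptions about the wire shape):

```javascript
// Collect the user-visible reply from a full stream: only agent_choice text
// is delivered; reasoning, tool calls, and tool results are consumed silently.
function collectReply(events) {
  let text = '';
  for (const ev of events) {
    if (ev.type === 'agent_choice') text += ev.content; // may span several model turns
    if (ev.type === 'error') throw new Error(ev.content);
  }
  return text;
}
```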

7.4 Volume Mounts

| Mount | Type | Container Path | Purpose |
|---|---|---|---|
| ~/.mobyclaw/ | Bind mount | /home/agent/.mobyclaw | All agent state: memory, soul, sessions, logs |
| Project root (.) | Bind mount | /source | Full source code access (self-modification) |
| Agent config | Bind mount (ro) | /agent/ | Agent YAML (from repo) |
Key principle: Everything lives at ~/.mobyclaw/ on the host. No Docker volumes. This means:

  • All state persists across container restarts
  • cp -r ~/.mobyclaw/ backup/ is a complete backup
  • docker system prune won't destroy anything

7.5 Secrets & Environment Variables

All secrets and configuration live in a single .env file at the project root. Docker Compose loads it via env_file and injects variables into the right containers.

Strategy

  • One .env file — single place for all secrets. No scattered config.
  • .env.example — checked into git with placeholder values. Users copy to .env and fill in their keys.
  • .env is gitignored — never committed. .gitignore includes .env from day one.
  • No secrets baked into images — the Dockerfile never COPYs .env or ARGs secrets. They're injected at runtime via Compose.
  • Least-privilege distribution — each container only receives the env vars it needs. The agent container gets LLM API keys. The gateway gets messaging tokens. Neither gets the other's secrets.

Why .env file (not Docker Secrets, Vault, etc.)

Mobyclaw is a personal agent on your own machine. Docker Secrets requires Swarm mode. Vault/SOPS/etc. add operational complexity for zero benefit when you're the only user. A .env file is:

  • Simple: one file, cp .env.example .env, edit, done
  • Standard: Docker Compose supports it natively, and every dev knows it
  • Portable: copy .env to a new machine alongside ~/.mobyclaw/
  • Secure enough: file permissions (chmod 600 .env), gitignored, never baked into images

If someone deploys mobyclaw on a shared server or CI, they can use their platform's native secret injection (GitHub Actions secrets, systemd credentials, etc.) — those just set env vars, which Compose picks up the same way.

Variable Reference

| Variable | Container | Required | Purpose |
|---|---|---|---|
| ANTHROPIC_API_KEY | moby | Yes (if using Anthropic) | Anthropic model access |
| OPENAI_API_KEY | moby | Yes (if using OpenAI) | OpenAI model access |
| TELEGRAM_BOT_TOKEN | gateway | No | Enables Telegram adapter |
| DISCORD_BOT_TOKEN | gateway | No | Enables Discord adapter |
| SLACK_BOT_TOKEN | gateway | No | Enables Slack adapter |
| WHATSAPP_AUTH | gateway | No | Enables WhatsApp adapter |
| MOBYCLAW_HEARTBEAT_INTERVAL | gateway | No | Heartbeat frequency (default: 15m) |
| MOBYCLAW_ACTIVE_HOURS | gateway | No | Active hours for heartbeat (default: 07:00-23:00) |
| MOBYCLAW_HOME | all | No | Override ~/.mobyclaw/ path |

Convention: Messaging adapter tokens double as feature flags — if TELEGRAM_BOT_TOKEN is unset, the Telegram adapter simply doesn't load. No token = no adapter = no error.

Least-Privilege Distribution in Compose

services:
  moby:
    environment:
      - ANTHROPIC_API_KEY         # LLM keys only
      - OPENAI_API_KEY
    # NO messaging tokens

  gateway:
    environment:
      - TELEGRAM_BOT_TOKEN        # Messaging tokens only
      - DISCORD_BOT_TOKEN
      - SLACK_BOT_TOKEN
      - WHATSAPP_AUTH
      - MOBYCLAW_HEARTBEAT_INTERVAL
    # NO LLM API keys

The .env file holds everything, but Compose's per-service environment block controls which container sees which variable. This way, a compromised gateway can't leak your Anthropic key, and a compromised agent can't access your Telegram bot.

.env.example Template

# ─── LLM Provider Keys ───────────────────────────────────────
# At least one is required. Uncomment and fill in.
ANTHROPIC_API_KEY=
# OPENAI_API_KEY=

# ─── Messaging (all optional) ────────────────────────────────
# Set a token to enable that channel. No token = adapter disabled.
# TELEGRAM_BOT_TOKEN=
# DISCORD_BOT_TOKEN=
# SLACK_BOT_TOKEN=
# WHATSAPP_AUTH=

# ─── Agent Settings ──────────────────────────────────────────
# MOBYCLAW_HOME=~/.mobyclaw
# MOBYCLAW_HEARTBEAT_INTERVAL=30m

File Permissions

mobyclaw init sets chmod 600 .env after creating it. The .env file contains API keys worth money — it should only be readable by the owner.