Decisions

13. Key Architectural Decisions Log

#	Decision	Rationale	Date
ADR-001	Use cagent native YAML, no wrapper format	Zero translation layer, users get full cagent features	2026-02-23
ADR-002	`soul.yaml` as single identity file per agent	Simpler than OpenClaw's 6+ bootstrap files. Can add more via `add_prompt_files`	2026-02-23
ADR-003	`cagent serve api` as primary container entrypoint	HTTP API is the natural interface for containerized agents	2026-02-23
ADR-004	Bash CLI, not compiled binary	Minimal dependencies (docker, curl, jq). Ship fast, iterate.	2026-02-23
ADR-005	Debian slim base image	Better cagent/tool compat than Alpine. Acceptable size trade-off.	2026-02-23
ADR-006	`mobyclaw.yaml` is dev-only, not product config	Separation of concerns: dev agent ≠ product agent	2026-02-23
ADR-007	"moby" as the default/reference agent	Clear identity, easy onboarding, extensible pattern	2026-02-23
ADR-008	Docker Compose over Kubernetes	Right-sized for personal agent deployment. K8s is overkill.	2026-02-23
ADR-009	Delegate agent loop entirely to cagent	Focus on orchestration, not reimplementing inference + tool execution	2026-02-23
ADR-010	Memory as plain Markdown files (OpenClaw pattern)	Simple, portable, agent can read/write with filesystem tools. No DB needed.	2026-02-23
ADR-011	Gateway as separate container from agent	Clean separation: gateway handles I/O + routing, agent handles thinking + acting	2026-02-23
ADR-012	Messaging adapters inside gateway, not separate containers	Simpler (one container), all JS libs anyway, enable/disable via env vars. Matches OpenClaw.	2026-02-23
ADR-013	Docker volumes for persistence	Workspace (memory) and data (sessions, cron) survive container restarts	2026-02-23
ADR-014	4-service separation: moby, gateway, workspace, memory	Each concern in its own container. Clean ownership. Independent scaling/failure.	2026-02-23
ADR-015	Workspace + memory as MCP servers	cagent's `type: mcp` toolset connects moby to services. No direct host mounts on agent.	2026-02-23
ADR-016	Separate workspace and memory volumes	Workspace = host files (projects, code). Memory = agent state (MEMORY.md, daily logs). Different lifecycles, different owners.	2026-02-23
ADR-017	`~/.mobyclaw/` as user data directory, bind-mounted	User-visible, editable, portable, survives `docker system prune`. Not a Docker volume.	2026-02-23
ADR-018	Messaging adapters inside gateway, not separate bridge containers	Simpler, less config, matches OpenClaw. Enable via env var presence.	2026-02-23
ADR-019	Single agent only — no multi-agent support	Mobyclaw is a personal agent, not a platform. One agent (moby), one container. Simplifies routing, config, and mental model. Can always revisit.	2026-02-23
ADR-020	Sessions created with `tools_approved: true`	`cagent serve api` pauses at `tool_call_confirmation` unless the session has `tools_approved: true`. Gateway sets this on session creation. Container isolation provides the safety boundary.	2026-02-23
ADR-021	`.env` file for secrets management	Single file, Docker Compose native, no Swarm/Vault needed. Least-privilege: per-service `environment` blocks control which container sees which var.	2026-02-23
ADR-022	End-to-end streaming via SSE PassThrough	cagent emits tokens in real-time. Gateway streams them through via PassThrough piped to HTTP response. Critical: use `res.on('close')` not `req.on('close')` for disconnect detection. Telegram adapter edits message every ~1s. CLI prints tokens to stdout.	2026-02-23
ADR-023	`docker-compose.override.yml` for per-user config	Base compose stays static + git-committed. Override is auto-generated from `credentials.env` + `workspaces.conf` on every `mobyclaw up`. Docker Compose merges them automatically. Gitignored.	2026-02-23
ADR-024	Separate `credentials.env` from `.env`	`.env` = mobyclaw infra (LLM keys, messaging). `credentials.env` = user service tokens (gh, aws). Different owners, different lifecycle. `credentials.env` lives in `~/.mobyclaw/` (portable with agent state).	2026-02-23
ADR-025	Workspaces as host bind mounts via `workspaces.conf`	Simple `name=path` format in `~/.mobyclaw/workspaces.conf`. CLI manages it (`workspace add/remove/list`). Override generation maps to Docker volumes. Changes require restart.	2026-02-23
ADR-026	Gateway-side scheduler with agent-created schedules via REST API	Agent calls `POST /api/schedules` via curl. Gateway owns timing, persistence, and delivery. Separation: agent composes messages, gateway delivers at the right time. No agent involvement at fire time (pre-composed messages).	2026-02-23
ADR-027	Heartbeat as periodic agent prompt, separate from scheduler	Scheduler = precise dumb timer (30s resolution). Heartbeat = intelligent agent review (15m interval). Different concerns: scheduler delivers pre-composed messages; heartbeat invokes full LLM reasoning. Agent uses `/api/deliver` to proactively message users from heartbeat.	2026-02-23
ADR-028	TASKS.md as agent-managed task store (Markdown)	Flexible Markdown file. Agent writes entries via filesystem tools. `[scheduled]` marker prevents double-scheduling. Channel stored per-task. Heartbeat reviews it. Complements `schedules.json` (gateway-owned): `TASKS.md` is the agent's view, `schedules.json` is the gateway's execution state.	2026-02-23
ADR-029	Channel context injected as message prefix by gateway	Gateway prepends `[context: channel=telegram:123, time=...]` to every user message. Only mechanism available since cagent API has no per-message metadata. Agent extracts channel for schedule creation. Never displayed to user.	2026-02-23
ADR-030	Last active channel for fallback delivery	Gateway tracks last messaging channel used. Fallback when heartbeat/agent needs to deliver without a specific channel target. Resets on restart (acceptable for personal agent).	2026-02-23
ADR-031	Source code mounted at `/source` for self-modification	Agent needs to modify its own Dockerfile, gateway source, compose config, CLI, and documentation. Bind-mounting the project root gives full read-write access. Safety via: git (revert), permission-before-modify policy, syntax checks before rebuild. Four signal types: `restart`, `rebuild`, `rebuild-gateway`, `rebuild-all`.	2026-02-23
ADR-032	Persistent channel store	`ChannelStore` persists known channels to `~/.mobyclaw/channels.json` (one entry per platform). Saved on first message. Schedule API falls back to known channel. Heartbeat includes known channels in prompt. Replaces old in-memory `lastActiveChannel`.	2026-02-24
ADR-033	Schedule pruning — splice-on-delivery	`markDelivered()` and `cancel()` splice entries out of array. `_load()` filters to only `pending` on startup. `schedules.json` only ever contains pending entries. Prevents unbounded growth.	2026-02-24
ADR-034	Heartbeat skip guard	`let running = false` flag prevents heartbeat overlap. If previous heartbeat still running, next tick skips. Uses `try/finally` to reset. Prevents infinite queue buildup at 30s intervals.	2026-02-24
ADR-035	Collect queue mode (OpenClaw-inspired)	Default queue mode coalesces rapid queued messages into a single combined turn. Prevents "continue, continue" spam. Messages separated by `---`. All promises resolve with the same response. Configurable via `QUEUE_MODE` env var.	2026-02-24
ADR-036	Typing indicators on message receipt	Telegram adapter sends `sendChatAction('typing')` immediately when a message is received, before any processing. Refresh every 4s while processing. OpenClaw pattern: `instant` mode. Makes agent feel responsive even during queue waits.	2026-02-24
ADR-037	Queue feedback to user	When a message is queued behind a running task, the user sees a "⏳ Working on something else, I'll get to this next..." message in Telegram. It is deleted automatically when processing starts. SSE endpoint emits a `queued` event. Visible acknowledgment prevents confusion.	2026-02-24
ADR-038	Session daily/idle reset	Sessions auto-reset at configurable hour (default 4 AM) and/or after idle timeout. OpenClaw pattern: daily reset clears stale context, idle reset catches long gaps. `/new` and `/reset` commands force immediate reset. Persisted `lastActivity` timestamp survives restarts.	2026-02-24
ADR-039	/stop abort command	`/stop` in Telegram (or `POST /api/stop`) clears the queue and signals abort on the current run. Returns count of cleared messages. Graceful: doesn't crash the agent, just ends the current turn.	2026-02-24
ADR-040	Queue cap with oldest-drop overflow	Max 20 queued messages (configurable). When cap exceeded, oldest message is dropped with error. Prevents unbounded memory growth from spam or runaway loops. OpenClaw uses summarize policy; we use simple drop for now.	2026-02-24
ADR-041	Debounce on queue drain	1000ms debounce before draining collected messages (collect mode only). Lets rapid messages accumulate before the agent processes them as one turn. Configurable via `QUEUE_DEBOUNCE_MS`.	2026-02-24
ADR-042	Tool Gateway as MCP aggregator in separate container	External service access (Notion, Google, etc.) routed through a dedicated `tool-gateway` container. Manages upstream MCP connections, auth, and token lifecycle independently. Exposes aggregated tools as a single MCP server to cagent via HTTP bridge. Clean separation: agent doesn't know about OAuth, tokens, or MCP wiring.	2026-02-24
ADR-043	Chat-mediated auth for all external services	No CLI commands, no admin UIs for auth. All OAuth/device-code flows are initiated conversationally — user says "connect notion", agent sends auth URL via Telegram, user clicks and authorizes, agent confirms. Mirrors how `gh auth login` worked (Moby sent the device code via Telegram). For OAuth redirect flows (Notion), tool-gateway hosts callback endpoint.	2026-02-24
ADR-044	mcp-bridge: stdio-to-HTTP relay for cagent → tool-gateway	cagent only supports MCP via stdio (`command` + `args`). Tool-gateway runs in a separate container with HTTP. Bridge script in moby container translates stdio ↔ HTTP. ~50 lines, shell or Go. Allows clean container separation while keeping native MCP tool discovery.	2026-02-24
ADR-045	CLI tools (gh, git, curl) installed directly in agent container	If a service has a solid CLI, skip the MCP layer. `gh` already in moby container. Agent uses via shell toolset. Simpler, fewer moving parts. MCP reserved for services that need structured tool schemas or complex auth.	2026-02-24
ADR-046	Zod schemas required for McpServer.tool()	MCP SDK v1.27.0's `McpServer.tool()` requires Zod schema objects, not plain JSON Schema `{type:"string"}`. `isZodRawShapeCompat()` silently rejects plain objects → empty `inputSchema.properties`. All tool definitions (tool-gateway + mcp-bridge re-registration) must use `z.string()`, `z.number()`, etc.	2026-02-24
ADR-047	zod installed globally in moby container	mcp-bridge runs inside moby and needs zod to convert JSON Schema → Zod when re-registering remote tools. Added `zod` to `npm install -g` in Dockerfile alongside `@modelcontextprotocol/sdk`. Bridge uses `NODE_PATH` auto-discovery for global modules.	2026-02-24
ADR-048	Full Playwright browser in tool-gateway	Headless Chromium via Playwright in tool-gateway container for full web interaction (navigate, click, type, fill forms, screenshots). Uses Playwright’s internal `_snapshotForAI()` for accessibility snapshots with aria-ref element targeting — same approach as `@playwright/mcp`. Single persistent browser context with 10min idle auto-close. Browser is ~400MB but enables account creation, multi-step flows, CAPTCHA viewing via screenshots.	2026-02-24
ADR-049	Accessibility snapshots over screenshots for interaction	Agent uses text-based accessibility tree (with ref IDs) to understand and interact with pages. Screenshots are secondary — useful for visual verification (CAPTCHAs, layout) but you "can’t perform actions based on screenshots." Refs change after every action; agent must use refs from the most recent snapshot. Matches Playwright MCP’s design philosophy.	2026-02-24
ADR-050	Recursive JSON Schema → Zod in mcp-bridge	Bridge now handles nested types: arrays (`z.array()`), objects (`z.object()`), enums (`z.enum()`), not just primitives. Required for `browser_fill_form` (array of field objects) and `browser_tabs` (enum action). Single recursive `jsonSchemaToZod()` function.	2026-02-24
ADR-051	Agent max_iterations raised to 15	Browser automation tasks require many sequential tool calls (navigate → snapshot → fill → click → wait → snapshot → ...). The default 5 iterations was too low. 15 allows a realistic multi-page flow while still preventing runaway loops.	2026-02-24
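
The splice-on-delivery pruning in ADR-033 can be sketched roughly as follows. This is a minimal in-memory illustration, not the gateway's actual code: the `ScheduleStore` class name, the `status` and `id` field names, and the private `_remove()` helper are assumptions, and persistence to `schedules.json` is left out.

```javascript
// Hypothetical sketch of ADR-033. Only markDelivered(), cancel(), _load() and
// the "pending" status come from the decision log; everything else is assumed.
class ScheduleStore {
  constructor(entries = []) {
    this.schedules = this._load(entries);
  }

  // On startup, keep only entries that are still pending.
  _load(entries) {
    return entries.filter((s) => s.status === 'pending');
  }

  // Splice the entry out of the array so it never lingers in the store.
  _remove(id) {
    const index = this.schedules.findIndex((s) => s.id === id);
    if (index === -1) return false;
    this.schedules.splice(index, 1);
    return true;
  }

  markDelivered(id) {
    return this._remove(id);
  }

  cancel(id) {
    return this._remove(id);
  }
}
```

Because delivered and cancelled entries are removed rather than flagged, whatever gets written back to disk contains pending entries only, which is what keeps the file from growing without bound.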
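
The skip guard from ADR-034 might look something like the sketch below. The `let running = false` flag and the `try/finally` reset are taken from the log entry; `createHeartbeat`, `runHeartbeat`, and the `'skipped'`/`'ran'` return values are hypothetical names added for illustration.

```javascript
// Hypothetical sketch of the ADR-034 heartbeat skip guard.
// runHeartbeat stands in for whatever actually prompts the agent.
function createHeartbeat(runHeartbeat) {
  let running = false; // true while a heartbeat is in flight

  return async function tick() {
    if (running) return 'skipped'; // previous heartbeat still running: skip this tick
    running = true;
    try {
      await runHeartbeat();
      return 'ran';
    } finally {
      running = false; // always reset, even if runHeartbeat throws
    }
  };
}
```

The returned `tick` function could then be handed to `setInterval`; overlapping ticks return immediately instead of queueing behind a slow heartbeat.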
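
As an illustration of the collect queue mode in ADR-035, the following sketch coalesces everything queued so far into one turn and resolves every sender's promise with the same response. The function names, the entry shape `{ text, resolve, reject }`, and the exact separator string (`---` on its own line) are assumptions; the log only states that messages are separated by `---`.

```javascript
// Hypothetical sketch of ADR-035 collect mode; names are illustrative only.
// Each queued entry is { text, resolve, reject }, where resolve/reject settle
// the promise handed back to whoever sent that message.
function enqueue(queue, text) {
  return new Promise((resolve, reject) => {
    queue.push({ text, resolve, reject });
  });
}

async function drainCollected(queue, runTurn) {
  const batch = queue.splice(0, queue.length); // take everything queued so far
  if (batch.length === 0) return null;

  // Assumed separator layout: "---" on its own line between messages.
  const combined = batch.map((m) => m.text).join('\n---\n');
  try {
    const response = await runTurn(combined); // one agent turn for the whole batch
    batch.forEach((m) => m.resolve(response)); // every sender gets the same response
    return response;
  } catch (err) {
    batch.forEach((m) => m.reject(err));
    throw err;
  }
}
```

In a setup like this, the debounce from ADR-041 would simply delay the call to `drainCollected` so that rapid messages have time to accumulate in `queue`.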
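
The session reset rules in ADR-038 reduce to two checks, sketched below under stated assumptions: the daily boundary is evaluated in the process's local time zone, and `shouldResetSession`, `resetHour`, and `idleMs` are hypothetical names (the log only says the hour defaults to 4 AM and that an idle timeout exists).

```javascript
// Hypothetical sketch of ADR-038. Returns true when the session should be
// reset before handling a message arriving at `now`.
function shouldResetSession(lastActivity, now, { resetHour = 4, idleMs = null } = {}) {
  // Idle reset: the gap since the last activity exceeds the timeout.
  if (idleMs !== null && now.getTime() - lastActivity.getTime() > idleMs) {
    return true;
  }

  // Daily reset: find the most recent reset boundary at or before `now`.
  const boundary = new Date(now.getTime());
  boundary.setHours(resetHour, 0, 0, 0);
  if (boundary.getTime() > now.getTime()) {
    boundary.setDate(boundary.getDate() - 1); // today's boundary has not happened yet
  }

  // Reset if the session was last used before that boundary.
  return lastActivity.getTime() < boundary.getTime();
}
```

Persisting `lastActivity`, as the log describes, is what lets a check like this give the same answer after a gateway restart.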
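
The oldest-drop overflow policy from ADR-040 could be implemented along these lines. The cap of 20 comes from the log; `enqueueCapped`, the entry shape, and the error message are assumptions made for the example.

```javascript
// Hypothetical sketch of ADR-040: cap the queue and drop the oldest entries.
const MAX_QUEUE = 20; // default cap from the log; the real value is configurable

// Each entry is assumed to carry a reject() callback for its sender's promise.
function enqueueCapped(queue, entry, max = MAX_QUEUE) {
  queue.push(entry);
  const dropped = [];
  while (queue.length > max) {
    const oldest = queue.shift(); // the oldest message sits at the front
    oldest.reject(new Error('Queue full: message dropped'));
    dropped.push(oldest);
  }
  return dropped; // lets the caller log or report what was discarded
}
```

Dropping from the front keeps the most recent messages, which are usually the ones the user still cares about. A summarize policy, as the log notes OpenClaw uses, would replace the `reject` call with some form of compaction.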