Gateway service runbook
Last updated: 2025-12-09
What it is
- The always-on process that owns the single Baileys/Telegram connection and the control/event plane.
- Replaces the legacy gateway command. CLI entry point: openclaw gateway.
- Runs until stopped; exits non-zero on fatal errors so the supervisor restarts it.
How to run (local)
openclaw gateway --port 18789
# for full debug/trace logs in stdio:
openclaw gateway --port 18789 --verbose
# if the port is busy, terminate listeners then start:
openclaw gateway --force
# dev loop (auto-reload on TS changes):
pnpm gateway:watch
- Config hot reload watches ~/.openclaw/openclaw.json (or OPENCLAW_CONFIG_PATH).
  - Default mode: gateway.reload.mode="hybrid" (hot-apply safe changes, restart on critical).
  - Hot reload uses in-process restart via SIGUSR1 when needed.
  - Disable with gateway.reload.mode="off".
- Binds WebSocket control plane to 127.0.0.1:<port> (default 18789).
- The same port also serves HTTP (control UI, hooks, A2UI). Single-port multiplex.
  - OpenAI Chat Completions (HTTP): /v1/chat/completions.
  - OpenResponses (HTTP): /v1/responses.
  - Tools Invoke (HTTP): /tools/invoke.
- Starts a Canvas file server by default on canvasHost.port (default 18793), serving http://<gateway-host>:18793/__openclaw__/canvas/ from ~/.openclaw/workspace/canvas. Disable with canvasHost.enabled=false or OPENCLAW_SKIP_CANVAS_HOST=1.
- Logs to stdout; use launchd/systemd to keep it alive and rotate logs.
- Pass --verbose to mirror debug logging (handshakes, req/res, events) from the log file into stdio when troubleshooting.
- --force uses lsof to find listeners on the chosen port, sends SIGTERM, logs what it killed, then starts the gateway (fails fast if lsof is missing).
- If you run under a supervisor (launchd/systemd/mac app child-process mode), a stop/restart typically sends SIGTERM; older builds may surface this as pnpm ELIFECYCLE exit code 143 (SIGTERM), which is a normal shutdown, not a crash.
- SIGUSR1 triggers an in-process restart when authorized (gateway tool/config apply/update, or enable commands.restart for manual restarts).
- Gateway auth is required by default: set gateway.auth.token (or OPENCLAW_GATEWAY_TOKEN) or gateway.auth.password. Clients must send connect.params.auth.token/password unless using Tailscale Serve identity.
- The wizard now generates a token by default, even on loopback.
- Port precedence: --port > OPENCLAW_GATEWAY_PORT > gateway.port > default 18789.
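The port precedence above can be sketched as a small resolver. This is an illustration only: resolveGatewayPort, PortSources, and isValidPort are hypothetical names, not the gateway's actual code, and silently skipping an invalid value (instead of failing) is an assumption.

```typescript
// Hypothetical helper (not part of OpenClaw): resolves the gateway port
// using the documented precedence:
//   --port > OPENCLAW_GATEWAY_PORT > gateway.port > default 18789
const DEFAULT_GATEWAY_PORT = 18789;

interface PortSources {
  cliPort?: number;    // value passed via --port, if any
  envPort?: string;    // raw OPENCLAW_GATEWAY_PORT value, if set
  configPort?: number; // gateway.port from the config file, if set
}

function isValidPort(n: number | undefined): n is number {
  return n !== undefined && Number.isInteger(n) && n > 0 && n <= 65535;
}

function resolveGatewayPort(src: PortSources): number {
  // Environment values arrive as strings; empty strings count as unset.
  const envPort =
    src.envPort !== undefined && src.envPort.trim() !== ""
      ? Number(src.envPort)
      : undefined;

  // Assumption: invalid candidates are skipped rather than treated as fatal.
  for (const candidate of [src.cliPort, envPort, src.configPort]) {
    if (isValidPort(candidate)) return candidate;
  }
  return DEFAULT_GATEWAY_PORT;
}
```

For example, resolveGatewayPort({ envPort: "19002", configPort: 19003 }) picks the environment value, because no --port was given.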
Remote access
- Tailscale/VPN preferred; otherwise SSH tunnel:
ssh -N -L 18789:127.0.0.1:18789 user@host
- Clients then connect to ws://127.0.0.1:18789 through the tunnel.
- If a token is configured, clients must include it in connect.params.auth.token even over the tunnel.
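As a rough sketch of what "include it in connect.params.auth.token" means for a client, the helper below builds a connect request that carries the token. The field names follow the connect frame described in this runbook; buildConnectFrame, the protocol version numbers, and the caps type (string[]) are assumptions made for illustration.

```typescript
// Illustrative only: a trimmed-down connect frame. Optional client fields
// (displayName, deviceFamily, ...) are omitted for brevity.
interface ConnectClient {
  id: string;
  version: string;
  platform: string;
  mode: string;
  instanceId?: string;
}

interface ConnectFrame {
  type: "req";
  id: string;
  method: "connect";
  params: {
    minProtocol: number;
    maxProtocol: number;
    client: ConnectClient;
    caps: string[]; // assumption: caps is a list of strings
    auth?: { token?: string; password?: string };
  };
}

// Hypothetical helper: attaches auth only when a token is supplied.
function buildConnectFrame(
  id: string,
  client: ConnectClient,
  token?: string,
): ConnectFrame {
  const frame: ConnectFrame = {
    type: "req",
    id,
    method: "connect",
    // Assumption: protocol version 1 is a placeholder, not the real value.
    params: { minProtocol: 1, maxProtocol: 1, client, caps: [] },
  };
  if (token) frame.params.auth = { token };
  return frame;
}
```

A client would send JSON.stringify(buildConnectFrame(...)) as its first frame on ws://127.0.0.1:18789, whether it reaches that address directly or through the SSH tunnel.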
Multiple gateways (same host)
Usually unnecessary: one Gateway can serve multiple messaging channels and agents. Use multiple Gateways only for redundancy or strict isolation (e.g. a rescue bot).
Supported if you isolate state + config and use unique ports. Full guide: Multiple gateways.
Service names are profile-aware:
- macOS: bot.molt.<profile> (legacy com.openclaw.* may still exist)
- Linux: openclaw-gateway-<profile>.service
- Windows: OpenClaw Gateway (<profile>)
Install metadata is embedded in the service config:
- OPENCLAW_SERVICE_MARKER=openclaw
- OPENCLAW_SERVICE_KIND=gateway
- OPENCLAW_SERVICE_VERSION=<version>
Rescue-Bot Pattern: keep a second Gateway isolated with its own profile, state dir, workspace, and base port spacing. Full guide: Rescue-bot guide.
Dev profile (--dev)
Fast path: run a fully-isolated dev instance (config/state/workspace) without touching your primary setup.
openclaw --dev setup
openclaw --dev gateway --allow-unconfigured
# then target the dev instance:
openclaw --dev status
openclaw --dev health
Defaults (can be overridden via env/flags/config):
- OPENCLAW_STATE_DIR=~/.openclaw-dev
- OPENCLAW_CONFIG_PATH=~/.openclaw-dev/openclaw.json
- OPENCLAW_GATEWAY_PORT=19001 (Gateway WS + HTTP)
- browser control service port = 19003 (derived: gateway.port+2, loopback only)
- canvasHost.port=19005 (derived: gateway.port+4)
- agents.defaults.workspace default becomes ~/.openclaw/workspace-dev when you run setup/onboard under --dev.
Derived ports (rules of thumb):
- Base port = gateway.port (or OPENCLAW_GATEWAY_PORT / --port)
- browser control service port = base + 2 (loopback only)
- canvasHost.port = base + 4 (or OPENCLAW_CANVAS_HOST_PORT / config override)
- Browser profile CDP ports auto-allocate from browser.controlPort + 9 .. + 108 (persisted per profile).
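The derivation rules above are plain arithmetic on the base port. A minimal sketch (derivePorts and DerivedPorts are hypothetical names, and env/config overrides are ignored here):

```typescript
// Illustrative only: computes the default derived ports from a base port.
interface DerivedPorts {
  gateway: number;        // base port (Gateway WS + HTTP)
  browserControl: number; // base + 2 (loopback only)
  canvasHost: number;     // base + 4
  cdpRange: { first: number; last: number }; // browser.controlPort + 9 .. + 108
}

function derivePorts(base: number): DerivedPorts {
  const browserControl = base + 2;
  return {
    gateway: base,
    browserControl,
    canvasHost: base + 4,
    cdpRange: { first: browserControl + 9, last: browserControl + 108 },
  };
}
```

With the dev base port 19001 this gives 19003 for browser control and 19005 for the canvas host, matching the dev defaults listed above. With the default base 18789 it gives a canvas host port of 18793.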
Checklist per instance:
- unique gateway.port
- unique OPENCLAW_CONFIG_PATH
- unique OPENCLAW_STATE_DIR
- unique agents.defaults.workspace
- separate WhatsApp numbers (if using WA)
Service install per profile:
openclaw --profile main gateway install
openclaw --profile rescue gateway install
Example:
OPENCLAW_CONFIG_PATH=~/.openclaw/a.json OPENCLAW_STATE_DIR=~/.openclaw-a openclaw gateway --port 19001
OPENCLAW_CONFIG_PATH=~/.openclaw/b.json OPENCLAW_STATE_DIR=~/.openclaw-b openclaw gateway --port 19002
Protocol (operator view)
- Full docs: Gateway protocol and Bridge protocol (legacy).
- Mandatory first frame from client: req {type:"req", id, method:"connect", params:{minProtocol,maxProtocol,client:{id,displayName?,version,platform,deviceFamily?,modelIdentifier?,mode,instanceId?}, caps, auth?, locale?, userAgent? } }.
- Gateway replies res {type:"res", id, ok:true, payload:hello-ok } (or ok:false with an error, then closes).
- After handshake:
  - Requests: {type:"req", id, method, params} → {type:"res", id, ok, payload|error}
  - Events: {type:"event", event, payload, seq?, stateVersion?}
- Structured presence entries: {host, ip, version, platform?, deviceFamily?, modelIdentifier?, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? } (for WS clients, instanceId comes from connect.client.instanceId).
- agent responses are two-stage: first res ack {runId,status:"accepted"}, then a final res {runId,status:"ok"|"error",summary} after the run finishes; streamed output arrives as event:"agent".
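The two-stage agent response is easy to mishandle if a client treats the first res as final. The sketch below tracks runs by runId and reports when a run has actually finished. AgentRunTracker and its method names are hypothetical, and the payload shapes are simplified from the description above.

```typescript
// Illustrative only: payloads carried by the two res frames of an agent call.
type AgentRes =
  | { runId: string; status: "accepted" }                        // first-stage ack
  | { runId: string; status: "ok" | "error"; summary?: string }; // final result

type RunState = "pending" | "ok" | "error";

class AgentRunTracker {
  private runs = new Map<string, RunState>();

  // Returns true when `res` is the final response for its run.
  handle(res: AgentRes): boolean {
    if (res.status === "accepted") {
      // Do not downgrade a run that has already finished.
      if (!this.runs.has(res.runId)) this.runs.set(res.runId, "pending");
      return false;
    }
    this.runs.set(res.runId, res.status);
    return true;
  }

  state(runId: string): RunState | undefined {
    return this.runs.get(runId);
  }
}
```

Streamed output still arrives separately as event:"agent" frames; the tracker only decides when the request itself is done.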
Methods (initial set)
- health: full health snapshot (same shape as openclaw health --json).
- status: short summary.
- system-presence: current presence list.
- system-event: post a presence/system note (structured).
- send: send a message via the active channel(s).
- agent: run an agent turn (streams events back on same connection).
- node.list: list paired + currently-connected nodes (includes caps, deviceFamily, modelIdentifier, paired, connected, and advertised commands).
- node.describe: describe a node (capabilities + supported node.invoke commands; works for paired nodes and for currently-connected unpaired nodes).
- node.invoke: invoke a command on a node (e.g. canvas.*, camera.*).
- node.pair.*: pairing lifecycle (request, list, approve, reject, verify).
See also: Presence for how presence is produced/deduped and why a stable client.instanceId matters.
Events
- agent: streamed tool/output events from the agent run (seq-tagged).
- presence: presence updates (deltas with stateVersion) pushed to all connected clients.
- tick: periodic keepalive/no-op to confirm liveness.
- shutdown: Gateway is exiting; payload includes reason and optional restartExpectedMs. Clients should reconnect.
WebChat integration
- WebChat is a native SwiftUI UI that talks directly to the Gateway WebSocket for history, sends, abort, and events.
- Remote use goes through the same SSH/Tailscale tunnel; if a gateway token is configured, the client includes it during connect.
- macOS app connects via a single WS (shared connection); it hydrates presence from the initial snapshot and listens for presence events to update the UI.
Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo's generator).
- Protocol definitions are the source of truth; regenerate schema/models with:
  - pnpm protocol:gen
  - pnpm protocol:gen:swift
Connection snapshot
- hello-ok includes a snapshot with presence, health, stateVersion, and uptimeMs plus policy {maxPayload,maxBufferedBytes,tickIntervalMs} so clients can render immediately without extra requests.
- health/system-presence remain available for manual refresh, but are not required at connect time.
Error codes (res.error shape)
- Errors use { code, message, details?, retryable?, retryAfterMs? }.
- Standard codes:
  - NOT_LINKED: WhatsApp not authenticated.
  - AGENT_TIMEOUT: agent did not respond within the configured deadline.
  - INVALID_REQUEST: schema/param validation failed.
  - UNAVAILABLE: Gateway is shutting down or a dependency is unavailable.
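One way a client might act on retryable and retryAfterMs is sketched below. The error shape comes from this section. The policy itself is an assumption: decideRetry is a hypothetical helper, a missing retryable flag is treated as "do not retry", and exponential backoff is used when no retryAfterMs is given.

```typescript
// Shape of res.error as described above.
interface GatewayError {
  code: string;
  message: string;
  details?: unknown;
  retryable?: boolean;
  retryAfterMs?: number;
}

type RetryDecision = { retry: false } | { retry: true; delayMs: number };

// Hypothetical client-side policy, not the gateway's own behavior.
function decideRetry(
  err: GatewayError,
  attempt: number, // 0 for the first retry
  baseDelayMs = 500,
  maxDelayMs = 30000,
): RetryDecision {
  // Assumption: only retry when the server explicitly marks the error retryable.
  if (err.retryable !== true) return { retry: false };

  // Prefer the server's hint when present.
  if (typeof err.retryAfterMs === "number" && err.retryAfterMs >= 0) {
    return { retry: true, delayMs: err.retryAfterMs };
  }

  // Otherwise fall back to capped exponential backoff.
  const delayMs = Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
  return { retry: true, delayMs };
}
```

Under this policy an INVALID_REQUEST without a retryable flag is never retried, which fits its meaning: resending the same malformed request would fail validation again.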
Keepalive behavior
- tick events (or WS ping/pong) are emitted periodically so clients know the Gateway is alive even when no traffic occurs.
- Send/agent acknowledgements remain separate responses; do not overload ticks for sends.
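A client can turn ticks into a liveness check by remembering when it last saw any traffic and comparing against tickIntervalMs from the hello-ok policy. A minimal sketch, where TickWatchdog is a hypothetical name and the grace factor of 2 missed intervals is an assumption:

```typescript
// Illustrative only: declares the connection stale when no frame has been
// seen for more than graceFactor tick intervals.
class TickWatchdog {
  private lastSeenMs: number;
  private readonly tickIntervalMs: number;
  private readonly graceFactor: number;

  constructor(tickIntervalMs: number, nowMs: number, graceFactor = 2) {
    this.tickIntervalMs = tickIntervalMs;
    this.graceFactor = graceFactor;
    this.lastSeenMs = nowMs;
  }

  // Call on every inbound frame (tick, event, or response).
  noteActivity(nowMs: number): void {
    this.lastSeenMs = nowMs;
  }

  isStale(nowMs: number): boolean {
    return nowMs - this.lastSeenMs > this.tickIntervalMs * this.graceFactor;
  }
}
```

Time is passed in explicitly so the logic can be tested without timers; a real client would feed it Date.now() and reconnect when isStale returns true.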
Replay / gaps
- Events are not replayed. Clients detect seq gaps and should refresh (health + system-presence) before continuing. WebChat and macOS clients now auto-refresh on gap.
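Gap detection can be sketched as follows. It assumes seq increases by exactly 1 per event on a connection, which this runbook does not state explicitly; SeqGapDetector is a hypothetical name.

```typescript
type SeqResult = "ok" | "gap" | "stale";

// Illustrative only: tracks the last seen seq and flags jumps.
class SeqGapDetector {
  private lastSeq: number | undefined = undefined;

  observe(seq: number | undefined): SeqResult {
    // seq is optional on events; frames without it cannot be checked.
    if (seq === undefined) return "ok";

    const prev = this.lastSeq;
    // A repeated or older seq is ignored rather than treated as a gap.
    if (prev !== undefined && seq <= prev) return "stale";

    this.lastSeq = seq;
    // Assumption: consecutive events differ by exactly 1.
    if (prev !== undefined && seq > prev + 1) return "gap";
    return "ok";
  }

  // Call after reconnecting, since a new connection starts a new sequence.
  reset(): void {
    this.lastSeq = undefined;
  }
}
```

On "gap", a client would call health and system-presence to refresh its state before processing further events.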
Supervision (macOS example)
- Use launchd to keep the service alive:
  - Program: path to openclaw
  - Arguments: gateway
  - KeepAlive: true
  - StandardOut/Err: file paths or syslog
- On failure, launchd restarts; fatal misconfig should keep exiting so the operator notices.
- LaunchAgents are per-user and require a logged-in session; for headless setups use a custom LaunchDaemon (not shipped).
- openclaw gateway install writes ~/Library/LaunchAgents/bot.molt.gateway.plist (or bot.molt.<profile>.plist; legacy com.openclaw.* is cleaned up).
- openclaw doctor audits the LaunchAgent config and can update it to current defaults.
Gateway service management (CLI)
Use the Gateway CLI for install/start/stop/restart/status:
openclaw gateway status
openclaw gateway install
openclaw gateway stop
openclaw gateway restart
openclaw logs --follow
Notes:
- gateway status probes the Gateway RPC by default using the service's resolved port/config (override with --url).
- gateway status --deep adds system-level scans (LaunchDaemons/system units).
- gateway status --no-probe skips the RPC probe (useful when networking is down).
- gateway status --json is stable for scripts.
- gateway status reports supervisor runtime (launchd/systemd running) separately from RPC reachability (WS connect + status RPC).
- gateway status prints config path + probe target to avoid "localhost vs LAN bind" confusion and profile mismatches.
- gateway status includes the last gateway error line when the service looks running but the port is closed.
- logs tails the Gateway file log via RPC (no manual tail/grep needed).
- If other gateway-like services are detected, the CLI warns unless they are OpenClaw profile services. We still recommend one gateway per machine for most setups; use isolated profiles/ports for redundancy or a rescue bot. See Multiple gateways.
- Cleanup: openclaw gateway uninstall (current service) and openclaw doctor (legacy migrations).
- gateway install is a no-op when already installed; use openclaw gateway install --force to reinstall (profile/env/path changes).
Bundled mac app:
- OpenClaw.app can bundle a Node-based gateway relay and install a per-user LaunchAgent labeled bot.molt.gateway (or bot.molt.<profile>; legacy com.openclaw.* labels still unload cleanly).
- To stop it cleanly, use openclaw gateway stop (or launchctl bootout gui/$UID/bot.molt.gateway).
- To restart, use openclaw gateway restart (or launchctl kickstart -k gui/$UID/bot.molt.gateway).
- launchctl only works if the LaunchAgent is installed; otherwise use openclaw gateway install first.
- Replace the label with bot.molt.<profile> when running a named profile.
Supervision (systemd user unit)
OpenClaw installs a systemd user service by default on Linux/WSL2. We recommend user services for single-user machines (simpler env, per-user config). Use a system service for multi-user or always-on servers (no lingering required, shared supervision).
openclaw gateway install writes the user unit. openclaw doctor audits the unit and can update it to match the current recommended defaults.
Create ~/.config/systemd/user/openclaw-gateway[-<profile>].service:
[Unit]
Description=OpenClaw Gateway (profile: <profile>, v<version>)
After=network-online.target
Wants=network-online.target
[Service]
ExecStart=/usr/local/bin/openclaw gateway --port 18789
Restart=always
RestartSec=5
Environment=OPENCLAW_GATEWAY_TOKEN=
WorkingDirectory=/home/youruser
[Install]
WantedBy=default.target
Enable lingering (required so the user service survives logout/idle):
sudo loginctl enable-linger youruser
Onboarding runs this on Linux/WSL2 (may prompt for sudo; writes /var/lib/systemd/linger). Then enable the service:
systemctl --user enable --now openclaw-gateway[-<profile>].service
Alternative (system service) - for always-on or multi-user servers, you can install a systemd system unit instead of a user unit (no lingering needed). Create /etc/systemd/system/openclaw-gateway[-<profile>].service (copy the unit above, switch WantedBy=multi-user.target, set User= + WorkingDirectory=), then:
sudo systemctl daemon-reload
sudo systemctl enable --now openclaw-gateway[-<profile>].service
Windows (WSL2)
Windows installs should use WSL2 and follow the Linux systemd section above.
Operational checks
- Liveness: open WS and send req:connect → expect res with payload.type="hello-ok" (with snapshot).
- Readiness: call health → expect ok: true and a linked channel in linkChannel (when applicable).
- Debug: subscribe to tick and presence events; ensure status shows linked/auth age; presence entries show Gateway host and connected clients.
Safety guarantees
- Assume one Gateway per host by default; if you run multiple profiles, isolate ports/state and target the right instance.
- No fallback to direct Baileys connections; if the Gateway is down, sends fail fast.
- Non-connect first frames or malformed JSON are rejected and the socket is closed.
- Graceful shutdown: emit shutdown event before closing; clients must handle close + reconnect.
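The first-frame rule can be illustrated with a simplified check. The real server validates frames with AJV against the generated JSON Schema; the function below is only a stand-in that covers the two rejection cases named above, and checkFirstFrame is a hypothetical name.

```typescript
type FirstFrameCheck = { ok: true } | { ok: false; reason: string };

// Illustrative only: accepts a raw text frame and decides whether it is an
// acceptable first frame (a connect request).
function checkFirstFrame(raw: string): FirstFrameCheck {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "malformed JSON" };
  }

  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "frame is not an object" };
  }

  const frame = parsed as Record<string, unknown>;
  if (frame.type !== "req" || frame.method !== "connect") {
    return { ok: false, reason: "first frame must be a connect request" };
  }
  return { ok: true };
}
```

On a failed check the socket is closed, so a client that sends, say, a health request before connect has to open a new connection and start with connect.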
CLI helpers
- openclaw gateway health|status: request health/status over the Gateway WS.
- openclaw message send --target <num> --message "hi" [--media ...]: send via Gateway (idempotent for WhatsApp).
- openclaw agent --message "hi" --to <num>: run an agent turn (waits for final by default).
- openclaw gateway call <method> --params '{"k":"v"}': raw method invoker for debugging.
- openclaw gateway stop|restart: stop/restart the supervised gateway service (launchd/systemd).
- Gateway helper subcommands assume a running gateway on --url; they no longer auto-spawn one.
Migration guidance
- Retire uses of the legacy gateway command and the legacy TCP control port; openclaw gateway is the current entry point.
- Update clients to speak the WS protocol with mandatory connect and structured presence.