Streaming + chunking
OpenClaw has two separate "streaming" layers:
- Block streaming (channels): emit completed blocks as the assistant writes. These are normal channel messages (not token deltas).
- Token-ish streaming (Telegram only): update a draft bubble with partial text while generating; final message is sent at the end.
There is no real token streaming to external channel messages today. Telegram draft streaming is the only partial-stream surface.
Block streaming (channel messages)
Block streaming sends assistant output in coarse chunks as it becomes available.
Model output
├─ text_delta/events
│  ├─ (blockStreamingBreak=text_end)
│  │  └─ chunker emits blocks as buffer grows
│  └─ (blockStreamingBreak=message_end)
│     └─ chunker flushes at message_end
└─ channel send (block replies)
Legend:
- text_delta/events: model stream events (may be sparse for non-streaming models).
- chunker: EmbeddedBlockChunker applying min/max bounds + break preference.
- channel send: actual outbound messages (block replies).
Controls:
- agents.defaults.blockStreamingDefault: "on"/"off" (default off).
- Channel overrides: *.blockStreaming (and per-account variants) to force "on"/"off" per channel.
- agents.defaults.blockStreamingBreak: "text_end" or "message_end".
- agents.defaults.blockStreamingChunk: { minChars, maxChars, breakPreference? }.
- agents.defaults.blockStreamingCoalesce: { minChars?, maxChars?, idleMs? } (merge streamed blocks before send).
- Channel hard cap: *.textChunkLimit (e.g., channels.whatsapp.textChunkLimit).
- Channel chunk mode: *.chunkMode is "length" (default) or "newline"; newline splits on blank lines (paragraph boundaries) before falling back to length chunking.
- Discord soft cap: channels.discord.maxLinesPerMessage (default 17) splits tall replies to avoid UI clipping.
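Assuming OpenClaw reads a single JSON config whose root contains the agents and channels key paths listed above (the file layout is inferred from those paths, and all values here are illustrative, not recommended defaults), the controls combine like this:

```json
{
  "agents": {
    "defaults": {
      "blockStreamingDefault": "on",
      "blockStreamingBreak": "text_end",
      "blockStreamingChunk": { "minChars": 300, "maxChars": 1200, "breakPreference": "paragraph" },
      "blockStreamingCoalesce": { "minChars": 500, "idleMs": 1000 }
    }
  },
  "channels": {
    "whatsapp": { "blockStreaming": true, "textChunkLimit": 4000, "chunkMode": "newline" },
    "discord": { "maxLinesPerMessage": 17 }
  }
}
```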
Boundary semantics:
- text_end: stream blocks as soon as chunker emits; flush on each text_end.
- message_end: wait until assistant message finishes, then flush buffered output.
message_end still uses the chunker if the buffered text exceeds maxChars, so it can emit multiple chunks at the end.
Chunking algorithm (low/high bounds)
Block chunking is implemented by EmbeddedBlockChunker:
- Low bound: don't emit until buffer >= minChars (unless forced).
- High bound: prefer splits before maxChars; if forced, split at maxChars.
- Break preference: paragraph → newline → sentence → whitespace → hard break.
- Code fences: never split inside fences; when forced at maxChars, close + reopen the fence to keep Markdown valid.
maxChars is clamped to the channel textChunkLimit, so you can't exceed per-channel caps.
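As a rough sketch of how the low/high bounds and break preference interact (written in Python for illustration; this is not the actual EmbeddedBlockChunker, and the function name, separator list, and omitted code-fence handling are all simplifications):

```python
# Illustrative sketch of low/high-bound chunking with a break preference.
# NOT the real EmbeddedBlockChunker: split_block and BREAKS are invented
# names, and fence-aware splitting is deliberately left out.

BREAKS = ["\n\n", "\n", ". ", " "]  # paragraph > newline > sentence > whitespace


def split_block(buffer: str, min_chars: int, max_chars: int) -> tuple[list[str], str]:
    """Return (emitted chunks, leftover text still waiting for more input)."""
    chunks = []
    while len(buffer) > max_chars:
        window = buffer[:max_chars]
        cut = -1
        for sep in BREAKS:                 # try the nicest boundary first
            pos = window.rfind(sep)
            if pos >= min_chars:           # low bound: never emit a tiny chunk
                cut = pos + len(sep)
                break
        if cut == -1:
            cut = max_chars                # hard break: no acceptable boundary
        chunks.append(buffer[:cut].rstrip())
        buffer = buffer[cut:].lstrip()
    return chunks, buffer
```

With this sketch, a buffer containing two long paragraphs splits cleanly at the blank line, while a single unbroken run of text falls back to a hard cut at max_chars.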
Coalescing (merge streamed blocks)
When block streaming is enabled, OpenClaw can merge consecutive block chunks before sending them out. This reduces "single-line spam" while still providing progressive output.
- Coalescing waits for idle gaps (idleMs) before flushing.
- Buffers are capped by maxChars and will flush if they exceed it.
- minChars prevents tiny fragments from sending until enough text accumulates (final flush always sends remaining text).
- Joiner is derived from blockStreamingChunk.breakPreference (paragraph → \n\n, newline → \n, sentence → space).
- Channel overrides are available via *.blockStreamingCoalesce (including per-account configs).
- Default coalesce minChars is bumped to 1500 for Signal/Slack/Discord unless overridden.
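Putting the coalescing knobs together (again assuming a JSON config root matching the key paths above; values are illustrative, and the per-channel override shown lowers Signal below its bumped 1500-char default):

```json
{
  "agents": {
    "defaults": {
      "blockStreamingCoalesce": { "minChars": 400, "maxChars": 2000, "idleMs": 1500 }
    }
  },
  "channels": {
    "signal": { "blockStreamingCoalesce": { "minChars": 800 } }
  }
}
```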
Human-like pacing between blocks
When block streaming is enabled, you can add a randomized pause between block replies (after the first block). This makes multi-bubble responses feel more natural.
- Config: agents.defaults.humanDelay (override per agent via agents.list[].humanDelay).
- Modes: off (default), natural (800–2500ms), custom (minMs/maxMs).
- Applies only to block replies, not final replies or tool summaries.
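A sketch of the default-plus-override pattern (the agents.list entry shape, including the "id" field and the agent name, is assumed for illustration; the humanDelay keys come from the modes listed above):

```json
{
  "agents": {
    "defaults": { "humanDelay": { "mode": "natural" } },
    "list": [
      { "id": "support-bot", "humanDelay": { "mode": "custom", "minMs": 500, "maxMs": 1500 } }
    ]
  }
}
```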
"Stream chunks or everything"
This maps to:
- Stream chunks: blockStreamingDefault: "on" + blockStreamingBreak: "text_end" (emit as you go). Non-Telegram channels also need *.blockStreaming: true.
- Stream everything at end: blockStreamingBreak: "message_end" (flush once, possibly multiple chunks if very long).
- No block streaming: blockStreamingDefault: "off" (only final reply).
Channel note: For non-Telegram channels, block streaming is off unless *.blockStreaming is explicitly set to true. Telegram can stream drafts (channels.telegram.streamMode) without block replies.
Config location reminder: the blockStreaming* defaults live under agents.defaults, not the root config.
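The three mappings above, written out as agents.defaults fragments (JSONC comments are annotations only; assuming the JSON config layout inferred earlier):

```jsonc
// 1) Stream chunks as they complete (non-Telegram channels also need
//    the per-channel *.blockStreaming: true):
{ "blockStreamingDefault": "on", "blockStreamingBreak": "text_end" }

// 2) Flush everything at message end (may still emit multiple chunks):
{ "blockStreamingDefault": "on", "blockStreamingBreak": "message_end" }

// 3) No block streaming; only the final reply:
{ "blockStreamingDefault": "off" }
```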
Telegram draft streaming (token-ish)
Telegram is the only channel with draft streaming:
- Uses Bot API sendMessageDraft in private chats with topics.
- channels.telegram.streamMode: "partial" | "block" | "off".
- partial: draft updates with the latest stream text.
- block: draft updates in chunked blocks (same chunker rules).
- off: no draft streaming.
- Draft chunk config (only for streamMode: "block"): channels.telegram.draftChunk (defaults: minChars: 200, maxChars: 800).
- Draft streaming is separate from block streaming; block replies are off by default and only enabled by *.blockStreaming: true on non-Telegram channels.
- Final reply is still a normal message.
- /reasoning stream writes reasoning into the draft bubble (Telegram only).
When draft streaming is active, OpenClaw disables block streaming for that reply to avoid double-streaming.
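A Telegram-side example pulling the bullets above together (keys are taken from this section; the draftChunk values shown are the stated defaults, written out explicitly):

```json
{
  "channels": {
    "telegram": {
      "streamMode": "block",
      "draftChunk": { "minChars": 200, "maxChars": 800 }
    }
  }
}
```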
Telegram (private + topics)
└─ sendMessageDraft (draft bubble)
   ├─ streamMode=partial → update latest text
   ├─ streamMode=block → chunker updates draft
   └─ final reply → normal message
Legend:
- sendMessageDraft: Telegram draft bubble (not a real message).
- final reply: normal Telegram message send.