---
summary: "Command queue design that serializes inbound auto-reply runs"
read_when:
- Changing auto-reply execution or concurrency
title: "Command Queue"
---
# Command Queue (2026-01-16)
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
## Why
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids contention over shared resources (session files, logs, CLI stdin) and reduces the chance of hitting upstream rate limits.
## How it works
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; `main` defaults to 4, `subagent` to 8); see the sketch after this list.
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
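As a rough illustration of this layering, here is a minimal TypeScript sketch of a lane-aware FIFO queue with per-lane concurrency caps and the two-level (session lane → global lane) enqueue. The names (`LaneQueue`, `runAutoReply`) and structure are assumptions for illustration, not the actual implementation.

```ts
// A sketch, not the real code: per-lane FIFO with a concurrency cap.
type Task<T> = () => Promise<T>;

interface Lane {
  cap: number;                        // max concurrent tasks in this lane
  active: number;                     // tasks currently running
  fifo: Array<() => Promise<void>>;   // queued task wrappers, FIFO order
}

class LaneQueue {
  private lanes = new Map<string, Lane>();

  // Lane caps roughly mirroring the defaults described above.
  constructor(private caps: Record<string, number> = { main: 4, subagent: 8 }) {}

  run<T>(laneName: string, task: Task<T>): Promise<T> {
    const lane = this.lane(laneName);
    return new Promise<T>((resolve, reject) => {
      lane.fifo.push(async () => {
        try {
          resolve(await task());
        } catch (err) {
          reject(err);
        }
      });
      this.drain(lane);
    });
  }

  private lane(name: string): Lane {
    let lane = this.lanes.get(name);
    if (!lane) {
      // Unconfigured lanes (e.g. session lanes) default to a cap of 1.
      lane = { cap: this.caps[name] ?? 1, active: 0, fifo: [] };
      this.lanes.set(name, lane);
    }
    return lane;
  }

  private drain(lane: Lane): void {
    while (lane.active < lane.cap && lane.fifo.length > 0) {
      const next = lane.fifo.shift()!;
      lane.active++;
      next().finally(() => {
        lane.active--;
        this.drain(lane);              // start the next queued task, if any
      });
    }
  }
}

// Two-level enqueue: one active run per session, global parallelism capped by `main`.
const queue = new LaneQueue();
function runAutoReply(sessionKey: string, run: Task<void>): Promise<void> {
  return queue.run(`session:${sessionKey}`, () => queue.run("main", run));
}
```

Because unconfigured lanes default to a cap of 1, the `session:<key>` lane alone guarantees one run per session, while the `main` lane enforces the global cap.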
## Queue modes (per channel)
Inbound messages can steer the current run, wait for a followup turn, or do both (a sketch of these behaviors follows the list):

- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If the run is not streaming, this falls back to `followup`.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
- `queue` (legacy alias): same as `steer`.
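To make the modes concrete, here is a hedged TypeScript sketch of how an inbound message might be dispatched while a run is already active for the same session; the handler and the `ActiveRun`/`SessionQueue` shapes are hypothetical, not the real code.

```ts
// Illustrative dispatch of an inbound message while a run is active.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog" | "interrupt" | "queue";

interface ActiveRun {
  streaming: boolean;
  steer(text: string): void;     // inject at the next tool boundary
  abort(): Promise<void>;        // hard-stop the run (legacy interrupt)
}

interface SessionQueue {
  followups: string[];           // messages drained on the next agent turn
}

async function onInboundWhileRunning(
  mode: QueueMode,
  text: string,
  run: ActiveRun,
  session: SessionQueue,
): Promise<void> {
  switch (mode) {
    case "queue":          // legacy alias for steer
    case "steer":
      if (run.streaming) run.steer(text);
      else session.followups.push(text);   // fall back to a followup turn
      break;
    case "followup":
    case "collect":        // collect coalesces followups into one turn when drained
      session.followups.push(text);
      break;
    case "steer-backlog":  // steer now *and* keep the message for a followup turn
      if (run.streaming) run.steer(text);
      session.followups.push(text);
      break;
    case "interrupt":      // legacy: abort, then run the newest message
      await run.abort();
      session.followups.push(text);
      break;
  }
}
```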
`steer-backlog` means you can get a followup response after the steered run, so streaming surfaces can appear to answer twice. Prefer `collect` or `steer` if you want one response per inbound message.

Send `/queue collect` as a standalone command to set it for the current session, or set `messages.queue.byChannel.discord: "collect"` in config.
Defaults (when unset in config):

- All surfaces → `collect`

Configure globally or per channel via `messages.queue`:
```json5
{
  messages: {
    queue: {
      mode: "collect",
      debounceMs: 1000,
      cap: 20,
      drop: "summarize",
      byChannel: { discord: "collect" },
    },
  },
}
```
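As a sketch of how the effective mode could be resolved for an inbound message, assuming the per-session override beats `byChannel`, which beats the global `mode`, with `collect` as the final default (the function and config shapes are illustrative, not the actual API):

```ts
// Sketch only: resolving the effective queue mode for one inbound message.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog" | "interrupt" | "queue";

interface QueueConfig {
  mode?: QueueMode;                        // global default mode
  debounceMs?: number;
  cap?: number;
  drop?: "old" | "new" | "summarize";
  byChannel?: Record<string, QueueMode>;   // per-channel overrides, e.g. { discord: "collect" }
}

function resolveQueueMode(
  cfg: QueueConfig,
  channel: string,
  sessionOverride?: QueueMode,             // set via `/queue <mode>` for this session
): QueueMode {
  return (
    sessionOverride ??                     // 1. per-session override (assumed to win)
    cfg.byChannel?.[channel] ??            // 2. per-channel setting
    cfg.mode ??                            // 3. global mode
    "collect"                              // 4. documented default
  );
}

// With the config above, a Discord message resolves to "collect":
// resolveQueueMode({ mode: "collect", byChannel: { discord: "collect" } }, "discord");
```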
## Queue options
Options apply to `followup`, `collect`, and `steer-backlog`, and to `steer` when it falls back to followup (a sketch of the handling follows the list):

- `debounceMs`: wait for a quiet gap after the last inbound message before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`). `summarize` keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.

Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
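A minimal sketch of how these options could interact on the followup queue, assuming the oldest message is evicted for `old` and `summarize` while the newest is ignored for `new`; the data shapes and names are illustrative only:

```ts
// Illustrative handling of the followup queue options described above.
interface FollowupQueue {
  messages: string[];
  dropped: string[];              // remembered only for drop: "summarize"
  timer?: ReturnType<typeof setTimeout>;
}

function enqueueFollowup(
  q: FollowupQueue,
  text: string,
  opts: { debounceMs: number; cap: number; drop: "old" | "new" | "summarize" },
  startTurn: (messages: string[]) => void,
): void {
  // Enforce the cap with the configured overflow policy.
  if (q.messages.length >= opts.cap) {
    if (opts.drop === "new") return;                        // ignore the newest message
    const evicted = q.messages.shift()!;                    // otherwise evict the oldest
    if (opts.drop === "summarize") q.dropped.push(evicted); // keep it for a summary bullet
  }
  q.messages.push(text);

  // Debounce: wait for a quiet gap before starting the followup turn.
  if (q.timer) clearTimeout(q.timer);
  q.timer = setTimeout(() => {
    const summary = q.dropped.length
      ? [`(dropped ${q.dropped.length} earlier messages:\n- ${q.dropped.join("\n- ")})`]
      : [];
    startTurn([...summary, ...q.messages]);  // collect: one coalesced turn
    q.messages = [];
    q.dropped = [];
  }, opts.debounceMs);
}
```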
## Per-session overrides
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize` (see the parsing sketch after this list).
- `/queue default` or `/queue reset` clears the session override.
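For illustration, a hypothetical parser for these commands; the actual command handling may differ, and `parseQueueCommand`/`parseDuration` are made-up names:

```ts
// Hypothetical parser for `/queue` overrides, e.g. "/queue collect debounce:2s cap:25 drop:summarize".
interface QueueOverride {
  mode?: string;
  debounceMs?: number;
  cap?: number;
  drop?: "old" | "new" | "summarize";
  reset?: boolean;
}

function parseQueueCommand(input: string): QueueOverride | null {
  const [cmd, ...args] = input.trim().split(/\s+/);
  if (cmd !== "/queue" || args.length === 0) return null;

  const override: QueueOverride = {};
  for (const arg of args) {
    if (arg === "default" || arg === "reset") return { reset: true };  // clear the override
    const [key, value] = arg.split(":");
    if (value === undefined) { override.mode = key; continue; }        // bare token = mode
    if (key === "debounce") override.debounceMs = parseDuration(value);
    else if (key === "cap") override.cap = Number(value);
    else if (key === "drop") override.drop = value as QueueOverride["drop"];
  }
  return override;
}

// "2s" -> 2000, "500ms" -> 500, bare numbers are treated as milliseconds.
function parseDuration(value: string): number {
  const match = /^(\d+)(ms|s)?$/.exec(value);
  if (!match) return NaN;
  return Number(match[1]) * (match[2] === "s" ? 1000 : 1);
}

// parseQueueCommand("/queue collect debounce:2s cap:25 drop:summarize")
// -> { mode: "collect", debounceMs: 2000, cap: 25, drop: "summarize" }
```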
## Scope and guarantees
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.

## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- To gauge queue depth, enable verbose logs and watch the same queue timing lines.