---
summary: "Command queue design that serializes inbound auto-reply runs"
read_when:
- Changing auto-reply execution or concurrency
title: "Command Queue"
---
# Command Queue (2026-01-16)
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
## Why
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids contention over shared resources (session files, logs, CLI stdin) and reduces the chance of hitting upstream rate limits.
## How it works
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; `main` defaults to 4, `subagent` to 8); see the sketch after this list.
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
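As a rough illustration of this layering, here is a minimal TypeScript sketch of a lane-aware FIFO queue with per-lane concurrency caps and the two-level (session lane → global lane) enqueue. The names (`LaneQueue`, `runAutoReply`) and structure are assumptions for illustration, not the actual implementation.

```ts
// A sketch, not the real code: per-lane FIFO with a concurrency cap.
type Task<T> = () => Promise<T>;

interface Lane {
  cap: number;                        // max concurrent tasks in this lane
  active: number;                     // tasks currently running
  fifo: Array<() => Promise<void>>;   // queued task wrappers, FIFO order
}

class LaneQueue {
  private lanes = new Map<string, Lane>();

  // Lane caps roughly mirroring the defaults described above.
  constructor(private caps: Record<string, number> = { main: 4, subagent: 8 }) {}

  run<T>(laneName: string, task: Task<T>): Promise<T> {
    const lane = this.lane(laneName);
    return new Promise<T>((resolve, reject) => {
      lane.fifo.push(async () => {
        try {
          resolve(await task());
        } catch (err) {
          reject(err);
        }
      });
      this.drain(lane);
    });
  }

  private lane(name: string): Lane {
    let lane = this.lanes.get(name);
    if (!lane) {
      // Unconfigured lanes (e.g. session lanes) default to a cap of 1.
      lane = { cap: this.caps[name] ?? 1, active: 0, fifo: [] };
      this.lanes.set(name, lane);
    }
    return lane;
  }

  private drain(lane: Lane): void {
    while (lane.active < lane.cap && lane.fifo.length > 0) {
      const next = lane.fifo.shift()!;
      lane.active++;
      next().finally(() => {
        lane.active--;
        this.drain(lane);              // start the next queued task, if any
      });
    }
  }
}

// Two-level enqueue: one active run per session, global parallelism capped by `main`.
const queue = new LaneQueue();
function runAutoReply(sessionKey: string, run: Task<void>): Promise<void> {
  return queue.run(`session:${sessionKey}`, () => queue.run("main", run));
}
```

Because unconfigured lanes default to a cap of 1, the `session:<key>` lane alone guarantees one run per session, while the `main` lane enforces the global cap.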
## Queue modes (per channel)
Inbound messages can steer the current run, wait for a followup turn, or do both (a sketch of these behaviors follows the list):

- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If the run is not streaming, this falls back to `followup`.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
- `queue` (legacy alias): same as `steer`.
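To make the modes concrete, here is a hedged TypeScript sketch of how an inbound message might be dispatched while a run is already active for the same session; the handler and the `ActiveRun`/`SessionQueue` shapes are hypothetical, not the real code.

```ts
// Illustrative dispatch of an inbound message while a run is active.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog" | "interrupt" | "queue";

interface ActiveRun {
  streaming: boolean;
  steer(text: string): void;     // inject at the next tool boundary
  abort(): Promise<void>;        // hard-stop the run (legacy interrupt)
}

interface SessionQueue {
  followups: string[];           // messages drained on the next agent turn
}

async function onInboundWhileRunning(
  mode: QueueMode,
  text: string,
  run: ActiveRun,
  session: SessionQueue,
): Promise<void> {
  switch (mode) {
    case "queue":          // legacy alias for steer
    case "steer":
      if (run.streaming) run.steer(text);
      else session.followups.push(text);   // fall back to a followup turn
      break;
    case "followup":
    case "collect":        // collect coalesces followups into one turn when drained
      session.followups.push(text);
      break;
    case "steer-backlog":  // steer now *and* keep the message for a followup turn
      if (run.streaming) run.steer(text);
      session.followups.push(text);
      break;
    case "interrupt":      // legacy: abort, then run the newest message
      await run.abort();
      session.followups.push(text);
      break;
  }
}
```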
`steer-backlog` means you can get a followup response after the steered run, so streaming surfaces can appear to answer twice. Prefer `collect` or `steer` if you want one response per inbound message.

Send `/queue collect` as a standalone command to set it for the current session, or set `messages.queue.byChannel.discord: "collect"` in config.
Defaults (when unset in config):

- All surfaces → `collect`

Configure globally or per channel via `messages.queue`:
```json5
{
  messages: {
    queue: {
      mode: "collect",
      debounceMs: 1000,
      cap: 20,
      drop: "summarize",
      byChannel: { discord: "collect" },
    },
  },
}
```
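As a sketch of how the effective mode could be resolved for an inbound message, assuming the per-session override beats `byChannel`, which beats the global `mode`, with `collect` as the final default (the function and config shapes are illustrative, not the actual API):

```ts
// Sketch only: resolving the effective queue mode for one inbound message.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog" | "interrupt" | "queue";

interface QueueConfig {
  mode?: QueueMode;                        // global default mode
  debounceMs?: number;
  cap?: number;
  drop?: "old" | "new" | "summarize";
  byChannel?: Record<string, QueueMode>;   // per-channel overrides, e.g. { discord: "collect" }
}

function resolveQueueMode(
  cfg: QueueConfig,
  channel: string,
  sessionOverride?: QueueMode,             // set via `/queue <mode>` for this session
): QueueMode {
  return (
    sessionOverride ??                     // 1. per-session override (assumed to win)
    cfg.byChannel?.[channel] ??            // 2. per-channel setting
    cfg.mode ??                            // 3. global mode
    "collect"                              // 4. documented default
  );
}

// With the config above, a Discord message resolves to "collect":
// resolveQueueMode({ mode: "collect", byChannel: { discord: "collect" } }, "discord");
```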
## Queue options
Options apply to `followup`, `collect`, and `steer-backlog`, and to `steer` when it falls back to followup (a sketch of the handling follows the list):

- `debounceMs`: wait for a quiet gap after the last inbound message before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`). `summarize` keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.

Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
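A minimal sketch of how these options could interact on the followup queue, assuming the oldest message is evicted for `old` and `summarize` while the newest is ignored for `new`; the data shapes and names are illustrative only:

```ts
// Illustrative handling of the followup queue options described above.
interface FollowupQueue {
  messages: string[];
  dropped: string[];              // remembered only for drop: "summarize"
  timer?: ReturnType<typeof setTimeout>;
}

function enqueueFollowup(
  q: FollowupQueue,
  text: string,
  opts: { debounceMs: number; cap: number; drop: "old" | "new" | "summarize" },
  startTurn: (messages: string[]) => void,
): void {
  // Enforce the cap with the configured overflow policy.
  if (q.messages.length >= opts.cap) {
    if (opts.drop === "new") return;                        // ignore the newest message
    const evicted = q.messages.shift()!;                    // otherwise evict the oldest
    if (opts.drop === "summarize") q.dropped.push(evicted); // keep it for a summary bullet
  }
  q.messages.push(text);

  // Debounce: wait for a quiet gap before starting the followup turn.
  if (q.timer) clearTimeout(q.timer);
  q.timer = setTimeout(() => {
    const summary = q.dropped.length
      ? [`(dropped ${q.dropped.length} earlier messages:\n- ${q.dropped.join("\n- ")})`]
      : [];
    startTurn([...summary, ...q.messages]);  // collect: one coalesced turn
    q.messages = [];
    q.dropped = [];
  }, opts.debounceMs);
}
```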
## Per-session overrides
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize` (see the parsing sketch after this list).
- `/queue default` or `/queue reset` clears the session override.
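For illustration, a hypothetical parser for these commands; the actual command handling may differ, and `parseQueueCommand`/`parseDuration` are made-up names:

```ts
// Hypothetical parser for `/queue` overrides, e.g. "/queue collect debounce:2s cap:25 drop:summarize".
interface QueueOverride {
  mode?: string;
  debounceMs?: number;
  cap?: number;
  drop?: "old" | "new" | "summarize";
  reset?: boolean;
}

function parseQueueCommand(input: string): QueueOverride | null {
  const [cmd, ...args] = input.trim().split(/\s+/);
  if (cmd !== "/queue" || args.length === 0) return null;

  const override: QueueOverride = {};
  for (const arg of args) {
    if (arg === "default" || arg === "reset") return { reset: true };  // clear the override
    const [key, value] = arg.split(":");
    if (value === undefined) { override.mode = key; continue; }        // bare token = mode
    if (key === "debounce") override.debounceMs = parseDuration(value);
    else if (key === "cap") override.cap = Number(value);
    else if (key === "drop") override.drop = value as QueueOverride["drop"];
  }
  return override;
}

// "2s" -> 2000, "500ms" -> 500, bare numbers are treated as milliseconds.
function parseDuration(value: string): number {
  const match = /^(\d+)(ms|s)?$/.exec(value);
  if (!match) return NaN;
  return Number(match[1]) * (match[2] === "s" ? 1000 : 1);
}

// parseQueueCommand("/queue collect debounce:2s cap:25 drop:summarize")
// -> { mode: "collect", debounceMs: 2000, cap: 25, drop: "summarize" }
```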
## Scope and guarantees
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.

## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- To gauge queue depth, enable verbose logs and watch the same queue timing lines.