Dispatch System
How Opengram batches, queues, and delivers messages to your agent workers.
The dispatch system is the bridge between the Opengram UI and your agent. When a user sends a message, the dispatch system collects it, optionally batches it with other pending messages, and makes it available for your worker to claim and process.
Dispatch modes
Configure the mode with the server.dispatch.mode field in opengram.config.json. Three modes are available:
immediate
Each message is dispatched the moment it arrives. No batching, no debouncing. Use this when latency matters more than efficiency, or when your agent processes each message independently.
sequential
Messages are dispatched one at a time in the order they were received. The next message is not dispatched until the current one is completed. This is useful when message order is critical and your agent cannot handle concurrent inputs.
batched_sequential (default)
Messages are collected into batches using a debounce window, then dispatched sequentially. This is the default mode and works well for most use cases -- it handles rapid-fire messages from users without overwhelming your agent.
Batching is controlled by three timing parameters:
| Parameter | Default | Description |
|---|---|---|
| batchDebounceMs | 600 | Wait this long after the last message before sealing the batch. |
| typingGraceMs | 2000 | Extra grace period if the user is still typing. |
| maxBatchWaitMs | 30000 | Maximum time to wait before forcing the batch to dispatch. |
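The interplay of the three timers can be sketched as a single sealing decision. This is an illustrative model, not Opengram's implementation; the constant names mirror the config fields.

```python
# Default values of the three batching parameters from the table above.
BATCH_DEBOUNCE_MS = 600
TYPING_GRACE_MS = 2000
MAX_BATCH_WAIT_MS = 30000

def should_seal(now_ms: int, batch_start_ms: int,
                last_message_ms: int, user_typing: bool) -> bool:
    """Return True when a pending batch should be sealed and dispatched."""
    # Hard ceiling: never hold a batch longer than maxBatchWaitMs.
    if now_ms - batch_start_ms >= MAX_BATCH_WAIT_MS:
        return True
    # While the user is typing, require a longer quiet window (typingGraceMs);
    # otherwise the normal debounce window applies.
    quiet_needed = TYPING_GRACE_MS if user_typing else BATCH_DEBOUNCE_MS
    return now_ms - last_message_ms >= quiet_needed
```

Note that maxBatchWaitMs wins over the typing grace period: even a continuously typing user cannot delay dispatch past the ceiling.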
```json
{
  "server": {
    "dispatch": {
      "mode": "batched_sequential",
      "batchDebounceMs": 600,
      "typingGraceMs": 2000,
      "maxBatchWaitMs": 30000
    }
  }
}
```

Input sources
A dispatch can be triggered by different input sources:
- user_message -- the user sent one or more messages in the chat.
- request_resolved -- the user resolved an interactive request (e.g. answered a question or submitted a form).
Your worker receives the input source in the dispatch payload so it can decide how to handle each case.
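A worker's handler might branch on the input source like this. The payload field name (inputSource) and shape are assumptions for the sketch, not the documented schema.

```python
def handle_dispatch(payload: dict) -> str:
    """Route a dispatch payload based on its input source (field name assumed)."""
    source = payload["inputSource"]
    if source == "user_message":
        # One or more chat messages arrived: feed the whole batch to the agent.
        return f"processing {len(payload['messages'])} message(s)"
    if source == "request_resolved":
        # The user answered an interactive request: resume with its result.
        return "resuming with resolved request"
    raise ValueError(f"unknown input source: {source}")
```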
Worker claiming flow
Your agent worker pulls dispatches from Opengram using a claim-based protocol:
- Claim -- call POST /api/v1/dispatch/claim to claim a single dispatch, or POST /api/v1/dispatch/claim-many to claim up to N dispatches at once. The response includes the batch payload with all messages, chat context, and an agentIdHint indicating which agent configuration applies.
- Heartbeat -- while processing, call POST /api/v1/dispatch/{id}/heartbeat periodically to extend the lease. If the lease expires without a heartbeat, the dispatch is returned to the queue automatically by a background lease sweeper.
- Complete or fail -- call POST /api/v1/dispatch/{id}/complete when done, or POST /api/v1/dispatch/{id}/fail if something went wrong.
During processing, your worker can send messages, stream tokens, attach files, and create interactive requests using the standard API endpoints.
A successful claim returns 200 with the batch payload. When no work is available, the endpoint long-polls for up to claimWaitMs (default: 10 seconds) before returning 204 No Content. Your worker should loop and call claim again.
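The control flow of that loop can be sketched with the HTTP layer abstracted away. Here claim_once stands in for a POST to the claim endpoint and returns a status code plus an optional payload; the loop is the point, not the transport.

```python
def run_claim_loop(claim_once, process, max_polls: int) -> list:
    """Poll the claim endpoint: process the batch on 200, loop again on 204."""
    handled = []
    for _ in range(max_polls):
        status, payload = claim_once()  # e.g. POST /api/v1/dispatch/claim
        if status == 204:
            continue  # no work arrived within claimWaitMs; poll again
        if status == 200:
            handled.append(process(payload))
    return handled
```

In a real worker the loop would run forever, heartbeat while processing, and call complete or fail when done; max_polls exists here only to keep the sketch bounded.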
For the full batch payload schema and a complete claim loop example, see Building an Agent — The batch payload.
Lease sweeper
A background process periodically checks for batches whose lease has expired (the worker stopped heartbeating or crashed). Expired leases are automatically returned to the queue so another worker can claim them. The sweeper runs every schedulerTickMs (default: 500ms).
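One sweeper pass amounts to scanning claimed batches for expired lease deadlines and requeuing them. The data layout below (a dict of dispatch id to lease-expiry timestamp) is an assumption made for illustration.

```python
def sweep_expired(claimed: dict, now_ms: int) -> list:
    """Remove expired leases from `claimed`; return the requeued dispatch ids."""
    expired = [d_id for d_id, lease_expires_ms in claimed.items()
               if lease_expires_ms <= now_ms]
    for d_id in expired:
        del claimed[d_id]  # in Opengram, the dispatch goes back on the queue here
    return expired
```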
Autoscaling
The dispatch system supports autoscaling the number of concurrent workers it expects:
| Parameter | Default | Description |
|---|---|---|
| execution.autoscaleEnabled | true | Whether autoscaling is active. |
| execution.minConcurrency | 2 | Minimum concurrent workers. |
| execution.maxConcurrency | 10 | Maximum concurrent workers. |
| execution.scaleCooldownMs | 5000 | Cooldown period (ms) before scaling down. |
These settings control how claim-many distributes work. When the queue is deep, Opengram signals workers to scale up. When idle, it scales back down after the cooldown period. The maximum number of batches returned by a single claim-many call is controlled by claim.claimManyLimit (default: 10, hard cap: 50).
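The effect of the concurrency bounds can be shown with a naive scaling target. Opengram's actual scaling formula is not documented here; this sketch only illustrates how minConcurrency and maxConcurrency clamp whatever signal is derived from queue depth.

```python
MIN_CONCURRENCY = 2
MAX_CONCURRENCY = 10

def target_workers(queue_depth: int) -> int:
    """Naive target: one worker per queued batch, clamped to the configured bounds."""
    return max(MIN_CONCURRENCY, min(queue_depth, MAX_CONCURRENCY))
```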
Retry behavior
When your worker calls /fail with retryable: true, the dispatch is retried with exponential backoff:
- Base delay: retryBaseMs (default: 500ms)
- Maximum delay: retryMaxMs (default: 30s)
- Maximum attempts: maxAttempts (default: 8)
Your worker controls whether a failure is retryable via the retryable field in the /fail request body. You can also override the next retry delay with retryDelayMs. If retryable is false, or all attempts are exhausted, the dispatch is marked as permanently failed and a user-visible system message is posted to the chat.
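Under those defaults, the backoff schedule looks like the following. Whether Opengram adds jitter is not stated in this doc, so treat this as the uncapped-then-capped shape of the schedule rather than the exact delays.

```python
RETRY_BASE_MS = 500
RETRY_MAX_MS = 30000

def retry_delay_ms(attempt: int) -> int:
    """Delay before retry number `attempt` (1-based): base * 2^(attempt-1), capped."""
    return min(RETRY_BASE_MS * 2 ** (attempt - 1), RETRY_MAX_MS)
```

With the defaults this yields 500ms, 1s, 2s, 4s, ... until the 30s cap, after which every remaining attempt waits the full retryMaxMs.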
Configuration reference
All dispatch settings live under server.dispatch in opengram.config.json. Every field is optional and falls back to its default.
| Parameter | Default | Description |
|---|---|---|
| mode | "batched_sequential" | Dispatch mode: immediate, sequential, or batched_sequential. |
| batchDebounceMs | 600 | Wait after last message before sealing a batch. |
| typingGraceMs | 2000 | Extra grace period while the user is typing. |
| maxBatchWaitMs | 30000 | Maximum time before forcing a batch to dispatch. |
| schedulerTickMs | 500 | Polling interval for the batch scheduler and lease sweeper. |
| leaseMs | 30000 | Default lease duration for claimed batches. |
| heartbeatIntervalMs | 5000 | Recommended heartbeat interval. |
| claimWaitMs | 10000 | Long-poll timeout for claim requests. |
| retryBaseMs | 500 | Exponential backoff base delay. |
| retryMaxMs | 30000 | Maximum backoff delay. |
| maxAttempts | 8 | Maximum retry attempts before permanent failure. |
| execution.autoscaleEnabled | true | Whether autoscaling is active. |
| execution.minConcurrency | 2 | Minimum concurrent workers. |
| execution.maxConcurrency | 10 | Maximum concurrent workers. |
| execution.scaleCooldownMs | 5000 | Cooldown before scaling down. |
| claim.claimManyLimit | 10 | Max batches per claim-many call (hard cap: 50). |
```json
{
  "server": {
    "dispatch": {
      "mode": "batched_sequential",
      "batchDebounceMs": 600,
      "typingGraceMs": 2000,
      "maxBatchWaitMs": 30000,
      "schedulerTickMs": 500,
      "leaseMs": 30000,
      "heartbeatIntervalMs": 5000,
      "claimWaitMs": 10000,
      "retryBaseMs": 500,
      "retryMaxMs": 30000,
      "maxAttempts": 8,
      "execution": {
        "autoscaleEnabled": true,
        "minConcurrency": 2,
        "maxConcurrency": 10,
        "scaleCooldownMs": 5000
      },
      "claim": {
        "claimManyLimit": 10
      }
    }
  }
}
```