Dead letter queue (DLQ) | Glossary

A dead letter queue (DLQ) is a queue for messages or jobs that have repeatedly failed processing. Instead of being dropped after N retries, failed items land in the DLQ for manual inspection.

Classic pattern:

1. A job lands in the main queue.
2. A worker tries to process it.
3. If processing fails, the job is retried with exponential backoff (1s, 2s, 4s, 8s...).
4. After 5-10 failed attempts, the job is moved to the DLQ.
5. The DLQ raises an alert (Slack, email) and exposes a UI for inspection.
6. An engineer reviews the failed jobs, fixes the root cause, and replays them from the DLQ back into the main queue.
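
The pattern above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production worker: the names `Job`, `backoff_delay`, `run_worker`, and `replay` are hypothetical, the queues are plain `deque` objects, and the worker sleeps in-line, where a real system would usually schedule a delayed redelivery instead.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Optional


@dataclass
class Job:
    payload: str
    attempts: int = 0
    errors: List[str] = field(default_factory=list)


def backoff_delay(attempt: int, base: float = 1.0) -> float:
    """Delay after failed attempt number `attempt` (1-based): 1s, 2s, 4s, 8s, ..."""
    return base * 2 ** (attempt - 1)


def run_worker(
    main_queue: Deque[Job],
    dlq: Deque[Job],
    handler: Callable[[str], None],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    on_dead_letter: Optional[Callable[[Job], None]] = None,
) -> None:
    """Drain main_queue; jobs that fail max_attempts times move to dlq."""
    while main_queue:
        job = main_queue.popleft()
        try:
            handler(job.payload)
        except Exception as exc:
            job.attempts += 1
            job.errors.append(repr(exc))  # keep the failure history for inspection
            if job.attempts >= max_attempts:
                dlq.append(job)
                if on_dead_letter is not None:
                    on_dead_letter(job)  # e.g. post an alert to Slack or email
            else:
                sleep(backoff_delay(job.attempts))
                main_queue.append(job)


def replay(dlq: Deque[Job], main_queue: Deque[Job]) -> int:
    """Move every dead-lettered job back to the main queue with a fresh retry budget."""
    count = 0
    while dlq:
        job = dlq.popleft()
        job.attempts = 0
        main_queue.append(job)
        count += 1
    return count
```

Passing `sleep` and `on_dead_letter` as parameters keeps the sketch testable: a test can record the delays instead of waiting, and the alert hook can be swapped for whatever notification channel is in use.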

What ends up in the DLQ in practice:

Permanent failures (404, malformed data, banned account)
Schema drift (target API changed format)
Network outages that outlast the retry budget
Software bugs that surfaced only in production

Without a DLQ, failed jobs are silently lost. A customer calls asking "where is my report?" and it turns out the pipeline has been failing for 3 days without anyone noticing. With a DLQ, every failure is visible, recoverable, and auditable.

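Implementation options: RabbitMQ has a built-in dead letter exchange (DLX). AWS SQS has a redrive policy, which moves a message to a DLQ once its receive count exceeds maxReceiveCount. Redis Streams has no built-in DLQ, so it needs manual logic. PostgreSQL works with a status column on the jobs table. See our engineering principle "Dead-letter everything".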