PostgreSQL Internals
Overview
This document describes the internal implementation of @queuert/postgres — the tables it creates, how it uses PostgreSQL-specific features for correctness and performance, and how notifications propagate between workers.
Schema
The adapter creates its schema via migrateToLatest(). All objects live under a configurable PostgreSQL schema (default: public) with a table name prefix (default: queuert_) for namespace isolation.
Custom Enum
```sql
CREATE TYPE job_status AS ENUM ('blocked', 'pending', 'running', 'completed');
```

PostgreSQL enums provide type safety at the database level — invalid status values are rejected by the engine rather than relying on application-level checks.
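For example, an insert with a value outside the enum fails at the engine level. This is a hypothetical statement with representative error text, not output captured from the adapter:

```sql
-- 'cancelled' is not a member of job_status, so the engine rejects the row:
INSERT INTO job (type_name, status) VALUES ('send-email', 'cancelled');
-- ERROR:  invalid input value for enum job_status: "cancelled"
```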
Job Table
The job table stores all job state:
| Column | Type | Description |
|---|---|---|
| id | configurable (default: uuid) | Primary key. Type and default expression are set via idType / idDefault |
| type_name | text | Job type identifier |
| chain_id | same as id | Foreign key to root job — every job in a chain points to the root |
| chain_type_name | text | Type name of the chain (copied from root for query efficiency) |
| chain_index | integer | Position in chain (0 for root, incrementing for continuations) |
| input | jsonb | Job input data |
| output | jsonb | Completion output (null until completed) |
| status | job_status | Current state: blocked, pending, running, or completed |
| created_at | timestamptz | When the job was created |
| scheduled_at | timestamptz | Earliest time the job can be acquired |
| completed_at | timestamptz | When the job completed (null until completed) |
| completed_by | text | Worker ID that completed the job (null for workerless) |
| attempt | integer | Number of processing attempts (starts at 0) |
| last_attempt_at | timestamptz | When the last attempt started |
| last_attempt_error | jsonb | Error from last failed attempt |
| leased_by | text | Worker ID holding the lease |
| leased_until | timestamptz | Lease expiry time |
| deduplication_key | text | Key for chain deduplication |
| chain_trace_context | text | W3C traceparent for chain-level spans |
| trace_context | text | W3C traceparent for job-level spans |
The chain_id foreign key references job(id), forming a self-referential relationship where all jobs in a chain point to the root job (chain_index = 0).
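A condensed DDL sketch implied by the table above — not the adapter's exact migration (the real table name carries the configured prefix, and defaults, constraints, and the configurable id type may differ):

```sql
CREATE TABLE job (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),  -- type/default are configurable
  type_name text NOT NULL,
  chain_id uuid NOT NULL REFERENCES job (id),     -- self-referential: points to the chain root
  chain_type_name text NOT NULL,
  chain_index integer NOT NULL,                   -- 0 for the root job
  input jsonb,
  output jsonb,
  status job_status NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  scheduled_at timestamptz NOT NULL,
  completed_at timestamptz,
  completed_by text,
  attempt integer NOT NULL DEFAULT 0,
  last_attempt_at timestamptz,
  last_attempt_error jsonb,
  leased_by text,
  leased_until timestamptz,
  deduplication_key text,
  chain_trace_context text,
  trace_context text
);
```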
Job Blocker Table
The job_blocker table tracks dependencies between jobs and chains:
| Column | Type | Description |
|---|---|---|
| job_id | foreign key to job(id) | The blocked job |
| blocked_by_chain_id | foreign key to job(id) | Root job ID of the blocker chain |
| index | integer | Position in the blockers array |
| trace_context | text | PRODUCER span context for blocker resolution |
Primary key: (job_id, blocked_by_chain_id) — each job–blocker pair is unique.
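A DDL sketch of the blocker table, again simplified from what the migration actually emits (prefix omitted, id type assumed uuid):

```sql
CREATE TABLE job_blocker (
  job_id uuid NOT NULL REFERENCES job (id),               -- the blocked job
  blocked_by_chain_id uuid NOT NULL REFERENCES job (id),  -- root job of the blocker chain
  index integer NOT NULL,                                 -- position in the blockers array
  trace_context text,
  PRIMARY KEY (job_id, blocked_by_chain_id)
);
```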
Migration Table
The migration table tracks applied schema migrations:
| Column | Type | Description |
|---|---|---|
| name | text | Migration identifier (e.g., 20240101000000_initial_schema) |
| applied_at | timestamptz | When the migration was applied |
Indexes
Most indexes use partial conditions (WHERE clauses) to minimize size and target specific query patterns.
Job Acquisition
```sql
CREATE INDEX job_acquisition_idx ON job (type_name, scheduled_at)
  WHERE status = 'pending';
```

Speeds up acquireJob — only pending jobs participate in the index.
Chain Uniqueness
```sql
CREATE UNIQUE INDEX job_chain_index_idx ON job (chain_id, chain_index);
```

Guarantees that each position in a chain holds at most one job.
Deduplication
```sql
CREATE INDEX job_deduplication_idx ON job (deduplication_key, created_at DESC)
  WHERE deduplication_key IS NOT NULL AND chain_index = 0;
```

Fast lookup for existing chains with the same deduplication key. Only root jobs (chain_index = 0) are indexed.
Lease Expiry
```sql
CREATE INDEX job_expired_lease_idx ON job (type_name, leased_until)
  WHERE status = 'running' AND leased_until IS NOT NULL;
```

The reaper uses this to find timed-out jobs efficiently.
Blocker Lookups
```sql
CREATE INDEX job_blocker_chain_idx ON job_blocker (blocked_by_chain_id);
```

Fast reverse lookup — given a completed chain, find all jobs it was blocking.
Listing Indexes
Five indexes support the listing and filtering queries used by the dashboard and listJobs/listJobChains APIs:
```sql
CREATE INDEX job_chain_listing_idx ON job (created_at DESC) WHERE chain_index = 0;
CREATE INDEX job_listing_idx ON job (created_at DESC);
CREATE INDEX job_listing_status_idx ON job (status, created_at DESC);
CREATE INDEX job_listing_type_name_idx ON job (type_name, created_at DESC);
CREATE INDEX job_chain_listing_type_name_idx ON job (type_name, created_at DESC) WHERE chain_index = 0;
```

Locking
The adapter uses row-level locking exclusively — no advisory locks.
FOR UPDATE SKIP LOCKED — Job Acquisition
The core acquisition query atomically selects and claims a job:

```sql
WITH acquired_job AS (
  SELECT id FROM job
  WHERE type_name IN (...)
    AND status = 'pending'
    AND scheduled_at <= now()
  ORDER BY scheduled_at ASC
  LIMIT 1
  FOR UPDATE SKIP LOCKED
)
UPDATE job
SET status = 'running', attempt = attempt + 1
WHERE id = (SELECT id FROM acquired_job)
RETURNING *, EXISTS(...) AS has_more
```

FOR UPDATE SKIP LOCKED is the key mechanism:
- FOR UPDATE locks the selected row, preventing other transactions from modifying it
- SKIP LOCKED skips rows already locked by other transactions instead of waiting
This means multiple workers can acquire jobs concurrently without blocking each other — each worker atomically claims the next available job. The has_more flag in RETURNING tells the worker whether to immediately attempt another acquisition or wait for a notification.
The same pattern is used for lease reaping (reapExpiredJobLease), where expired leases are reclaimed without blocking active transactions.
FOR UPDATE — Mutations
Operations that modify a specific job (e.g., completing a job, renewing a lease) use FOR UPDATE without SKIP LOCKED:

```sql
SELECT * FROM job WHERE id = $1 FOR UPDATE
```

This blocks until the row is available, ensuring the operation sees the latest state. Used in getJobForUpdate and getLatestChainJobForUpdate.
Deadlock Prevention in Deletion
When deleting connected chains, the adapter locks rows in physical (ctid) order:

```sql
SELECT id FROM job WHERE chain_id = ANY($1) ORDER BY ctid FOR UPDATE
```

Ordering by ctid ensures all concurrent deletions acquire locks in the same physical order, preventing deadlock cycles that would occur with arbitrary ordering.
Transaction Management
The adapter uses explicit BEGIN/COMMIT/ROLLBACK with savepoints for nested operations:

```sql
SAVEPOINT queuert_sp
-- user callback executes here
RELEASE SAVEPOINT queuert_sp
-- or, on error:
ROLLBACK TO SAVEPOINT queuert_sp
```

Savepoints enable partial rollback within a transaction — if a user callback fails, the savepoint rolls back its effects without aborting the entire transaction.
Notifications (LISTEN/NOTIFY)
PostgreSQL’s built-in LISTEN/NOTIFY mechanism provides low-latency event delivery between processes without polling.
Channels
The adapter uses three notification channels (configurable prefix, default queuert):
| Channel | Published When | Payload | Purpose |
|---|---|---|---|
| {prefix}_sched | Jobs become pending | Job type name | Wake idle workers |
| {prefix}_chainc | Chain completes | Chain ID | Wake clients awaiting chain results |
| {prefix}_owls | Lease expires and job is reaped | Job ID | Notify workers of ownership loss |
Publishing
Notifications are published via pg_notify():

```sql
SELECT pg_notify($1, $2)
```

When called inside withTransaction, the notification is delivered after the transaction commits — PostgreSQL guarantees this atomicity.
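A sketch of the commit-time delivery semantics, with illustrative channel and payload values (the adapter publishes through its driver, not a psql session):

```sql
BEGIN;
UPDATE job SET status = 'pending' WHERE id = $1;
SELECT pg_notify('queuert_sched', 'send-email');
-- Listeners see nothing yet: the notification is queued with the transaction.
COMMIT;
-- Only now is 'send-email' delivered to connections listening on queuert_sched.
-- A ROLLBACK would have discarded the notification along with the update.
```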
Subscribing
Each channel subscription maintains a dedicated connection that issues LISTEN and stays open, receiving events via the PostgreSQL protocol’s asynchronous notification mechanism. The adapter uses a shared listener pattern that multiplexes multiple callbacks on a single subscription, lazily starting when the first subscriber registers and stopping when the last unsubscribes.
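At the SQL level the subscriber side looks like the following (shown as a psql-style transcript for illustration; the displayed notification line is psql's rendering, not something the adapter prints):

```sql
LISTEN queuert_sched;
-- This connection now receives asynchronous notifications, e.g. in psql:
-- Asynchronous notification "queuert_sched" with payload "send-email"
-- received from server process with PID 12345.
```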
No Hint Optimization
Unlike the Redis and NATS adapters, the PostgreSQL notify adapter does not implement hint-based thundering-herd optimization. All listening workers query the database on each notification. This is acceptable because FOR UPDATE SKIP LOCKED ensures only one worker acquires each job — redundant queries are cheap, not harmful.
CTE Patterns
The adapter uses CTEs (Common Table Expressions) extensively to perform multi-step operations in a single round-trip:
- Job creation: Deduplication check + batch INSERT in one query
- Blocker management: INSERT blockers + UPDATE job status from pending to blocked
- Unblocking: UPDATE jobs from blocked to pending when all their blockers have completed (blocker rows are retained to propagate trace context into the unblocked job)
- Chain deletion: Recursive CTE to find connected chains + cascading DELETE
- Connected chain discovery: Recursive CTE traversing blocker relationships in both directions
All writeable CTEs use RETURNING to propagate results between steps without additional round-trips.
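As an illustration of the pattern, here is a heavily simplified sketch of unblocking in one round-trip — not the adapter's actual query (in particular, it checks only the root job's status rather than the last job in each blocker chain):

```sql
-- Sketch: when chain $1 completes, move jobs with no remaining
-- incomplete blockers from 'blocked' to 'pending'.
WITH unblockable AS (
  SELECT b.job_id
  FROM job_blocker b
  WHERE b.blocked_by_chain_id = $1
    AND NOT EXISTS (
      -- any other blocker chain that has not completed yet
      SELECT 1
      FROM job_blocker other
      JOIN job root ON root.id = other.blocked_by_chain_id
      WHERE other.job_id = b.job_id
        AND root.status <> 'completed'
    )
)
UPDATE job
SET status = 'pending'
WHERE id IN (SELECT job_id FROM unblockable)
  AND status = 'blocked'
RETURNING id;  -- RETURNING feeds the next step without another round-trip
```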
Vacuum Tuning
The adapter configures aggressive autovacuum and storage settings on the job tables via the vacuum_tuning migration:
Fillfactor
```sql
ALTER TABLE job SET (fillfactor = 75);
```

Fillfactor reserves 25% free space per heap page. Jobs go through multiple in-place status updates (pending → running → completed, plus lease renewals), and PostgreSQL can perform these as HOT (Heap-Only Tuple) updates when free space is available in the same page. HOT updates avoid creating new index entries, reducing both index bloat and vacuum workload.
The job_blocker table does not set a fillfactor because blockers are inserted and deleted without intermediate updates.
Autovacuum
```sql
ALTER TABLE job SET (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_analyze_scale_factor = 0.02,
  autovacuum_vacuum_cost_delay = 0
);
```

| Setting | Default | Configured | Effect |
|---|---|---|---|
| autovacuum_vacuum_scale_factor | 0.2 (20%) | 0.02 (2%) | Triggers vacuum after 2% dead tuples instead of 20% |
| autovacuum_analyze_scale_factor | 0.1 (10%) | 0.02 (2%) | Re-analyzes planner statistics after 2% row changes |
| autovacuum_vacuum_cost_delay | 2ms | 0 | Removes I/O throttling — vacuum runs at full speed |
These settings are applied per-table (not server-wide) to the job table. The job_blocker table sets only autovacuum_vacuum_cost_delay = 0 since blockers are inserted and deleted without intermediate updates, producing less churn than the job table.
On-Demand Vacuum
The adapter also exposes a vacuum() method that runs VACUUM on both job tables:

```ts
await stateAdapter.vacuum();
```

PostgreSQL’s VACUUM (without FULL) does not block reads or writes — it reclaims dead tuples while the tables remain accessible. This complements autovacuum for cases where immediate reclamation is desired (e.g., after a large batch deletion in the cleanup job).
Listing Queries and Vacuum
listJobChains joins each root row with the last job in the chain via a lateral subquery. The status filter applies to the joined last job and cannot use an index — only typeName and date range filters narrow the scan before the join. Without these filters, every root row is scanned and joined.
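The shape of that query, sketched (not the adapter's exact SQL; column lists abbreviated):

```sql
SELECT root.*, last.*
FROM job root
CROSS JOIN LATERAL (
  -- for each root row, fetch the last job in its chain
  SELECT * FROM job j
  WHERE j.chain_id = root.id
  ORDER BY j.chain_index DESC
  LIMIT 1
) last
WHERE root.chain_index = 0
-- a status filter would go here, on last.status: it runs after the
-- lateral join, so no index on job can satisfy it
ORDER BY root.created_at DESC;
```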
Listing queries hold an MVCC snapshot for their duration. On tables with frequent writes, unfiltered scans hold snapshots longer, preventing autovacuum from reclaiming dead tuples and causing table bloat over time. The aggressive autovacuum settings above help mitigate this by reclaiming dead tuples more frequently between listing scans.
listJobs uses straightforward indexed scans without a join and is efficient at any scale.
See Also
- Adapter Architecture — Provider/adapter design philosophy
- PostgreSQL Reference — API documentation
- SQLite Internals — SQLite-specific implementation