“Just use Postgres” has been floating around as advice for years, but most of the articles arguing for it are opinion pieces. I wanted hard numbers.
So I created a benchmark suite that pits vanilla PostgreSQL against a feature-enhanced PostgreSQL instance — measuring caching, message queue, full-text search, and pub/sub under controlled conditions.
In this article, you’ll learn how to use PostgreSQL’s built-in features for caching, job queues, full-text search, and pub/sub. You’ll see actual benchmark results (latency percentiles, throughput, and error rates) comparing plain PostgreSQL patterns against optimized ones, and understand where PostgreSQL’s limitations lie so you can decide whether you really need each extra service in your stack.
Prerequisites
To follow along or replicate the benchmarks, you will need:
Docker and Docker Compose
Node.js 20+ (for the Express + TypeScript API layer)
k6 for load testing
Basic familiarity with SQL and PostgreSQL
The complete benchmark project is open source on GitHub; you can clone it and run each test yourself.
Setup
The benchmark uses two identical PostgreSQL 17 instances running in Docker containers, each with fixed resource constraints (2 CPUs, 2 GB RAM). Both share the same Express + TypeScript API layer; the only difference is which PostgreSQL features are enabled.
┌─────────┐ ┌──────────────────┐ ┌─────────────────┐
│ k6 │────>│ Express API │────>│ PG Baseline │
│ (load │ │ (TypeScript) │ │ (vanilla PG17) │
│ test) │────>│ Port 3001/3002 │────>│ PG Modded │
└─────────┘ └──────────────────┘ │ (features on) │
└─────────────────┘
The baseline uses a naive approach (regular tables, ILIKE search, polling). The modded instance uses PostgreSQL’s built-in features (unlogged tables, tsvector with GIN indexes, LISTEN/NOTIFY, partial indexes). Same hardware, same API code, same data. Only the database-side features differ.
Both instances share this tuned postgresql.conf:
# Memory allocation
shared_buffers = 512MB # 25% of available RAM
effective_cache_size = 1536MB # 75% of RAM — helps the query planner
work_mem = 16MB # per-sort/hash operation memory
# SSD-optimized planner settings
random_page_cost = 1.1 # default 4.0 assumes spinning disks
effective_io_concurrency = 200 # allow parallel I/O on SSDs
These settings matter. The defaults assume spinning disks from the early 2000s. Setting random_page_cost = 1.1 tells the query planner that random reads are nearly as cheap as sequential reads on SSDs, which encourages index scans over sequential scans.
Benchmark 1: Caching with unlogged tables
Idea: Use unlogged tables as in-database caches. Unlogged tables skip PostgreSQL’s write-ahead log (WAL), the mechanism that guarantees durability. Since cache data is transient by nature, losing it on a crash is acceptable, and skipping the WAL removes the biggest write bottleneck.
-- Modded: UNLOGGED table for cache entries
CREATE UNLOGGED TABLE cache_entries (
key TEXT PRIMARY KEY,
value JSONB NOT NULL,
expires_at TIMESTAMPTZ
);
-- Baseline: same schema, but a regular (logged) table
CREATE TABLE cache_entries (
key TEXT PRIMARY KEY,
value JSONB NOT NULL,
expires_at TIMESTAMPTZ
);
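The API layer reads and writes this table through a small cache-aside wrapper. Below is a minimal TypeScript sketch; the `query` function is injected (in a real app it would be something like `pg`’s `Pool.query`) so the logic can be exercised without a live database. Function and parameter names are illustrative, not taken from the benchmark repo.

```typescript
// Minimal cache-aside helpers over the cache_entries table.
// `query` is injected so this sketch is testable without PostgreSQL.
type QueryFn = (sql: string, params: unknown[]) => Promise<{ rows: any[] }>;

// Upsert a cache entry with a TTL in seconds.
async function cacheSet(query: QueryFn, key: string, value: unknown, ttlSeconds: number): Promise<void> {
  await query(
    `INSERT INTO cache_entries (key, value, expires_at)
     VALUES ($1, $2, NOW() + make_interval(secs => $3))
     ON CONFLICT (key) DO UPDATE
       SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at`,
    [key, JSON.stringify(value), ttlSeconds],
  );
}

// Read a cache entry, treating expired rows as misses.
async function cacheGet(query: QueryFn, key: string): Promise<unknown | null> {
  const res = await query(
    `SELECT value FROM cache_entries
     WHERE key = $1 AND (expires_at IS NULL OR expires_at > NOW())`,
    [key],
  );
  return res.rows.length > 0 ? res.rows[0].value : null;
}
```

The upsert makes writes idempotent via `ON CONFLICT (key) DO UPDATE`, and expired rows are simply ignored on read; a periodic `DELETE FROM cache_entries WHERE expires_at < NOW()` can garbage-collect them.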
Results (200 virtual users)
| Mode | p50 | p95 | Average | Requests |
|---|---|---|---|---|
| Baseline (regular table) | 1.87ms | 6.00ms | 2.50ms | 1,754/s |
| Modded (unlogged table) | 1.71ms | 5.24ms | 2.17ms | 1,760/s |
Consistent 13% improvement across all percentiles. Not dramatic, but free: you change one keyword in your CREATE TABLE statement.
Under Stress (1,000 virtual users, no sleep)
| Mode | p50 | p95 | Requests | Total requests |
|---|---|---|---|---|
| Baseline | 83.38ms | 143.23ms | 7,663/s | 728,021 |
| Modded | 77.69ms | 126.39ms | 8,062/s | 765,934 |
The relative improvement holds steady at 12-13% regardless of load level. UNLOGGED’s benefit is a per-write optimization: it saves the same amount of I/O whether you’re doing 100 or 10,000 writes per second. The modded instance served 37,000 more requests in the same time window.
The decision
Unlogged tables are no match for Redis at sub-millisecond hot-path caching (real-time bidding, gaming leaderboards). But for web applications where the difference between 2ms and 5ms is invisible to users, they eliminate an entire piece of infrastructure at zero additional complexity.
You give up Redis’s data structures (sorted sets, HyperLogLog, streams). If you need those, a dedicated cache is still the right call.
Benchmark 2: Job queues with SKIP LOCKED
Idea: Use PostgreSQL as a job queue via SELECT ... FOR UPDATE SKIP LOCKED. Multiple workers poll the same table, and SKIP LOCKED ensures each worker grabs a different job: no duplicates, no contention.
-- Queue table with a partial index on pending jobs only
CREATE TABLE job_queue (
id SERIAL PRIMARY KEY,
payload JSONB NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Partial index: only indexes pending jobs
-- As jobs complete, they leave the index — it stays small forever
CREATE INDEX idx_pending_jobs ON job_queue (created_at)
WHERE status = 'pending';
The dequeue pattern:
-- Atomic dequeue: select + update in one statement
UPDATE job_queue SET status = 'processing'
WHERE id = (
SELECT id FROM job_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED -- skip rows locked by other workers
) RETURNING *;
How SKIP LOCKED works: Worker A locks row 1. Worker B tries row 1, sees the lock, skips it, and takes row 2 instead. No blocking, no duplicates. If a worker crashes, its transaction rolls back and the job becomes available to other workers again.
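A worker built on the dequeue statement above is just a polling loop. Here is a hedged TypeScript sketch, not the benchmark’s actual worker code: `dequeue` wraps the `UPDATE ... SKIP LOCKED` statement and is injected so the control flow can be tested in isolation.

```typescript
// A job as returned by the RETURNING * dequeue statement (simplified).
type Job = { id: number; payload: unknown };
type Dequeue = () => Promise<Job | null>; // wraps UPDATE ... SKIP LOCKED; null when queue is empty

// Process jobs until the queue is empty; returns how many were handled.
async function drainQueue(dequeue: Dequeue, handle: (job: Job) => Promise<void>): Promise<number> {
  let processed = 0;
  for (let job = await dequeue(); job !== null; job = await dequeue()) {
    await handle(job);
    processed++;
  }
  return processed;
}

// A long-running worker alternates draining with a short idle sleep,
// since this benchmark uses plain polling rather than NOTIFY wakeups.
async function runWorker(dequeue: Dequeue, handle: (job: Job) => Promise<void>, idleMs = 250): Promise<never> {
  while (true) {
    await drainQueue(dequeue, handle);
    await new Promise((r) => setTimeout(r, idleMs)); // back off when idle
  }
}
```

Because each `dequeue` call is its own atomic statement, concurrent workers running `drainQueue` never see the same job twice.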
Results (100 producers + 50 consumers)
| Mode | p50 | p95 | Average | Requests |
|---|---|---|---|---|
| Baseline (full index) | 1.90ms | 5.01ms | 2.30ms | 1,053/s |
| Modded (partial index) | 1.81ms | 5.28ms | 2.29ms | 1,052/s |
They are almost identical. The partial index cannot show its value in a 60-second benchmark because the table never accumulates enough completed rows for the index-size difference to matter. In a production system with millions of completed jobs, the partial index stays at kilobytes while a full index grows to gigabytes.
The decision
SKIP LOCKED is production-ready for work queues. Libraries like pg-boss (Node.js) and River (Go) build on this exact pattern.
You give up exchange/routing patterns (fan-out, topic-based routing) and consumer groups with message replay. If you need those, a dedicated message broker is still the right tool. For simple “execute this task once” workloads, PostgreSQL handles it.
Benchmark 3: Full-text search with tsvector
Idea: Use PostgreSQL’s built-in full-text search instead of a separate search service. A tsvector column stores pre-processed search tokens, and a GIN (Generalized Inverted Index) enables fast lookups using the same inverted-index concept that powers Elasticsearch.
-- Search-optimized article table
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL,
search_vector tsvector -- pre-computed search tokens
);
-- GIN index for full-text search
CREATE INDEX idx_search ON articles USING GIN (search_vector);
-- Auto-update search_vector on insert/update
CREATE OR REPLACE FUNCTION update_search_vector() RETURNS trigger AS $$
BEGIN
NEW.search_vector := to_tsvector('english',
COALESCE(NEW.title, '') || ' ' || COALESCE(NEW.body, ''));
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_search
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW EXECUTE FUNCTION update_search_vector();
The baseline uses ILIKE with a leading wildcard, the approach most developers reach for first:
-- Baseline: sequential scan on every query
SELECT * FROM articles
WHERE title ILIKE '%postgresql%' OR body ILIKE '%postgresql%';
-- Modded: GIN index lookup with relevance ranking
SELECT id, title,
ts_rank(search_vector, plainto_tsquery('english', 'postgresql')) AS rank
FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql')
ORDER BY rank DESC LIMIT 20;
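On the API side, the modded query slots into a small handler. This is a sketch with illustrative names and an injected `query` function (e.g. `pg`’s `Pool.query` in a real app). Note that the user’s raw search term goes in as a bind parameter, and `plainto_tsquery` handles tokenization, so no manual escaping is needed.

```typescript
// Thin API-layer wrapper around the ranked tsvector search.
// `query` is injected so the wrapper is testable without PostgreSQL.
type QueryFn = (sql: string, params: unknown[]) => Promise<{ rows: any[] }>;

async function searchArticles(query: QueryFn, term: string, limit = 20) {
  const res = await query(
    `SELECT id, title,
            ts_rank(search_vector, plainto_tsquery('english', $1)) AS rank
     FROM articles
     WHERE search_vector @@ plainto_tsquery('english', $1)
     ORDER BY rank DESC
     LIMIT $2`,
    [term, limit], // raw user input stays a bind parameter
  );
  return res.rows;
}
```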
Results (500 virtual users)
| Mode | p50 | p95 | Average | Requests |
|---|---|---|---|---|
| Baseline (ILIKE) | 1.96ms | 101.83ms | 25.22ms | 561/s |
| Modded (tsvector + GIN) | 2.76ms | 10.39ms | 3.76ms | 675/s |
This is a standout result. Baseline’s p95 101ms vs modded’s 10ms is a 10x improvement.
Why the baseline’s p50 (1.96ms) slightly beats the modded p50 (2.76ms): simple ILIKE queries over small result sets can be fast while the data fits in shared_buffers. But as load rises and the buffer cache comes under contention, sequential scans degrade dramatically while the GIN index stays stable.
Under stress (500 virtual users, no sleep)
| Mode | p50 | p95 | Requests | Total requests |
|---|---|---|---|---|
| Baseline (ILIKE) | 599ms | 1,000ms | 558/s | 50,212 |
| Modded (tsvector) | 209ms | 396ms | 1,441/s | 129,679 |
ILIKE collapses to one-second p95 latencies. Every query forces a sequential scan of all 10,000 articles, thrashing shared buffers and starving concurrent queries. The tsvector approach delivers 2.6x more requests in the same time window because a GIN index lookup stays O(log n) regardless of concurrency.
The decision
This is the strongest argument in the entire benchmark. It requires zero extensions: to_tsvector(), plainto_tsquery(), and CREATE INDEX ... USING GIN are all built into core PostgreSQL. If you are running WHERE column ILIKE '%term%' on any table with more than a few thousand rows, you are leaving a lot of performance on the table.
You give up sharded multi-node search, specialized analyzers for CJK languages, and advanced relevance pipelines. For a product search bar, blog search, or an internal tool, PostgreSQL is enough.
Benchmark 4: Pub/Sub with LISTEN/NOTIFY
Idea: Use PostgreSQL’s native LISTEN/NOTIFY for pub/sub messaging, fired automatically on every INSERT by a database trigger.
-- Trigger that fires pg_notify on every new message
CREATE OR REPLACE FUNCTION notify_message() RETURNS trigger AS $$
BEGIN
PERFORM pg_notify(NEW.channel, NEW.payload::text);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_notify
AFTER INSERT ON messages
FOR EACH ROW EXECUTE FUNCTION notify_message();
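The subscriber side isn’t shown above. With node-postgres you would hold a dedicated connection, run `LISTEN <channel>`, and wire the client’s `notification` events into a dispatcher. The in-process fan-out can be sketched as a tiny, framework-free dispatcher (illustrative names, not the benchmark’s actual code):

```typescript
// In-process fan-out for NOTIFY messages.
// With node-postgres: client.query("LISTEN orders") on a dedicated
// connection, then client.on("notification", (m) => bus.dispatch(m.channel, m.payload ?? "")).
type Handler = (payload: string) => void;

class PubSub {
  private handlers = new Map<string, Handler[]>();

  // Register a handler for one channel.
  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  // Called once per NOTIFY delivered on the connection.
  dispatch(channel: string, payload: string): void {
    for (const h of this.handlers.get(channel) ?? []) h(payload);
  }
}
```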
Results (200 virtual users)
| Mode | p50 | p95 | Average | Requests |
|---|---|---|---|---|
| Baseline (polling) | 1.99ms | 6.04ms | 2.84ms | 1,116/s |
| Modded (LISTEN/NOTIFY) | 1.65ms | 4.80ms | 2.13ms | 1,131/s |
That’s a 20% improvement at p95. The trigger-based approach does more work per INSERT (the insert plus the NOTIFY), but fewer round trips and better connection reuse more than offset the overhead.
The decision
LISTEN/NOTIFY works for realtime features where you would otherwise reach for Redis pub/sub. The main limitations are the payload size (8,000 bytes maximum) and the need for a dedicated connection (it is not compatible with PgBouncer in transaction-pooling mode).
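The payload cap has a standard workaround: if a message is too large for NOTIFY, store it in a table and notify with just a reference that subscribers fetch. A sketch of the size check, with the ~8,000-byte limit from above and illustrative names:

```typescript
// NOTIFY payloads must stay under roughly 8,000 bytes.
const NOTIFY_LIMIT = 8000;

// Decide what actually goes over the wire: the message itself if it fits,
// otherwise a small JSON reference to a row holding the full payload.
function notifyPayload(message: string, storedId: number): string {
  // Byte length, not string length: multi-byte UTF-8 counts against the cap.
  const bytes = new TextEncoder().encode(message).length;
  return bytes < NOTIFY_LIMIT ? message : JSON.stringify({ ref: storedId });
}
```

Subscribers that receive a `{ "ref": ... }` payload then SELECT the full row by id on their own connection.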
Combined Workload: An Honest Test
Individual benchmarks are flattering. The real question: can one PostgreSQL instance handle caching, queuing, search, and pub/sub simultaneously without degrading?
Results (all four workloads running simultaneously)
| Mode | p50 | p95 | Average | Requests |
|---|---|---|---|---|
| Baseline | 1.65ms | 5.24ms | 2.17ms | 1,424/s |
| Modded | 1.86ms | 6.05ms | 2.47ms | 1,417/s |
Under combined load, the baseline actually edges out the modded setup. Modded PostgreSQL does more work per operation: maintaining GIN indexes, firing triggers, and running pg_cron in the background. With all of these features enabled simultaneously, the overhead is measurable: about 15% higher p95 latency.
But both setups stay comfortably under 10ms at p95. For most web applications, that is more than fast enough.
What did I learn?
After running all these benchmarks, here is what I would tell a team weighing “just use Postgres”:
For full-text search: switching from ILIKE to tsvector with a GIN index is a 10x improvement that requires zero extensions. This is the single highest-ROI change in the entire PostgreSQL ecosystem, and most developers don’t even know it exists.
For job queues: SKIP LOCKED is production-ready and eliminates RabbitMQ for simple “execute this task once” workloads. Use a library such as pg-boss or River instead of rolling your own.
For caching: unlogged tables give a consistent 13% improvement over regular tables. If sub-millisecond latency isn’t a hard requirement (and for most web apps it isn’t), you can skip Redis entirely.
Be honest about overhead: running all four roles together costs about 15% in latency compared with running each role individually. Whether that matters depends on your latency budget.
Know where to stop: PostgreSQL won’t match Redis for sub-millisecond caching, Kafka for millions of messages per second, or Elasticsearch for distributed multi-node search with complex analyzers. The line sits at extreme throughput or extreme scale.
The honest conclusion is not that “PostgreSQL does everything.” It’s this: for most applications, a well-configured PostgreSQL instance handles 80% of what you would otherwise need three to five additional services for. That’s less infrastructure to deploy, monitor, and maintain, and less to break at 3 in the morning.
Enterprise-scale applications that process millions of messages per second, serve sub-millisecond cache hits to millions of concurrent users, or run distributed searches across terabytes of documents will still require specialized tools. Those tools exist for a reason, and at that scale their operational cost is justified by the performance they return.
But most of us aren’t building at that scale, and probably never will. Starting with PostgreSQL for these roles means you ship faster with fewer moving parts. If and when you outgrow what PostgreSQL can handle, your benchmarks will tell you exactly which role to extract into a dedicated service. That is a much better position than starting with five services on day one because you assumed you would need them.
The benchmark project is open source if you want to reproduce these results or adapt the tests to your own workload.
You can find more of my writing at site.aaronhsyong.com.