Unlock PostgreSQL Performance: 3 Unconventional Optimization Techniques
When it comes to database optimization, most developers rely on the same familiar toolkit—tweaking queries, adding indexes to columns, denormalizing data, and repeating cycles of analyzing, vacuuming, and clustering. Conventional methods work, but thinking outside the box can deliver transformative performance gains for PostgreSQL. In this article, we’ll break down three practical yet underutilized PostgreSQL optimization strategies: eliminating pointless full table scans, optimizing indexes for low-cardinality scenarios, and enforcing uniqueness with hash indexes. Each addresses real-world performance pain points with actionable solutions.
I. Eliminate Meaningless Full Table Scans with Check Constraints
In daily operations, we often add CHECK constraints to table columns to ensure data validity. However, few realize that PostgreSQL doesn’t use these constraints for query optimization by default—even if a query condition clearly violates the constraint, the database will still perform a full table scan. A simple parameter adjustment can eliminate this unnecessary performance overhead.
1.1 Scenario: A Small Mistake Leading to Full Table Scans
Suppose we create a user table to store user IDs, usernames, and subscription plans, with a CHECK constraint limiting plans to either free or pro:
CREATE TABLE users (
id INT PRIMARY KEY,
username TEXT NOT NULL,
plan TEXT NOT NULL,
CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);
Insert 100,000 test records and analyze the table:
INSERT INTO users
SELECT n, uuidv4(), (ARRAY['free', 'pro'])[ceil(random()*2)]
FROM generate_series(1, 100_000) AS t(n);
ANALYZE users;
An analyst wants to query users on the “Pro” plan (with a capital “P”) and writes this query:
SELECT * FROM users WHERE plan = 'Pro';
Unsurprisingly, the result returns 0 rows. But the critical issue lies in the query’s execution cost—check the execution plan:
EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
The output shows PostgreSQL performed a full table scan (Seq Scan), iterating through all 100,000 rows and taking approximately 7.4 milliseconds:
Seq Scan on users (cost=0.00..2185.00 rows=1 width=45)
(actual time=7.406..7.407 rows=0.00 loops=1)
Filter: (plan = 'Pro'::text)
Rows Removed by Filter: 100000
Buffers: shared hit=935
Planning:
Buffers: shared hit=29 read=2
Planning Time: 4.564 ms
Execution Time: 7.436 ms
Even though the CHECK constraint explicitly limits plan to free or pro (making Pro an impossible value), why does the database still scan the entire table? The core reason: PostgreSQL doesn’t automatically verify if query conditions violate CHECK constraints by default.
1.2 Enable the constraint_exclusion Parameter to Fix the Issue
PostgreSQL provides the constraint_exclusion parameter. When enabled, the database checks if query conditions conflict with constraints during execution plan generation. If a condition is guaranteed to be false, it skips the table scan entirely.
Run the following command to enable the parameter:
SET constraint_exclusion to 'on';
Execute the same query again and check the execution plan:
EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
The execution plan now shows:
Result (cost=0.00..0.00 rows=0 width=0) (actual time=0.000..0.001 rows=0.00 loops=1)
One-Time Filter: false
Planning:
Buffers: shared hit=5 read=4
Planning Time: 5.760 ms
Execution Time: 0.008 ms
The database recognizes the condition is impossible, returns an empty result immediately, and reduces execution time from 7.4 milliseconds to just 0.008 milliseconds—eliminating the full table scan entirely.
1.3 When Is Enabling constraint_exclusion Justified?
You might wonder: If this parameter is so effective, why isn’t it enabled by default?
PostgreSQL’s official documentation explains: constraint_exclusion defaults to partition (used only for partition pruning in partitioned tables). Enabling it for all tables adds query planning overhead—for simple queries, the time spent planning may exceed the benefits of optimization.
However, enabling constraint_exclusion = on delivers significant value in these scenarios:
- BI/Reporting Environments: Analysts often write ad-hoc queries, making typos (like Pro instead of pro) or invalid conditions common. Avoiding full table scans here saves substantial resources.
- Data Warehouses: Tables are extremely large, so even a single unnecessary full table scan consumes significant IO and CPU.
- Partitioned Tables: Beyond default partition pruning, CHECK constraints can further optimize cross-partition queries.
If enabling the parameter globally is too costly, scope it to just the workloads that benefit, as sketched below.
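A minimal sketch of scoping the setting (the role name reporting_user is hypothetical; ALTER ROLE ... SET and SET LOCAL are standard PostgreSQL):
-- Enable constraint exclusion only for a dedicated reporting role:
ALTER ROLE reporting_user SET constraint_exclusion = 'on';
-- Or enable it for a single transaction only:
BEGIN;
SET LOCAL constraint_exclusion = 'on';
SELECT * FROM users WHERE plan = 'Pro';
COMMIT;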
II. Optimize for Low-Cardinality Scenarios: Functional Indexes + Virtual Generated Columns
When optimizing indexes for time-based fields, developers often add B-Tree indexes directly to timestamptz columns. But this one-size-fits-all approach leads to oversized indexes and high maintenance costs. For low-cardinality needs like “daily statistics,” combining functional indexes with virtual generated columns ensures performance while drastically reducing index size.
2.1 Scenario: Daily Sales Data Queries
Create a sales table with 10 million records, storing sale IDs, transaction times, and amounts:
CREATE TABLE sale (
id INT PRIMARY KEY,
sold_at TIMESTAMPTZ NOT NULL,
charged INT NOT NULL
);
INSERT INTO sale (id, sold_at, charged)
SELECT
n AS id,
'2025-01-01 UTC'::timestamptz + (interval '5 seconds') * n AS sold_at,
ceil(random() * 100) AS charged
FROM generate_series(1, 10_000_000) AS t(n);
ANALYZE sale;
Analysts need to query daily sales totals for January 2025 with this query:
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE '2025-01-01 UTC' <= sold_at AND sold_at < '2025-02-01 UTC'
GROUP BY 1;
2.1.1 Conventional Approach: Adding a B-Tree Index
By default, this query performs a full table scan, taking approximately 627 milliseconds. To optimize, we first add a standard B-Tree index on sold_at:
CREATE INDEX sale_sold_at_ix ON sale(sold_at);
The optimized query runs in 187 milliseconds, but the index size is a whopping 214 MB (nearly half the table size):
\di+ sale_sold_at_ix
List of indexes
─[ RECORD 1 ]─┬────────────────
Schema │ public
Name │ sale_sold_at_ix
Type │ index
Owner │ haki
Table │ sale
Persistence │ permanent
Access method │ btree
Size │ 214 MB
While this improves performance, the oversized index wastes storage and increases maintenance overhead during data writes.
2.2 Optimization Idea: Index Only the Required Dimension (Date)
The analysts’ need is “daily statistics,” but our B-Tree index includes millisecond-level precision—this is classic “over-indexing.” We can create a functional index that targets only the “date” portion of the transaction time:
CREATE INDEX sale_sold_at_date_ix ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);
This date-focused index shrinks to just 66 MB (one-third the size of the original B-Tree index):
\di+ sale_sold_at_*
List of indexes
Name │ Table │ Access method │ Size
──────────────────────┼───────┼───────────────┼────────
sale_sold_at_date_ix │ sale │ btree │ 66 MB
sale_sold_at_ix │ sale │ btree │ 214 MB
Adjust the query to match the functional index expression:
SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
Execution time drops further to 145 milliseconds—faster than the original B-Tree index—with a drastically smaller index footprint.
2.3 Solving the “Consistency Problem” with Functional Indexes
A major drawback of functional indexes is that queries must exactly match the index’s expression. Even minor differences (e.g., using ::date instead of date_trunc) prevent the database from using the index:
-- This query performs a full table scan due to mismatched expressions
EXPLAIN (ANALYZE OFF, COSTS OFF)
SELECT (sold_at AT TIME ZONE 'UTC')::date, SUM(charged)
FROM sale
WHERE (sold_at AT TIME ZONE 'UTC')::date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
Relying on “enforcing query standards” is impractical. Virtual generated columns, introduced in PostgreSQL 18, solve this problem seamlessly.
2.3.1 Create a Virtual Generated Column
Virtual generated columns appear as regular columns but compute their values on the fly from an expression (no physical storage required). We use one to lock in the functional index expression (note the explicit ::date cast, so the expanded expression matches the index exactly):
ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);
2.3.2 Query Using the Virtual Column
Now, querying the virtual column ensures the expression matches the index, and the database automatically uses the functional index:
EXPLAIN (ANALYZE ON, COSTS OFF, BUFFERS OFF)
SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;
The execution plan confirms index usage with a runtime of ~163 milliseconds—no need for developers to memorize complex expressions.
2.4 Core Advantages of This Optimization
- Smaller indexes: the date-only functional index is 66 MB versus 214 MB for the full-precision B-Tree index.
- Consistent expressions: queries reference sold_at_date, so they always match the index expression without developers having to reproduce it exactly.
- No write overhead: virtual generated columns are computed on read and add no storage to the table.
Note: As of PostgreSQL 18, direct indexing on virtual generated columns is not supported. Create a functional index first, then use the virtual column to indirectly leverage it.
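For example, attempting to index the virtual column directly is rejected (the index name here is hypothetical):
-- Fails as of PostgreSQL 18: virtual generated columns cannot be indexed directly
CREATE INDEX sale_sold_at_date_vix ON sale (sold_at_date);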
III. Enforce URL Uniqueness Efficiently with Hash Indexes
When enforcing uniqueness for large strings (e.g., URLs), standard unique B-Tree indexes become oversized because they store full strings. Combining PostgreSQL’s exclusion constraints with hash indexes enforces uniqueness while drastically reducing index size.
3.1 Scenario: Avoid Duplicate URL Processing
Create a URL table to store URLs and their parsed data, requiring unique URLs to avoid redundant processing:
CREATE TABLE urls (
id INT PRIMARY KEY,
url TEXT NOT NULL,
data JSON
);
INSERT INTO urls (id, url)
SELECT n, 'https://' || uuidv4() || '.com/' || uuidv4() || '?p=' || uuidv4()
FROM generate_series(1, 1_000_000) AS t(n);
3.2 Conventional Solution: The Problem with Unique B-Tree Indexes
Adding a unique B-Tree index to the url column is the most straightforward approach:
CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);
While this enforces uniqueness effectively, the size issue is severe:
\dt+ urls
List of tables
─[ RECORD 1 ]─┬──────────
Schema │ public
Name │ urls
Type │ table
Owner │ haki
Persistence │ permanent
Access method │ heap
Size │ 160 MB
\di+ urls_url_unique_ix
List of indexes
─[ RECORD 1 ]─┬───────────────────
Schema │ public
Name │ urls_url_unique_ix
Type │ index
Owner │ haki
Table │ urls
Persistence │ permanent
Access method │ btree
Size │ 154 MB
The 160 MB table has a corresponding unique B-Tree index of 154 MB—B-Tree indexes store full URL strings in leaf nodes, and long, low-duplication URLs cause indexes to approach table size.
3.3 Optimization: Exclusion Constraints + Hash Indexes
PostgreSQL doesn’t support direct “unique hash indexes,” but we can achieve the same effect indirectly with exclusion constraints:
ALTER TABLE urls ADD CONSTRAINT urls_url_unique_hash EXCLUDE USING HASH (url WITH =);
This constraint prevents duplicate URLs (equivalent to a unique constraint) and is backed by a hash index, which stores URL hash values (not full URLs).
3.3.1 Verify Uniqueness Enforcement
Attempting to insert a duplicate URL triggers a constraint violation:
INSERT INTO urls (id, url) VALUES (1_000_002, 'https://hakibenita.com');
-- Insert succeeds
INSERT INTO urls (id, url) VALUES (1_000_003, 'https://hakibenita.com');
-- Error: conflicting key value violates exclusion constraint "urls_url_unique_hash"
3.3.2 Verify Index Query Performance
Hash indexes not only enforce uniqueness but also accelerate regular URL queries—outperforming B-Tree indexes:
EXPLAIN (ANALYZE ON, BUFFERS OFF, COSTS OFF)
SELECT * FROM urls WHERE url = 'https://hakibenita.com';
The query runs in just 0.022 milliseconds (vs. 0.046 ms for B-Tree indexes), with the execution plan confirming hash index usage:
Index Scan using urls_url_unique_hash on urls (actual time=0.010..0.011 rows=1.00 loops=1)
Index Cond: (url = 'https://hakibenita.com'::text)
Index Searches: 1
Planning Time: 0.178 ms
Execution Time: 0.022 ms
3.3.3 Index Size Comparison
The hash index’s size advantage is dramatic:
\di+ urls_url_*
List of indexes
Name │ Access method │ Size
─────────────────────┼───────────────┼────────
urls_url_unique_hash │ hash │ 32 MB
urls_url_unique_ix │ btree │ 154 MB
At just 32 MB, the hash index is one-fifth the size of the B-Tree index—delivering substantial storage savings.
3.4 Key Considerations for Hash Index Uniqueness
Despite their advantages, hash indexes have two critical limitations to note:
3.4.1 Cannot Be Used for Foreign Key References
PostgreSQL requires foreign keys to reference “unique constraints,” and exclusion constraints don’t qualify. Thus, you can’t create a foreign key from another table to this url column:
CREATE TABLE foo (url TEXT REFERENCES urls(url));
-- Error: there is no unique constraint matching given keys for referenced table "urls"
3.4.2 Limitations with INSERT ... ON CONFLICT
- You can't use ON CONFLICT (url) directly—specify the constraint name instead:
-- Incorrect
INSERT INTO urls (id, url) VALUES (1_000_004, 'https://hakibenita.com')
ON CONFLICT (url) DO NOTHING;
-- Correct
INSERT INTO urls (id, url) VALUES (1_000_004, 'https://hakibenita.com')
ON CONFLICT ON CONSTRAINT urls_url_unique_hash DO NOTHING;
- ON CONFLICT ... DO UPDATE is unsupported. For "upsert" functionality, use the MERGE statement:
MERGE INTO urls t
USING (VALUES (1_000_004, 'https://hakibenita.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);
IV. Frequently Asked Questions (FAQ)
Q1: Will enabling constraint_exclusion impact performance for regular queries?
A: For simple queries, enabling constraint_exclusion = on adds minor planning overhead (due to constraint validation). However, in reporting or data warehouse scenarios, this overhead is far outweighed by the savings from avoiding full table scans. For high-concurrency OLTP environments, keep the default partition setting.
Q2: Are functional indexes and virtual generated columns only for time fields?
A: No—they work for any scenario where queries only need part of a field's value, such as:
- Indexing the last 8 digits of phone numbers (for suffix lookups).
- Indexing specific JSON path values in JSON fields.
- Indexing lowercase versions of string fields (for case-insensitive queries).
A few sketches of these patterns follow below.
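Hedged sketches of these patterns, with hypothetical table and column names:
-- Suffix lookups on the last 8 digits of a phone number:
CREATE INDEX customers_phone_suffix_ix ON customers (right(phone, 8));
-- Equality lookups on a specific JSON key:
CREATE INDEX events_payload_type_ix ON events ((payload->>'type'));
-- Case-insensitive lookups on usernames:
CREATE INDEX users_username_lower_ix ON users (lower(username));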
Q3: Are hash indexes suitable for all uniqueness scenarios?
A: No. Hash indexes excel for large, low-duplication strings (e.g., URLs, long text). For small, ordered fields (e.g., numbers, short strings), B-Tree indexes are better—they support range queries (e.g., url > 'https://a.com'), while hash indexes only support equality checks.
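To see this on the urls table from section III, inspect the plan for a range predicate—the hash index won't be chosen, since hash indexes only handle equality:
EXPLAIN (COSTS OFF)
SELECT * FROM urls WHERE url > 'https://a.com';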
Q4: How to solve functional index consistency issues in PostgreSQL versions before 18 (without virtual generated columns)?
A: Use a view as a workaround:
CREATE VIEW v_sale AS
SELECT *, date_trunc('day', sold_at AT TIME ZONE 'UTC')::date AS sold_at_date
FROM sale;
Require developers to query the view instead of the base table so expressions stay consistent. The tradeoff: you'll need to restrict direct access to the base table, since queries that bypass the view won't match the functional index.
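A query against the view then uses the functional index automatically, since PostgreSQL expands sold_at_date to the indexed expression:
SELECT sold_at_date, SUM(charged)
FROM v_sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;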
V. Conclusion
PostgreSQL optimization isn’t limited to “adding indexes and tweaking queries.” Understanding the database’s underlying mechanics and aligning optimizations with business needs balances performance and resource efficiency:
- Use the constraint_exclusion parameter to turn CHECK constraints into query optimization tools, eliminating pointless full table scans.
- Replace full-precision B-Tree indexes with functional indexes + virtual generated columns for low-cardinality queries, slashing index size.
- Enforce uniqueness for large strings with exclusion constraints + hash indexes instead of unique B-Tree indexes, optimizing both performance and storage.
The core logic of these unconventional techniques is simple: don’t pay for features you don’t need. If you don’t require millisecond precision, don’t index it. If you don’t need to store full URLs, store hashes instead. If you don’t need to scan data that violates constraints, skip the scan entirely. By aligning indexes and queries closely with business requirements, you’ll unlock PostgreSQL’s full performance potential.

