Architecture and operations Performance and tuning postgresql
David Sterling  

PostgreSQL Table Bloat Management: VACUUM FULL, pg_repack, and the Coming REPACK CONCURRENTLY

Table bloat is one of PostgreSQL’s most persistent operational challenges. It’s a natural byproduct of how PostgreSQL handles updates and deletes — old row versions (dead tuples) pile up inside heap files, wasting storage, degrading index efficiency, and slowing down queries. Left unchecked, bloat can double or triple the physical size of your busiest tables.

The good news: 2026 brings better tooling than ever for tackling this problem — and PostgreSQL 19 promises to solve it natively at last.

What Causes Table Bloat?

PostgreSQL uses MVCC (Multi-Version Concurrency Control) to handle concurrent reads and writes without locking. When you UPDATE a row, PostgreSQL doesn’t modify it in place — it marks the old version as dead and writes a new version. When you DELETE a row, the old version lingers until VACUUM can clean it up. Under high write load, these dead tuples accumulate faster than autovacuum can reclaim them, and bloat grows.

Step 1: Identify Bloated Tables

Before you act, measure. This query pulls the tables with the highest dead tuple counts from pg_stat_user_tables:

SELECT
  schemaname,
  relname AS table_name,
  n_dead_tup,
  n_live_tup,
  round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||relname)) AS total_size,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 5000
ORDER BY n_dead_tup DESC
LIMIT 20;

A dead_pct over 10–15% on a large table is a strong signal that intervention is needed.

The Three Options for Reclaiming Space

Option 1: VACUUM FULL

The built-in nuclear option. VACUUM FULL rewrites the entire table to a new heap file, completely eliminating bloat and compacting indexes. It requires no extra extensions.

VACUUM FULL ANALYZE my_bloated_table;

The catch: it acquires an exclusive lock on the table for the entire duration. On a large table, that could mean minutes or hours of downtime. Use only during scheduled maintenance windows on non-critical tables.

Option 2: pg_repack (Online, Production-Safe)

pg_repack is the industry gold standard for online table repacking. It rebuilds the table and its indexes without holding an exclusive lock for more than a brief moment at the end of the process. Your application continues serving traffic throughout.

-- Install the extension (once per database)
CREATE EXTENSION pg_repack;

-- Repack a single bloated table
pg_repack -h localhost -d mydb -t my_bloated_table

-- Repack all tables in a database
pg_repack -h localhost -d mydb --no-order

Requirements: The target table must have a primary key or a unique constraint on non-nullable columns. pg_repack uses triggers and a shadow table to track live changes during the rebuild.

Option 3: pg_squeeze (Lighter-Touch Alternative)

pg_squeeze is a newer alternative to pg_repack that uses PostgreSQL’s logical replication instead of triggers to track concurrent changes during a rebuild. This results in even less overhead on the primary and no long-held locks on system catalogs.

CREATE EXTENSION pg_squeeze;

-- Add a table to the squeeze queue
SELECT squeeze.add_table(tabschema => 'public', tabname => 'my_bloated_table');

-- Run the squeeze worker
SELECT squeeze.squeeze_table('public', 'my_bloated_table', NULL, NULL, NULL);

What’s Coming: REPACK CONCURRENTLY in PostgreSQL 19

The most exciting development in this space is coming from the PostgreSQL core team itself. At PGConf.dev 2026, Álvaro Herrera previewed the REPACK CONCURRENTLY command targeted for PostgreSQL 19. This will bring native, non-blocking table repacking directly into PostgreSQL with no external extension required.

The anticipated syntax is straightforward:

-- PostgreSQL 19 (anticipated syntax)
VACUUM REPACK CONCURRENTLY my_bloated_table;

For high-availability production teams, this is the holy grail — the benefits of VACUUM FULL‘s space reclamation without any exclusive lock downtime, baked directly into the database engine.

Decision Guide: Which Tool to Use?

ScenarioRecommended Tool
Maintenance window available, no PK requiredVACUUM FULL
Production table, needs primary keypg_repack
Production table, want minimal catalog impactpg_squeeze
PostgreSQL 19+ (anticipated)REPACK CONCURRENTLY

Pro Tip: Let PostgreSQL 18 AIO Help You First

Before reaching for any of these tools, make sure your autovacuum configuration is tuned for your workload. PostgreSQL 18’s new Async I/O subsystem makes VACUUM itself significantly faster, meaning dead tuples are cleaned up more quickly and tables are less likely to reach critical bloat levels in the first place. Start by checking your autovacuum_vacuum_scale_factor and autovacuum_vacuum_cost_delay settings for high-churn tables before scheduling a manual repack.

Leave A Comment