PostgreSQL: The Complete Guide for 2026
PostgreSQL is the world's most advanced open-source relational database. It has been in active development for over 35 years and has earned a reputation for reliability, data integrity, and a feature set that rivals or surpasses expensive commercial databases. Whether you are building a startup's first API, managing petabytes of analytics data, or replacing a legacy Oracle installation, PostgreSQL handles it all.
This guide covers everything you need to work with PostgreSQL effectively: from installation and basic table design to advanced features like JSONB, full-text search, window functions, partitioning, replication, and performance tuning. Every concept includes practical code examples you can run immediately. By the end, you will have the knowledge to design, query, optimize, and operate PostgreSQL databases in production.
Table of Contents
- Installation and Setup
- PostgreSQL Data Types
- CREATE, ALTER, and DROP Tables
- SELECT with JOINs
- Indexes: B-tree, GIN, GiST, and More
- CTEs and Recursive Queries
- Window Functions
- JSONB Operations
- Full-Text Search
- Transactions and ACID
- Stored Procedures and Functions
- Triggers
- Performance Tuning with EXPLAIN ANALYZE
- Table Partitioning
- Replication
- Backup and Restore
- Security and Roles
- PostgreSQL vs MySQL
- Frequently Asked Questions
Installation and Setup
PostgreSQL runs on Linux, macOS, and Windows. The installation is straightforward on every platform. After installation, you get the psql command-line client, which is the primary way to interact with PostgreSQL.
Installing PostgreSQL
# Ubuntu / Debian
sudo apt update
sudo apt install postgresql postgresql-contrib
# macOS (Homebrew)
brew install postgresql@16
brew services start postgresql@16
# Fedora / RHEL
sudo dnf install postgresql-server postgresql-contrib
sudo postgresql-setup --initdb
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Docker (quick start)
docker run --name pg -e POSTGRES_PASSWORD=secret -p 5432:5432 -d postgres:16
# Verify the installation
psql --version
# psql (PostgreSQL) 16.x
First Connection and Basic Setup
# Connect as the default postgres superuser
sudo -u postgres psql
# Inside psql:
-- Create a new database
CREATE DATABASE myapp;
-- Create a new user with a password
CREATE USER myuser WITH PASSWORD 'secure_password';
-- Grant all privileges on the database
GRANT ALL PRIVILEGES ON DATABASE myapp TO myuser;
-- Connect to the new database
\c myapp
-- Useful psql commands:
\l -- List all databases
\dt -- List tables in current database
\d users -- Describe a table's structure
\di -- List indexes
\df -- List functions
\q -- Quit psql
\timing -- Toggle query execution time display
Configuration Essentials
# Key settings in postgresql.conf:
# Memory (adjust based on your server's RAM)
shared_buffers = '256MB' # 25% of total RAM for dedicated servers
effective_cache_size = '768MB' # 50-75% of total RAM
work_mem = '16MB' # Per-sort/hash operation (careful with high values)
maintenance_work_mem = '128MB' # For VACUUM, CREATE INDEX
# Write-Ahead Log (WAL)
wal_level = 'replica' # Required for replication and point-in-time recovery
max_wal_senders = 3 # Number of replication connections
wal_keep_size = '1GB' # WAL retention for replication lag
# Connections
max_connections = 100 # Use connection pooling for more
listen_addresses = 'localhost' # Change to '*' for remote access
# Logging
log_min_duration_statement = 1000 # Log queries taking more than 1 second (ms)
log_statement = 'ddl' # Log DDL statements (CREATE, ALTER, DROP)
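Most of these settings can be inspected and changed from psql without editing postgresql.conf by hand. A quick sketch (the values shown are examples, not recommendations):

```sql
-- Check a current setting
SHOW shared_buffers;
-- Persist a change without editing postgresql.conf (written to postgresql.auto.conf)
ALTER SYSTEM SET work_mem = '32MB';
-- Reload the configuration; settings with context 'postmaster' still need a restart
SELECT pg_reload_conf();
-- See each setting's value and whether it needs a reload or a restart
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'max_connections');
```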
PostgreSQL Data Types
PostgreSQL has one of the richest type systems of any database. Beyond the standard SQL types, it offers arrays, JSONB, range types, geometric types, network addresses, and the ability to define your own custom types.
-- Standard numeric types
SMALLINT -- 2 bytes, -32768 to +32767
INTEGER -- 4 bytes, -2.1 billion to +2.1 billion
BIGINT -- 8 bytes, -9.2 quintillion to +9.2 quintillion
DECIMAL(10, 2) -- Exact numeric, 10 digits total, 2 after decimal
NUMERIC(10, 2) -- Same as DECIMAL
REAL -- 4-byte floating point
DOUBLE PRECISION -- 8-byte floating point
SERIAL -- Auto-incrementing 4-byte integer
BIGSERIAL -- Auto-incrementing 8-byte integer
-- Text types
CHAR(n) -- Fixed-length, space-padded
VARCHAR(n) -- Variable-length with limit
TEXT -- Variable-length, unlimited (preferred in PostgreSQL)
-- Date and time
DATE -- Date only: '2026-02-11'
TIME -- Time only: '14:30:00'
TIMESTAMP -- Date and time without timezone
TIMESTAMPTZ -- Date and time WITH timezone (recommended)
INTERVAL -- Duration: '3 days 4 hours'
-- Boolean
BOOLEAN -- TRUE, FALSE, or NULL
-- Binary
BYTEA -- Variable-length binary string
-- PostgreSQL-specific types
UUID -- 128-bit universally unique identifier
JSONB -- Binary JSON (indexable, efficient)
JSON -- Text JSON (stored verbatim, reparsed on every access, no GIN indexing)
ARRAY -- Arrays of any type: INTEGER[], TEXT[]
HSTORE -- Key-value pairs (extension)
INET / CIDR -- IPv4/IPv6 network addresses
MACADDR -- MAC addresses
TSQUERY / TSVECTOR -- Full-text search types
POINT / LINE / POLYGON -- Geometric types
INT4RANGE / TSRANGE -- Range types
ENUM -- Custom enumeration types
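To see several of these types working together, here is a sketch of a single table that combines them (the table and column names are illustrative):

```sql
CREATE TABLE device_sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),  -- gen_random_uuid() is built in since PostgreSQL 13
    client_ip INET NOT NULL,
    active_during TSTZRANGE,
    prefs JSONB DEFAULT '{}'
);

INSERT INTO device_sessions (client_ip, active_during, prefs)
VALUES ('192.168.1.10', tstzrange(NOW(), NOW() + INTERVAL '1 hour'), '{"theme": "dark"}');
```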
Arrays
-- Create a table with an array column
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
tags TEXT[] DEFAULT '{}'
);
-- Insert with arrays
INSERT INTO articles (title, tags)
VALUES ('PostgreSQL Tips', ARRAY['database', 'postgresql', 'tutorial']);
INSERT INTO articles (title, tags)
VALUES ('Docker Guide', '{docker, containers, devops}');
-- Query arrays
SELECT * FROM articles WHERE 'postgresql' = ANY(tags);
SELECT * FROM articles WHERE tags @> ARRAY['database']; -- Contains
SELECT * FROM articles WHERE tags && ARRAY['docker', 'postgresql']; -- Overlaps
-- Array functions
SELECT title, array_length(tags, 1) AS tag_count FROM articles;
SELECT title, unnest(tags) AS tag FROM articles; -- Expand array to rows
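Going the other way, array_agg collapses rows back into an array. A sketch that rebuilds a deduplicated, sorted tag list from the expanded rows:

```sql
-- Collect every distinct tag across all articles into one sorted array
SELECT array_agg(DISTINCT tag ORDER BY tag) AS all_tags
FROM articles, unnest(tags) AS tag;
```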
Enums
-- Create a custom enum type
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered', 'cancelled');
-- Use in a table
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id BIGINT REFERENCES users(id),
status order_status DEFAULT 'pending',
amount DECIMAL(10, 2) NOT NULL,
order_date DATE DEFAULT CURRENT_DATE
);
-- Insert with enum values
INSERT INTO orders (user_id, status, amount) VALUES (1, 'processing', 99.99);
-- Enums enforce valid values
INSERT INTO orders (user_id, status, amount) VALUES (1, 'invalid', 50.00);
-- ERROR: invalid input value for enum order_status: "invalid"
-- Add a new value to an existing enum
ALTER TYPE order_status ADD VALUE 'refunded' AFTER 'cancelled';
-- (Before PostgreSQL 12, ADD VALUE could not run inside a transaction block)
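You can also introspect an enum's declared values, which is handy when the definition lives in an old migration. A sketch:

```sql
-- List the values of order_status in declared order
SELECT unnest(enum_range(NULL::order_status)) AS status;
-- Or read them from the catalog
SELECT enumlabel
FROM pg_enum
WHERE enumtypid = 'order_status'::regtype
ORDER BY enumsortorder;
```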
CREATE, ALTER, and DROP Tables
PostgreSQL's DDL (Data Definition Language) supports all standard SQL operations plus PostgreSQL-specific features like generated columns, table inheritance, and advanced constraints.
CREATE TABLE
-- Comprehensive table creation
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
full_name TEXT,
bio TEXT,
avatar_url TEXT,
is_active BOOLEAN DEFAULT TRUE,
role VARCHAR(20) DEFAULT 'user' CHECK (role IN ('user', 'admin', 'moderator')),
login_count INTEGER DEFAULT 0,
last_login_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Table with foreign keys and cascading
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
title VARCHAR(300) NOT NULL,
slug VARCHAR(300) UNIQUE NOT NULL,
body TEXT NOT NULL,
excerpt TEXT,
status VARCHAR(20) DEFAULT 'draft' CHECK (status IN ('draft', 'published', 'archived')),
view_count INTEGER DEFAULT 0,
published_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Table with composite primary key and JSONB
CREATE TABLE post_metadata (
post_id BIGINT REFERENCES posts(id) ON DELETE CASCADE,
key VARCHAR(100) NOT NULL,
value JSONB NOT NULL,
PRIMARY KEY (post_id, key)
);
-- Create table with a generated column (PostgreSQL 12+)
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
tax_rate DECIMAL(4, 3) DEFAULT 0.08,
total_price DECIMAL(10, 2) GENERATED ALWAYS AS (price * (1 + tax_rate)) STORED
);
-- Create table if it does not exist
CREATE TABLE IF NOT EXISTS audit_log (
id BIGSERIAL PRIMARY KEY,
table_name TEXT NOT NULL,
action TEXT NOT NULL,
old_data JSONB,
new_data JSONB,
performed_by BIGINT REFERENCES users(id),
performed_at TIMESTAMPTZ DEFAULT NOW()
);
ALTER TABLE
-- Add a column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Add a column with a default
ALTER TABLE users ADD COLUMN metadata JSONB DEFAULT '{}';
-- Drop a column
ALTER TABLE users DROP COLUMN IF EXISTS bio;
-- Rename a column
ALTER TABLE users RENAME COLUMN full_name TO display_name;
-- Change column type (PostgreSQL can cast automatically for compatible types)
ALTER TABLE users ALTER COLUMN username TYPE VARCHAR(100);
-- Change column type with explicit cast
ALTER TABLE products ALTER COLUMN price TYPE NUMERIC(12, 2) USING price::NUMERIC(12, 2);
-- Set or drop NOT NULL
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
ALTER TABLE users ALTER COLUMN phone DROP NOT NULL;
-- Set a default value
ALTER TABLE users ALTER COLUMN role SET DEFAULT 'user';
-- Add a constraint
ALTER TABLE users ADD CONSTRAINT check_username_length CHECK (length(username) >= 3);
-- Add a unique constraint across multiple columns
ALTER TABLE posts ADD CONSTRAINT unique_user_slug UNIQUE (user_id, slug);
-- Drop a constraint
ALTER TABLE users DROP CONSTRAINT IF EXISTS check_username_length;
-- Rename the table
ALTER TABLE users RENAME TO app_users;
DROP TABLE
-- Drop a table
DROP TABLE IF EXISTS temp_data;
-- Drop with CASCADE (removes dependent foreign keys, views, etc.)
DROP TABLE IF EXISTS users CASCADE;
-- Drop multiple tables
DROP TABLE IF EXISTS posts, comments, likes CASCADE;
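When you want to empty a table without dropping its definition, TRUNCATE is much faster than DELETE. A sketch:

```sql
-- Remove all rows, keep the table
TRUNCATE TABLE audit_log;
-- Also reset SERIAL/IDENTITY counters and empty referencing tables
TRUNCATE TABLE users RESTART IDENTITY CASCADE;
```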
SELECT with JOINs
PostgreSQL's query planner is one of the best in the industry. It supports all standard JOIN types and optimizes them intelligently based on table statistics, available indexes, and query structure.
INNER JOIN, LEFT JOIN, and Multi-Table Queries
-- INNER JOIN: only matching rows from both tables
SELECT
u.username,
p.title,
p.published_at
FROM users u
INNER JOIN posts p ON u.id = p.user_id
WHERE p.status = 'published'
ORDER BY p.published_at DESC;
-- LEFT JOIN: all users, even those with no posts
SELECT
u.username,
COUNT(p.id) AS post_count,
COALESCE(SUM(p.view_count), 0) AS total_views
FROM users u
LEFT JOIN posts p ON u.id = p.user_id AND p.status = 'published'
GROUP BY u.id, u.username
ORDER BY total_views DESC;
-- Multi-table JOIN with aggregation
SELECT
u.username,
p.title,
COUNT(c.id) AS comment_count,
COUNT(DISTINCT l.user_id) AS like_count
FROM posts p
INNER JOIN users u ON p.user_id = u.id
LEFT JOIN comments c ON p.id = c.post_id
LEFT JOIN likes l ON p.id = l.post_id
WHERE p.status = 'published'
GROUP BY u.username, p.id, p.title
ORDER BY like_count DESC
LIMIT 20;
-- FULL OUTER JOIN: find orphaned records
SELECT
u.id AS user_id,
u.username,
p.id AS profile_id,
p.bio
FROM users u
FULL OUTER JOIN profiles p ON u.id = p.user_id
WHERE u.id IS NULL OR p.user_id IS NULL;
-- LATERAL JOIN: correlated subquery in the FROM clause
-- Get each user's 3 most recent posts
SELECT u.username, recent.title, recent.published_at
FROM users u
CROSS JOIN LATERAL (
SELECT title, published_at
FROM posts p
WHERE p.user_id = u.id AND p.status = 'published'
ORDER BY p.published_at DESC
LIMIT 3
) recent
ORDER BY u.username, recent.published_at DESC;
PostgreSQL-Specific Query Features
-- DISTINCT ON: get the first row per group (PostgreSQL only)
-- Most recent post per user:
SELECT DISTINCT ON (user_id)
user_id, title, published_at
FROM posts
WHERE status = 'published'
ORDER BY user_id, published_at DESC;
-- RETURNING clause: get affected rows after INSERT/UPDATE/DELETE
INSERT INTO users (username, email, password_hash)
VALUES ('newuser', 'new@example.com', 'hash123')
RETURNING id, username, created_at;
UPDATE posts SET view_count = view_count + 1
WHERE id = 42
RETURNING id, title, view_count;
-- UPSERT with ON CONFLICT
INSERT INTO page_views (page_url, view_date, count)
VALUES ('/blog/postgresql-guide', '2026-02-11', 1)
ON CONFLICT (page_url, view_date)
DO UPDATE SET count = page_views.count + 1
RETURNING *;
-- generate_series: create sequences of values
SELECT day::date
FROM generate_series('2026-01-01'::date, '2026-01-31'::date, '1 day'::interval) AS day;
Indexes: B-tree, GIN, GiST, and More
PostgreSQL supports more index types than any other open-source database. Choosing the right index type for your query patterns is one of the most impactful performance decisions you can make.
B-tree Indexes (Default)
B-tree is the default and most common index type. It handles equality and range queries efficiently and supports sorting.
-- Basic B-tree index
CREATE INDEX idx_posts_user_id ON posts (user_id);
-- Composite index (column order matters)
CREATE INDEX idx_posts_user_status ON posts (user_id, status);
-- Helps: WHERE user_id = 1 AND status = 'published'
-- Helps: WHERE user_id = 1 (uses leftmost column)
-- Does NOT help: WHERE status = 'published' (skips first column)
-- Unique index
CREATE UNIQUE INDEX idx_users_email ON users (email);
-- Partial index: only index a subset of rows
CREATE INDEX idx_published_posts ON posts (published_at DESC)
WHERE status = 'published';
-- Smaller, faster index for the most common query pattern
-- Expression index: index a computed value
CREATE INDEX idx_users_lower_email ON users (LOWER(email));
-- Now this query uses the index:
-- SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';
-- Covering index with INCLUDE (PostgreSQL 11+)
CREATE INDEX idx_posts_user_covering ON posts (user_id)
INCLUDE (title, published_at);
-- Index-only scan: avoids reading the table for these columns
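Indexes only pay off if queries actually use them. The statistics views can reveal indexes that are never scanned — a sketch, best run after the database has seen representative traffic:

```sql
-- Indexes with zero scans are candidates for removal
SELECT schemaname, relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;
```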
GIN Indexes (Generalized Inverted Index)
GIN indexes are designed for values that contain multiple elements: arrays, JSONB, and full-text search vectors. They index each element separately, so lookups for "does this array contain X?" are fast.
-- GIN index on a JSONB column
CREATE INDEX idx_metadata_gin ON post_metadata USING GIN (value);
-- GIN index for JSONB path operations
CREATE INDEX idx_users_metadata ON users USING GIN (metadata jsonb_path_ops);
-- Optimized for @> containment queries:
-- SELECT * FROM users WHERE metadata @> '{"plan": "premium"}';
-- GIN index on an array column
CREATE INDEX idx_articles_tags ON articles USING GIN (tags);
-- Enables: WHERE tags @> ARRAY['postgresql']
-- Enables: WHERE 'postgresql' = ANY(tags)
-- GIN index for full-text search
CREATE INDEX idx_posts_fts ON posts USING GIN (to_tsvector('english', title || ' ' || body));
GiST Indexes (Generalized Search Tree)
GiST indexes handle data types that do not fit neatly into a B-tree: geometric shapes, ranges, and full-text search. They support "nearest neighbor" searches and containment queries.
-- GiST index for range types
CREATE TABLE reservations (
id SERIAL PRIMARY KEY,
name TEXT,
duration TSTZRANGE NOT NULL
);
CREATE INDEX idx_reservations_duration ON reservations USING GIST (duration);
-- Enables: WHERE duration && '[2026-02-11, 2026-02-12)' (overlaps)
-- Enables: WHERE duration @> '2026-02-11 14:00:00'::timestamptz (contains)
-- GiST index for geographic points (PostGIS or built-in)
CREATE INDEX idx_locations_coords ON locations USING GIST (coordinates);
-- GiST for full-text search (smaller than GIN but slower for lookups)
CREATE INDEX idx_posts_fts_gist ON posts USING GIST (to_tsvector('english', body));
BRIN Indexes (Block Range Index)
BRIN indexes are extremely small and efficient for large tables where data is naturally ordered (such as time-series data where rows are inserted chronologically).
-- BRIN index on a timestamp column (ideal for append-only tables)
CREATE INDEX idx_logs_created_brin ON logs USING BRIN (created_at);
-- Often orders of magnitude smaller than a B-tree index on the same column
-- Works because rows are inserted in chronological order
-- BRIN with custom pages_per_range
CREATE INDEX idx_logs_time_brin ON logs USING BRIN (created_at)
WITH (pages_per_range = 32);
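One more type worth knowing: hash indexes, which handle only equality lookups but can beat a B-tree for long keys. They have been crash-safe and replicated since PostgreSQL 10. A sketch (the sessions.token column is illustrative):

```sql
CREATE INDEX idx_sessions_token ON sessions USING HASH (token);
-- Helps: WHERE token = 'abc123'
-- Does not help: range queries, sorting, or LIKE
```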
CTEs and Recursive Queries
Common Table Expressions (CTEs) break complex queries into named, readable steps. PostgreSQL also supports recursive CTEs for traversing hierarchical data like org charts, category trees, and graph structures.
-- Basic CTE: break a complex query into steps
WITH active_users AS (
SELECT id, username, email
FROM users
WHERE is_active = TRUE AND last_login_at > NOW() - INTERVAL '30 days'
),
user_post_stats AS (
SELECT
au.id,
au.username,
COUNT(p.id) AS post_count,
COALESCE(SUM(p.view_count), 0) AS total_views
FROM active_users au
LEFT JOIN posts p ON au.id = p.user_id AND p.status = 'published'
GROUP BY au.id, au.username
)
SELECT
username,
post_count,
total_views,
CASE
WHEN total_views > 10000 THEN 'influencer'
WHEN total_views > 1000 THEN 'active creator'
WHEN post_count > 0 THEN 'contributor'
ELSE 'reader'
END AS user_tier
FROM user_post_stats
ORDER BY total_views DESC;
-- Recursive CTE: traverse a category tree
WITH RECURSIVE category_tree AS (
-- Base case: top-level categories (no parent)
SELECT id, name, parent_id, name::TEXT AS full_path, 0 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive case: children of the current level
SELECT c.id, c.name, c.parent_id,
ct.full_path || ' > ' || c.name,
ct.depth + 1
FROM categories c
INNER JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT
REPEAT(' ', depth) || name AS indented_name,
full_path,
depth
FROM category_tree
ORDER BY full_path;
/* Result:
 indented_name | full_path                          | depth
---------------+------------------------------------+-------
 Books         | Books                              |     0
  Fiction      | Books > Fiction                    |     1
 Electronics   | Electronics                        |     0
  Computers    | Electronics > Computers            |     1
   Desktops    | Electronics > Computers > Desktops |     2
   Laptops     | Electronics > Computers > Laptops  |     2
  Phones       | Electronics > Phones               |     1
*/
-- Data-modifying CTE: delete rows and archive them in a single statement
WITH deleted_posts AS (
DELETE FROM posts
WHERE status = 'archived' AND updated_at < NOW() - INTERVAL '1 year'
RETURNING *
)
INSERT INTO deleted_posts_archive
SELECT * FROM deleted_posts;
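One planner detail: since PostgreSQL 12, plain CTEs can be inlined into the outer query instead of being computed up front. You can force either behavior explicitly — a sketch:

```sql
-- MATERIALIZED: compute the CTE once and reuse the result
WITH post_counts AS MATERIALIZED (
    SELECT user_id, COUNT(*) AS n FROM posts GROUP BY user_id
)
SELECT * FROM post_counts WHERE n > 10;

-- NOT MATERIALIZED: let the planner inline it (often enables index use)
WITH post_counts AS NOT MATERIALIZED (
    SELECT user_id, COUNT(*) AS n FROM posts GROUP BY user_id
)
SELECT * FROM post_counts WHERE user_id = 42;
```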
Window Functions
Window functions compute values across a set of rows related to the current row without collapsing them. PostgreSQL has excellent support for all SQL standard window functions plus some extras.
-- ROW_NUMBER: assign ranks within partitions
SELECT
username,
title,
view_count,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY view_count DESC) AS rank_per_user
FROM posts p
INNER JOIN users u ON p.user_id = u.id
WHERE p.status = 'published';
-- Running total of daily revenue
SELECT
order_date,
daily_total,
SUM(daily_total) OVER (ORDER BY order_date) AS running_total,
AVG(daily_total) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7day
FROM (
SELECT order_date, SUM(amount) AS daily_total
FROM orders
GROUP BY order_date
) daily
ORDER BY order_date;
-- LAG and LEAD: compare with previous/next rows
SELECT
order_date,
daily_total,
LAG(daily_total) OVER (ORDER BY order_date) AS prev_day,
daily_total - LAG(daily_total) OVER (ORDER BY order_date) AS day_over_day_change,
ROUND(
(daily_total - LAG(daily_total) OVER (ORDER BY order_date))
/ NULLIF(LAG(daily_total) OVER (ORDER BY order_date), 0) * 100, 1
) AS pct_change
FROM (
SELECT order_date, SUM(amount) AS daily_total
FROM orders GROUP BY order_date
) daily;
-- NTILE: divide rows into equal buckets
SELECT
username,
total_spent,
NTILE(4) OVER (ORDER BY total_spent DESC) AS spending_quartile
FROM (
SELECT u.username, SUM(o.amount) AS total_spent
FROM users u
INNER JOIN orders o ON u.id = o.user_id
GROUP BY u.username
) user_spending;
-- FIRST_VALUE: compare each row with the top row of its partition
-- (LAST_VALUE exists too, but needs an explicit frame: the default frame ends at the current row)
SELECT
product_name,
category,
price,
FIRST_VALUE(product_name) OVER (
PARTITION BY category ORDER BY price DESC
) AS most_expensive_in_category,
price - FIRST_VALUE(price) OVER (
PARTITION BY category ORDER BY price DESC
) AS diff_from_top
FROM products;
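ROW_NUMBER is one of three ranking functions that differ only in how they treat ties. A sketch comparing them side by side:

```sql
SELECT
    title,
    view_count,
    ROW_NUMBER() OVER (ORDER BY view_count DESC) AS row_num,    -- ties broken arbitrarily: 1,2,3,4
    RANK()       OVER (ORDER BY view_count DESC) AS rnk,        -- ties share a rank, with gaps: 1,2,2,4
    DENSE_RANK() OVER (ORDER BY view_count DESC) AS dense_rnk   -- ties share a rank, no gaps: 1,2,2,3
FROM posts
WHERE status = 'published';
```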
JSONB Operations
PostgreSQL's JSONB type lets you store semi-structured data alongside relational data. It is fully indexable, supports rich query operators, and can replace a document database for many use cases. JSONB stores data in a decomposed binary format, which is faster to query than the plain JSON type.
Storing and Querying JSONB
-- Create a table with JSONB
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
event_type VARCHAR(50) NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Insert JSONB data
INSERT INTO events (event_type, payload) VALUES
('user_signup', '{"user_id": 1, "email": "alice@example.com", "plan": "free", "source": "organic"}'),
('page_view', '{"user_id": 1, "url": "/dashboard", "duration_ms": 3400, "device": {"type": "desktop", "os": "macOS"}}'),
('purchase', '{"user_id": 1, "items": [{"id": 101, "name": "Pro Plan", "price": 29.99}], "total": 29.99}');
-- Extract values with -> (returns JSON) and ->> (returns text)
SELECT
event_type,
payload->'user_id' AS user_id_json, -- returns JSON: 1
payload->>'user_id' AS user_id_text, -- returns text: "1"
(payload->>'user_id')::INTEGER AS user_id -- cast to integer: 1
FROM events;
-- Nested access
SELECT payload->'device'->>'type' AS device_type
FROM events
WHERE event_type = 'page_view';
-- Access array elements (0-indexed)
SELECT payload->'items'->0->>'name' AS first_item_name
FROM events
WHERE event_type = 'purchase';
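To treat a JSONB array as rows — for joining or aggregating — jsonb_array_elements expands it. A sketch using the purchase payload above:

```sql
SELECT
    id,
    item->>'name' AS item_name,
    (item->>'price')::DECIMAL AS item_price
FROM events,
     jsonb_array_elements(payload->'items') AS item
WHERE event_type = 'purchase';
```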
JSONB Query Operators
-- @> containment: does the JSONB contain this subset?
SELECT * FROM events
WHERE payload @> '{"plan": "free"}';
-- <@ contained by: is the JSONB contained within this value?
SELECT * FROM events
WHERE payload <@ '{"user_id": 1, "email": "alice@example.com", "plan": "free", "source": "organic", "extra": true}'::jsonb;
-- ? key exists
SELECT * FROM events WHERE payload ? 'email';
-- ?| any key exists
SELECT * FROM events WHERE payload ?| ARRAY['email', 'phone'];
-- ?& all keys exist
SELECT * FROM events WHERE payload ?& ARRAY['user_id', 'email'];
-- jsonb_path_query (SQL/JSON path, PostgreSQL 12+)
SELECT jsonb_path_query(payload, '$.items[*].name') AS item_names
FROM events
WHERE event_type = 'purchase';
-- Filter with jsonb_path_exists
SELECT * FROM events
WHERE jsonb_path_exists(payload, '$.items[*] ? (@.price > 20)');
Modifying JSONB
-- Update a key
UPDATE events
SET payload = jsonb_set(payload, '{plan}', '"premium"')
WHERE event_type = 'user_signup' AND payload->>'user_id' = '1';
-- Add a new key
UPDATE events
SET payload = payload || '{"upgraded": true, "upgrade_date": "2026-02-11"}'::jsonb
WHERE event_type = 'user_signup';
-- Remove a key
UPDATE events
SET payload = payload - 'source'
WHERE event_type = 'user_signup';
-- Remove a nested key
UPDATE events
SET payload = payload #- '{device,os}'
WHERE event_type = 'page_view';
-- Aggregate JSONB: build JSON from query results
SELECT jsonb_agg(jsonb_build_object(
'id', id,
'type', event_type,
'timestamp', created_at
)) AS events_json
FROM events
WHERE created_at > NOW() - INTERVAL '1 day';
Indexing JSONB
-- GIN index for all JSONB operations
CREATE INDEX idx_events_payload ON events USING GIN (payload);
-- Supports: @>, ?, ?|, ?& operators
-- GIN with jsonb_path_ops (smaller, faster for @> only)
CREATE INDEX idx_events_payload_path ON events USING GIN (payload jsonb_path_ops);
-- Only supports @> containment, but uses less space and is faster
-- B-tree index on a specific extracted value
CREATE INDEX idx_events_user_id ON events ((payload->>'user_id'));
-- Enables: WHERE payload->>'user_id' = '1'
-- Expression index with cast
CREATE INDEX idx_events_total ON events (((payload->>'total')::DECIMAL))
WHERE event_type = 'purchase';
-- Enables: WHERE (payload->>'total')::DECIMAL > 100
Full-Text Search
PostgreSQL has built-in full-text search that rivals dedicated search engines for many use cases. No external service needed. It supports stemming, ranking, phrase matching, and multiple languages.
-- Basic full-text search
SELECT title, body
FROM posts
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'postgresql & performance');
-- tsvector: the document representation (normalized, stemmed tokens)
SELECT to_tsvector('english', 'PostgreSQL is a powerful relational database system');
-- Result: 'databas':6 'postgresql':1 'power':4 'relat':5 'system':7
-- tsquery: the search query
SELECT to_tsquery('english', 'database & (postgresql | mysql)');
-- Result: 'databas' & ( 'postgresql' | 'mysql' )
-- Search operators:
-- & = AND
-- | = OR
-- ! = NOT
-- <-> = FOLLOWED BY (phrase search)
-- Phrase search: words must be adjacent
SELECT * FROM posts
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'full <-> text <-> search');
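For user-facing search boxes, websearch_to_tsquery (PostgreSQL 11+) accepts web-style input — quoted phrases, OR, and a leading minus for negation — without raw user input ever causing a tsquery syntax error. A sketch:

```sql
SELECT websearch_to_tsquery('english', '"full text" search -mysql');
-- Roughly equivalent to: 'full' <-> 'text' & 'search' & !'mysql'

SELECT * FROM posts
WHERE to_tsvector('english', body) @@ websearch_to_tsquery('english', 'postgresql performance');
```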
-- Add a stored tsvector column for performance
ALTER TABLE posts ADD COLUMN search_vector TSVECTOR;
UPDATE posts SET search_vector =
setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(excerpt, '')), 'B') ||
setweight(to_tsvector('english', COALESCE(body, '')), 'C');
-- Create GIN index on the stored vector
CREATE INDEX idx_posts_search ON posts USING GIN (search_vector);
-- Search with ranking
SELECT
title,
ts_rank(search_vector, query) AS rank,
ts_headline('english', body, query, 'StartSel=<b>, StopSel=</b>, MaxWords=50') AS snippet
FROM posts, to_tsquery('english', 'postgresql & guide') AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 10;
-- Keep the search vector updated with a trigger
CREATE OR REPLACE FUNCTION posts_search_vector_trigger() RETURNS trigger AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', COALESCE(NEW.title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(NEW.excerpt, '')), 'B') ||
setweight(to_tsvector('english', COALESCE(NEW.body, '')), 'C');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_posts_search_vector
BEFORE INSERT OR UPDATE ON posts
FOR EACH ROW EXECUTE FUNCTION posts_search_vector_trigger();
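On PostgreSQL 12+, a generated column can replace the trigger approach entirely: the vector maintains itself. A sketch, using a new column name to avoid clashing with the manually maintained column above:

```sql
ALTER TABLE posts ADD COLUMN search_vector_auto TSVECTOR
    GENERATED ALWAYS AS (
        setweight(to_tsvector('english', COALESCE(title, '')), 'A') ||
        setweight(to_tsvector('english', COALESCE(body, '')), 'C')
    ) STORED;

CREATE INDEX idx_posts_search_auto ON posts USING GIN (search_vector_auto);
```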
Transactions and ACID
PostgreSQL is fully ACID-compliant. Its Multi-Version Concurrency Control (MVCC) implementation means readers never block writers and writers never block readers, which gives it excellent concurrent performance.
-- Basic transaction
BEGIN;
UPDATE accounts SET balance = balance - 500.00 WHERE id = 1;
UPDATE accounts SET balance = balance + 500.00 WHERE id = 2;
INSERT INTO transfers (from_account, to_account, amount, created_at)
VALUES (1, 2, 500.00, NOW());
COMMIT;
-- Transaction with error handling (in PL/pgSQL or application code)
BEGIN;
-- Debit sender
UPDATE accounts SET balance = balance - 500.00 WHERE id = 1;
-- Verify sufficient funds
DO $$
BEGIN
IF (SELECT balance FROM accounts WHERE id = 1) < 0 THEN
RAISE EXCEPTION 'Insufficient funds in account 1';
END IF;
END $$;
-- Credit receiver
UPDATE accounts SET balance = balance + 500.00 WHERE id = 2;
COMMIT;
-- If RAISE EXCEPTION fires, the transaction automatically rolls back
-- Savepoints for partial rollback
BEGIN;
INSERT INTO orders (user_id, amount) VALUES (1, 99.99);
SAVEPOINT after_order;
INSERT INTO order_items (order_id, product_id, quantity) VALUES (currval('orders_id_seq'), 999, 1);
-- product_id 999 does not exist, this will fail
ROLLBACK TO SAVEPOINT after_order;
-- The order INSERT is still intact
INSERT INTO order_items (order_id, product_id, quantity) VALUES (currval('orders_id_seq'), 42, 1);
COMMIT;
-- Isolation levels
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- Strictest isolation: transactions behave as if they ran sequentially
-- Use for financial operations and inventory management
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE; -- Row-level lock
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
-- Advisory locks: application-level locking
SELECT pg_advisory_lock(12345); -- Acquire lock (blocks until available)
-- ... do exclusive work ...
SELECT pg_advisory_unlock(12345); -- Release lock
-- Try lock (non-blocking)
SELECT pg_try_advisory_lock(12345); -- Returns TRUE if acquired, FALSE if not
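To watch MVCC itself in action, run two psql sessions side by side. A sketch (the accounts table is the one used in the examples above):

```sql
-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;  -- snapshot is taken at the first query

-- Session 2, meanwhile (autocommit):
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- commits without blocking Session 1

-- Session 1 again:
SELECT balance FROM accounts WHERE id = 1;  -- still sees its snapshot value, no blocking
COMMIT;
SELECT balance FROM accounts WHERE id = 1;  -- now sees Session 2's update
```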
Stored Procedures and Functions
PostgreSQL supports functions (return a value) and procedures (PostgreSQL 11+, can manage transactions). PL/pgSQL is the built-in procedural language, but PostgreSQL also supports PL/Python, PL/Perl, PL/V8 (JavaScript), and SQL functions.
-- Simple SQL function
CREATE OR REPLACE FUNCTION get_user_post_count(uid BIGINT)
RETURNS INTEGER AS $$
SELECT COUNT(*)::INTEGER FROM posts WHERE user_id = uid AND status = 'published';
$$ LANGUAGE sql STABLE;
-- Usage: SELECT username, get_user_post_count(id) FROM users;
-- PL/pgSQL function with logic
CREATE OR REPLACE FUNCTION transfer_funds(
from_id BIGINT,
to_id BIGINT,
transfer_amount DECIMAL
) RETURNS BOOLEAN AS $$
DECLARE
sender_balance DECIMAL;
BEGIN
-- Check sender balance
SELECT balance INTO sender_balance
FROM accounts WHERE id = from_id FOR UPDATE;
IF sender_balance IS NULL THEN
RAISE EXCEPTION 'Account % not found', from_id;
END IF;
IF sender_balance < transfer_amount THEN
RAISE EXCEPTION 'Insufficient funds: balance=%, requested=%', sender_balance, transfer_amount;
END IF;
-- Perform transfer
UPDATE accounts SET balance = balance - transfer_amount WHERE id = from_id;
UPDATE accounts SET balance = balance + transfer_amount WHERE id = to_id;
-- Log the transfer
INSERT INTO transfers (from_account, to_account, amount)
VALUES (from_id, to_id, transfer_amount);
RETURN TRUE;
END;
$$ LANGUAGE plpgsql;
-- Usage: SELECT transfer_funds(1, 2, 250.00);
-- Function returning a table (set-returning function)
CREATE OR REPLACE FUNCTION get_top_posts(limit_count INTEGER DEFAULT 10)
RETURNS TABLE (
post_id BIGINT,
title TEXT,
author TEXT,
view_count INTEGER,
published_at TIMESTAMPTZ
) AS $$
BEGIN
RETURN QUERY
SELECT p.id, p.title, u.username, p.view_count, p.published_at
FROM posts p
INNER JOIN users u ON p.user_id = u.id
WHERE p.status = 'published'
ORDER BY p.view_count DESC
LIMIT limit_count;
END;
$$ LANGUAGE plpgsql STABLE;
-- Usage: SELECT * FROM get_top_posts(5);
-- Procedure (PostgreSQL 11+): can commit/rollback within
CREATE OR REPLACE PROCEDURE cleanup_old_data(days_old INTEGER DEFAULT 365) AS $$
DECLARE
deleted_count INTEGER;
BEGIN
DELETE FROM audit_log WHERE performed_at < NOW() - (days_old || ' days')::INTERVAL;
GET DIAGNOSTICS deleted_count = ROW_COUNT;
RAISE NOTICE 'Deleted % audit log entries', deleted_count;
DELETE FROM sessions WHERE expires_at < NOW();
GET DIAGNOSTICS deleted_count = ROW_COUNT;
RAISE NOTICE 'Deleted % expired sessions', deleted_count;
COMMIT;
END;
$$ LANGUAGE plpgsql;
-- Usage: CALL cleanup_old_data(90);
Triggers
Triggers automatically execute functions in response to INSERT, UPDATE, or DELETE events on a table. They are useful for maintaining audit logs, enforcing complex business rules, and keeping derived data in sync.
-- Automatically update the updated_at timestamp
CREATE OR REPLACE FUNCTION update_timestamp() RETURNS trigger AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_users_updated_at
BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION update_timestamp();
CREATE TRIGGER trg_posts_updated_at
BEFORE UPDATE ON posts
FOR EACH ROW EXECUTE FUNCTION update_timestamp();
-- Audit log trigger: track all changes to a table
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS trigger AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
INSERT INTO audit_log (table_name, action, new_data, performed_at)
VALUES (TG_TABLE_NAME, 'INSERT', to_jsonb(NEW), NOW());
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
INSERT INTO audit_log (table_name, action, old_data, new_data, performed_at)
VALUES (TG_TABLE_NAME, 'UPDATE', to_jsonb(OLD), to_jsonb(NEW), NOW());
RETURN NEW;
ELSIF TG_OP = 'DELETE' THEN
INSERT INTO audit_log (table_name, action, old_data, performed_at)
VALUES (TG_TABLE_NAME, 'DELETE', to_jsonb(OLD), NOW());
RETURN OLD;
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_users_audit
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION audit_trigger();
-- Prevent deletion of admin users
CREATE OR REPLACE FUNCTION prevent_admin_delete() RETURNS trigger AS $$
BEGIN
    IF OLD.role = 'admin' THEN
        RAISE EXCEPTION 'Cannot delete admin user: %', OLD.username;
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_prevent_admin_delete
    BEFORE DELETE ON users
    FOR EACH ROW EXECUTE FUNCTION prevent_admin_delete();
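Triggers can also be switched off temporarily, which is handy during bulk loads or migrations when per-row audit overhead is unwanted. A short sketch using the triggers defined above:

```sql
-- Skip the audit trigger during a bulk import
ALTER TABLE users DISABLE TRIGGER trg_users_audit;
-- ... run the bulk load ...
ALTER TABLE users ENABLE TRIGGER trg_users_audit;
-- Remove a trigger entirely
DROP TRIGGER IF EXISTS trg_prevent_admin_delete ON users;
```

Rows modified while a trigger is disabled are never audited, so re-enable it in the same maintenance window.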
Performance Tuning with EXPLAIN ANALYZE
EXPLAIN ANALYZE is the single most important tool for understanding and optimizing query performance. It runs the query and shows the execution plan with actual timing data.
-- Basic EXPLAIN (shows plan without running the query)
EXPLAIN SELECT * FROM posts WHERE user_id = 42;
-- EXPLAIN ANALYZE (runs the query and shows actual times)
EXPLAIN ANALYZE SELECT * FROM posts WHERE user_id = 42;
-- Full output with buffers and timing
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT u.username, COUNT(p.id) AS post_count
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
GROUP BY u.id, u.username
ORDER BY post_count DESC
LIMIT 10;
/* How to read the output:
Limit  (cost=156.45..156.48 rows=10 width=44) (actual time=2.341..2.345 rows=10 loops=1)
  ->  Sort  (cost=156.45..159.20 rows=1100 width=44) (actual time=2.340..2.342 rows=10 loops=1)
        Sort Key: (count(p.id)) DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  HashAggregate  (cost=132.25..143.25 rows=1100 width=44) (actual time=2.125..2.257 rows=1100 loops=1)
              Group Key: u.id
              ->  Hash Right Join  (cost=36.50..120.75 rows=2300 width=40) (actual time=0.456..1.623 rows=2300 loops=1)
                    Hash Cond: (p.user_id = u.id)
                    ->  Seq Scan on posts p  (cost=0.00..62.00 rows=2300 width=12) (actual time=0.010..0.412 rows=2300 loops=1)
                    ->  Hash  (cost=22.00..22.00 rows=1100 width=36) (actual time=0.425..0.426 rows=1100 loops=1)
                          Buckets: 2048  Batches: 1  Memory Usage: 81kB
                          ->  Seq Scan on users u  (cost=0.00..22.00 rows=1100 width=36) (actual time=0.006..0.171 rows=1100 loops=1)
Planning Time: 0.285 ms
Execution Time: 2.401 ms
Key things to look for:
- Seq Scan on large tables: add an index
- Nested Loop with large outer table: consider Hash Join (may need more work_mem)
- Actual rows much higher than estimated: run ANALYZE to update statistics
- Sort using disk (external merge): increase work_mem
- Buffers shared read (high): data not in cache, may need more shared_buffers
*/
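Two of the items above point at work_mem. It can be trialed per-session before you commit a new value to postgresql.conf (the 64MB figure here is illustrative, not a recommendation):

```sql
-- Raise the per-operation sort/hash memory budget for this session only
SET work_mem = '64MB';
-- Re-run the slow query under EXPLAIN ANALYZE and compare, then revert
RESET work_mem;
```

If the Sort Method changes from "external merge Disk" to an in-memory sort, the higher setting is doing its job.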
-- Common performance fixes:
-- 1. Add missing index
CREATE INDEX idx_posts_user_id ON posts (user_id);
-- Re-run EXPLAIN to verify Seq Scan changed to Index Scan
-- 2. Update statistics for accurate planning
ANALYZE posts;
ANALYZE users;
-- 3. Find and fix slow queries
-- Enable pg_stat_statements extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT
    query,
    calls,
    ROUND(mean_exec_time::NUMERIC, 2) AS avg_ms,
    ROUND(total_exec_time::NUMERIC, 2) AS total_ms,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
-- 4. Find tables that need vacuuming
SELECT
    schemaname || '.' || relname AS table_name,  -- "table" is a reserved word, so use a different alias
    n_dead_tup AS dead_rows,
    n_live_tup AS live_rows,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
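A related housekeeping query finds indexes that are never used; they slow down writes and consume disk space for no benefit. A sketch against the pg_stat_user_indexes view:

```sql
-- Indexes with zero scans since statistics were last reset
-- (note: unique and primary key indexes enforce constraints even when never scanned)
SELECT
    schemaname || '.' || relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

Check the list against your workload's full query cycle (including monthly reports) before dropping anything.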
Table Partitioning
Partitioning splits a large table into smaller physical pieces while maintaining a single logical table. On tables with millions or billions of rows it can dramatically improve query performance through partition pruning, and it simplifies maintenance: old data can be removed by detaching or dropping a partition instead of running a slow bulk DELETE.
-- Range partitioning by date (most common)
CREATE TABLE events (
    id BIGSERIAL,
    event_type VARCHAR(50) NOT NULL,
    payload JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Create indexes on the partitioned table (applied to all partitions)
CREATE INDEX idx_events_created ON events (created_at);
CREATE INDEX idx_events_type ON events (event_type, created_at);
-- Queries automatically target the right partition (partition pruning)
EXPLAIN ANALYZE
SELECT * FROM events
WHERE created_at >= '2026-02-01' AND created_at < '2026-03-01';
-- Only scans events_2026_02, not all partitions
-- List partitioning by category
CREATE TABLE orders (
    id BIGSERIAL,
    region VARCHAR(20) NOT NULL,
    amount DECIMAL(10, 2),
    created_at TIMESTAMPTZ DEFAULT NOW()
) PARTITION BY LIST (region);
CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('us-east', 'us-west');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('eu-west', 'eu-central');
CREATE TABLE orders_ap PARTITION OF orders FOR VALUES IN ('ap-south', 'ap-east');
-- Hash partitioning for even distribution
CREATE TABLE sessions (
    id UUID PRIMARY KEY,
    user_id BIGINT NOT NULL,
    data JSONB,
    expires_at TIMESTAMPTZ
) PARTITION BY HASH (id);
CREATE TABLE sessions_0 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE sessions_1 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE sessions_2 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE sessions_3 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 3);
-- Detach old partitions for archival
ALTER TABLE events DETACH PARTITION events_2026_01;
-- Now events_2026_01 is a standalone table you can archive or drop
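Inserts fail if no partition covers the incoming row, so monthly partitions are usually created ahead of time by a scheduled job. A minimal sketch using dynamic SQL (how you schedule it, e.g. cron or the pg_cron extension, is up to you):

```sql
-- Create next month's partition of events if it does not exist yet
DO $$
DECLARE
    start_date DATE := date_trunc('month', NOW() + INTERVAL '1 month');
    end_date   DATE := start_date + INTERVAL '1 month';
    part_name  TEXT := 'events_' || to_char(start_date, 'YYYY_MM');
BEGIN
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF events FOR VALUES FROM (%L) TO (%L)',
        part_name, start_date, end_date
    );
END;
$$;
```

The %I and %L format specifiers quote the identifier and literals safely, which matters whenever table names are built dynamically.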
Replication
PostgreSQL supports both physical (streaming) replication for high availability and logical replication for selective data distribution. Streaming replication creates an exact copy of the entire database cluster on a standby server.
-- STREAMING REPLICATION SETUP
-- On the primary server (postgresql.conf):
-- wal_level = replica
-- max_wal_senders = 5
-- wal_keep_size = '1GB'
-- Create a replication user:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'repl_secure_password';
-- On the primary: allow replication in pg_hba.conf:
-- host replication replicator standby_ip/32 scram-sha-256
-- On the standby server:
-- pg_basebackup -h primary_ip -D /var/lib/postgresql/16/main -U replicator -Fp -Xs -P -R
-- The -R flag creates standby.signal and sets primary_conninfo automatically
-- Check replication status on primary:
SELECT
    client_addr,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn,
    pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
-- pg_wal_lsn_diff() returns the lag in bytes and works on all supported versions
-- (the bare "sent_lsn - replay_lsn" operator requires PostgreSQL 14+)
-- LOGICAL REPLICATION (selective table replication)
-- On the publisher (source database):
CREATE PUBLICATION my_pub FOR TABLE users, posts, orders;
-- Or replicate all tables:
-- CREATE PUBLICATION my_pub FOR ALL TABLES;
-- On the subscriber (target database):
-- First create the same table structures, then:
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher_ip dbname=myapp user=replicator password=secret'
    PUBLICATION my_pub;
-- Check subscription status:
SELECT * FROM pg_stat_subscription;
-- Advantages of logical replication over streaming:
-- 1. Replicate specific tables, not the entire database
-- 2. Replicate between different PostgreSQL versions
-- 3. Subscriber can have its own indexes and additional tables
-- 4. Multiple publishers can feed into one subscriber
Backup and Restore
PostgreSQL provides several backup strategies. The right choice depends on your database size, recovery time objectives, and whether you need point-in-time recovery.
# pg_dump: logical backup of a single database
# Creates a SQL script or custom-format archive
# Plain SQL dump (readable, but slower to restore)
pg_dump -h localhost -U myuser -d myapp > myapp_backup.sql
# Custom format (compressed, supports parallel restore)
pg_dump -h localhost -U myuser -Fc -d myapp -f myapp_backup.dump
# Dump specific tables
pg_dump -h localhost -U myuser -d myapp -t users -t posts -Fc -f tables_backup.dump
# Dump schema only (no data)
pg_dump -h localhost -U myuser -d myapp --schema-only -f schema.sql
# Dump data only (no schema)
pg_dump -h localhost -U myuser -d myapp --data-only -Fc -f data.dump
# pg_dumpall: backup all databases and global objects (roles, tablespaces)
pg_dumpall -h localhost -U postgres > full_cluster_backup.sql
# Restore from plain SQL
psql -h localhost -U myuser -d myapp < myapp_backup.sql
# Restore from custom format (supports parallel restore)
pg_restore -h localhost -U myuser -d myapp -j 4 myapp_backup.dump
# -j 4: use 4 parallel jobs for faster restore
# Restore specific tables from a custom dump
pg_restore -h localhost -U myuser -d myapp -t users -t posts myapp_backup.dump
# PHYSICAL BACKUP with pg_basebackup (for streaming replication and PITR)
pg_basebackup -h localhost -U replicator -D /backup/base -Fp -Xs -P
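Point-in-time recovery replays archived WAL on top of the base backup, which only works if continuous WAL archiving was already enabled on the primary before the incident. A minimal sketch (the /backup/wal path is an assumption chosen to match the restore_command shown below):

```
# postgresql.conf on the primary, set BEFORE taking the base backup:
# archive_mode = on
# archive_command = 'cp %p /backup/wal/%f'   # %p = WAL file path, %f = file name
```

In production, prefer an archive_command that copies to durable off-host storage rather than a local directory.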
# Point-in-Time Recovery (PITR):
# 1. Restore the base backup
# 2. Configure recovery_target_time in postgresql.conf:
# recovery_target_time = '2026-02-11 14:30:00'
# restore_command = 'cp /backup/wal/%f %p'
# 3. Create recovery.signal file
# 4. Start PostgreSQL — it replays WAL files up to the target time
# Automated backup script example:
# #!/bin/bash
# DATE=$(date +%Y%m%d_%H%M%S)
# pg_dump -Fc -h localhost -U myuser myapp -f "/backup/myapp_${DATE}.dump"
# find /backup -name "myapp_*.dump" -mtime +30 -delete # Keep 30 days
Security and Roles
PostgreSQL has a robust role-based access control system. Roles can own database objects, have login privileges, and be members of other roles. Following the principle of least privilege is essential for production security.
-- Create roles with specific privileges
CREATE ROLE readonly_user WITH LOGIN PASSWORD 'read_pass';
CREATE ROLE readwrite_user WITH LOGIN PASSWORD 'rw_pass';
CREATE ROLE app_admin WITH LOGIN PASSWORD 'admin_pass' CREATEDB;
-- Grant schema-level access
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT USAGE ON SCHEMA public TO readwrite_user;
-- Grant read-only access to all tables
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-- Grant read-write access
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO readwrite_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO readwrite_user;
-- Make grants apply to future tables too
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO readonly_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO readwrite_user;
-- Group roles (role inheritance)
CREATE ROLE developers;
GRANT readwrite_user TO developers;  -- members of developers inherit readwrite_user's privileges
GRANT developers TO alice, bob, charlie;  -- assumes these login roles already exist
-- Row-Level Security (RLS): restrict which rows users can see
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;
-- Users can only see their own posts
CREATE POLICY user_posts_policy ON posts
    FOR ALL
    USING (user_id = current_setting('app.current_user_id', TRUE)::BIGINT);
-- The second argument (missing_ok = TRUE) makes current_setting return NULL
-- instead of raising an error when the setting has not been set,
-- so unconfigured sessions simply see no rows.
-- Admins can see all posts
CREATE POLICY admin_posts_policy ON posts
    FOR ALL
    TO app_admin
    USING (TRUE);
-- Set the current user ID in your application before queries
-- (with a connection pooler, prefer SET LOCAL inside a transaction so the value cannot leak to the next client):
SET app.current_user_id = '42';
SELECT * FROM posts; -- Only returns posts where user_id = 42
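One caveat worth knowing: by default, row-level security policies are not applied to the table owner or to superusers, which can make a policy appear to "not work" while you test it as the owning role. This can be changed per table:

```sql
-- Apply RLS policies to the table owner as well
ALTER TABLE posts FORCE ROW LEVEL SECURITY;
```

Superusers and roles with the BYPASSRLS attribute still bypass policies even with FORCE enabled.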
-- Revoke public access (security best practice)
REVOKE ALL ON DATABASE myapp FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM PUBLIC;
GRANT USAGE ON SCHEMA public TO readonly_user, readwrite_user, app_admin;
-- Password security
ALTER ROLE myuser WITH PASSWORD 'new_secure_password';
-- Connection limits
ALTER ROLE myuser WITH CONNECTION LIMIT 10;
-- SSL enforcement in pg_hba.conf:
-- hostssl all all 0.0.0.0/0 scram-sha-256
PostgreSQL vs MySQL
PostgreSQL and MySQL are both excellent open-source databases, but they have different strengths. Here is an honest comparison to help you choose.
| Feature | PostgreSQL | MySQL |
|---|---|---|
| SQL Compliance | Highly standards-compliant | Partial compliance, more lenient |
| JSONB Support | Full JSONB with indexing and operators | JSON type, fewer query operators |
| Index Types | B-tree, GIN, GiST, SP-GiST, BRIN, Hash | B-tree, Hash, Full-text, Spatial |
| Full-Text Search | Built-in with ranking and stemming | Built-in but less flexible |
| Concurrency | MVCC, readers never block writers | InnoDB MVCC, but less granular |
| Partitioning | Declarative (range, list, hash) | Range, list, hash, key |
| Window Functions | Full support since version 8.4 | Added in MySQL 8.0 |
| CTEs | Full support including recursive | Added in MySQL 8.0 |
| Extensibility | Custom types, operators, extensions (PostGIS, pg_trgm) | Plugins, but fewer extension points |
| Ease of Setup | Slightly more configuration needed | Simpler initial setup |
| Hosting Availability | Most cloud providers, growing fast | Nearly every hosting provider |
| License | PostgreSQL License (permissive, BSD-like) | Dual-licensed: GPLv2 or commercial (Oracle) |
| Best For | Complex queries, data integrity, advanced features | Simple CRUD, read-heavy workloads, wide ecosystem |
Choose PostgreSQL when: You need JSONB, advanced indexing, full-text search, complex queries, strong data integrity, or extensibility. PostgreSQL is the better default choice for new projects in 2026.
Choose MySQL when: You need maximum hosting compatibility (shared hosting, WordPress), you have a primarily read-heavy workload, or your team already has deep MySQL expertise. MySQL remains an excellent database for straightforward applications.
Summary
PostgreSQL gives you a complete data platform in a single open-source package. Here is a practical roadmap for building your PostgreSQL expertise:
- Start with tables, data types, and basic queries — CREATE TABLE, SELECT, WHERE, JOINs. This handles 80% of daily work.
- Add indexes strategically — index your foreign keys and WHERE clause columns. Use EXPLAIN ANALYZE to verify.
- Use CTEs to organize complex queries — the WITH clause makes multi-step queries readable and maintainable.
- Learn JSONB for semi-structured data — it eliminates the need for a separate document database in most cases.
- Use window functions for analytics — ROW_NUMBER, LAG/LEAD, and running totals replace many self-join workarounds.
- Set up proper backups from day one — pg_dump with custom format and automated scheduling.
- Use transactions for multi-step operations — BEGIN/COMMIT ensures data integrity.
- Plan for scale with partitioning and replication — these features let PostgreSQL handle billions of rows.
PostgreSQL has been trusted by companies from startups to Fortune 500 enterprises for decades. Its combination of reliability, features, and performance makes it the strongest default choice for any new project that needs a relational database.
Learn More
- SQL Formatter — format and beautify your SQL queries
- SQL Playground — test SQL queries interactively in the browser
- JSON Formatter — format JSONB data for readability
- SQL Complete Guide — comprehensive guide to SQL fundamentals
- SQL Basics Cheat Sheet — quick reference for essential SQL commands
Frequently Asked Questions
What is the difference between PostgreSQL and MySQL?
PostgreSQL is more standards-compliant, supports advanced data types like JSONB and arrays, has stronger concurrency with MVCC, and offers features like CTEs, window functions, partial indexes, and full-text search out of the box. MySQL is generally simpler to set up, performs well for read-heavy workloads, and has a larger hosting ecosystem. PostgreSQL is preferred for complex queries, data integrity, and applications that need advanced features. MySQL is often chosen for simpler CRUD applications and when wide hosting support is required.
How do I use JSONB in PostgreSQL?
JSONB is a binary JSON data type in PostgreSQL that supports indexing and efficient querying. Store JSON with INSERT INTO table (data) VALUES ('{"key": "value"}'::jsonb). Query fields with the -> operator for JSON objects and ->> for text values: SELECT data->>'name' FROM table. Use @> for containment queries: WHERE data @> '{"status": "active"}'. Create GIN indexes on JSONB columns for fast lookups: CREATE INDEX idx ON table USING GIN (data). JSONB also supports jsonb_set() for updates, jsonb_array_elements() for array expansion, and the || operator for merging objects.
What types of indexes does PostgreSQL support?
PostgreSQL supports six index types: B-tree (default, for equality and range queries), Hash (for equality-only lookups), GIN (Generalized Inverted Index, for full-text search, JSONB, and arrays), GiST (Generalized Search Tree, for geometric data and range types), SP-GiST (for partitioned search trees like quad-trees), and BRIN (Block Range Index, for large naturally ordered tables). PostgreSQL also supports partial indexes (index a subset of rows), expression indexes (index computed values), covering indexes (INCLUDE clause), and unique indexes.
How do I optimize PostgreSQL query performance?
Start by running EXPLAIN ANALYZE on slow queries to see the execution plan. Look for sequential scans on large tables and add appropriate indexes. Use composite indexes for multi-column WHERE clauses, and partial indexes for frequently filtered subsets. Keep statistics updated with ANALYZE. Tune postgresql.conf settings like shared_buffers (25% of RAM), effective_cache_size (50-75% of RAM), and work_mem. Use connection pooling with PgBouncer. For large tables, consider partitioning. Monitor slow queries with pg_stat_statements and check for unused indexes with pg_stat_user_indexes.
How do I set up replication in PostgreSQL?
PostgreSQL supports streaming replication for high availability. On the primary server, set wal_level=replica, max_wal_senders=3, and configure pg_hba.conf to allow replication connections. On the standby, use pg_basebackup to create an initial copy, then configure primary_conninfo in postgresql.conf to point to the primary. PostgreSQL also supports logical replication for selective table replication using CREATE PUBLICATION and CREATE SUBSCRIPTION. For automatic failover, use tools like Patroni or repmgr.