Essential PostgreSQL commands, queries, and syntax for connecting, querying, managing tables, indexing, JSONB operations, and performance tuning. Bookmark this page for quick reference.
SELECT DISTINCT ON (col) * FROM table ORDER BY col, id;
Return the first row for each distinct value of col (PostgreSQL-specific)
INSERT, UPDATE, DELETE with RETURNING
-- Insert and return the new row
INSERT INTO users (name, email)
VALUES ('Alice', 'alice@example.com')
RETURNING id, name, created_at;
-- Update and return affected rows
UPDATE users SET name = 'Bob' WHERE id = 1
RETURNING *;
-- Delete and return deleted rows
DELETE FROM users WHERE active = false
RETURNING id, email;
-- Upsert (INSERT ... ON CONFLICT)
INSERT INTO users (email, name)
VALUES ('alice@example.com', 'Alice')
ON CONFLICT (email)
DO UPDATE SET name = EXCLUDED.name
RETURNING *;
JOINs
Query
Description
SELECT * FROM a INNER JOIN b ON a.id = b.a_id;
Return rows with matching values in both tables
SELECT * FROM a LEFT JOIN b ON a.id = b.a_id;
All rows from left table, matches from right (or NULL)
SELECT * FROM a RIGHT JOIN b ON a.id = b.a_id;
All rows from right table, matches from left (or NULL)
SELECT * FROM a FULL OUTER JOIN b ON a.id = b.a_id;
All rows from both tables, NULLs where no match
SELECT * FROM a CROSS JOIN b;
Cartesian product of both tables
SELECT * FROM a NATURAL JOIN b;
Join on all columns with the same name (use with caution)
SELECT e.name, m.name AS manager FROM employees e LEFT JOIN employees m ON e.manager_id = m.id;
Self join — join a table to itself using aliases
SELECT * FROM a JOIN LATERAL ( SELECT * FROM b WHERE b.a_id = a.id LIMIT 3 ) sub ON true;
LATERAL join — subquery can reference earlier FROM items
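The LATERAL pattern above is the idiomatic way to fetch the top N related rows per outer row. A minimal sketch, assuming hypothetical users and orders tables where orders.user_id references users.id:
-- Latest 3 orders for each user
SELECT u.name, o.total, o.created_at
FROM users u
JOIN LATERAL (
    SELECT total, created_at
    FROM orders
    WHERE orders.user_id = u.id
    ORDER BY created_at DESC
    LIMIT 3
) o ON true;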
Aggregate Functions
Query
Description
SELECT COUNT(*) FROM table;
Count total rows
SELECT COUNT(DISTINCT col) FROM table;
Count unique non-NULL values
SELECT SUM(amount) FROM orders;
Sum of a numeric column
SELECT AVG(price) FROM products;
Average of a numeric column
SELECT MIN(col), MAX(col) FROM table;
Minimum and maximum values
SELECT department, COUNT(*) FROM employees GROUP BY department;
Count rows per group
SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 5;
Filter groups after aggregation
SELECT ARRAY_AGG(name ORDER BY name) FROM users GROUP BY team;
Aggregate values into a sorted array
SELECT STRING_AGG(tag, ', ') FROM post_tags GROUP BY post_id;
Concatenate strings with a separator
SELECT JSON_AGG(row_to_json(t)) FROM my_table t;
Aggregate rows into a JSON array
SELECT BOOL_AND(active) FROM users;
TRUE if all values are true
SELECT BOOL_OR(active) FROM users;
TRUE if any value is true
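Aggregates also accept a FILTER clause, which computes several conditional aggregates in one pass instead of separate queries. A sketch, assuming a hypothetical orders table with status and amount columns:
-- Conditional aggregation with FILTER
SELECT
    COUNT(*) AS total_orders,
    COUNT(*) FILTER (WHERE status = 'shipped') AS shipped,
    SUM(amount) FILTER (WHERE status = 'refunded') AS refunded_amount
FROM orders;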
Window Functions
Query
Description
ROW_NUMBER() OVER (ORDER BY col)
Assign a unique sequential number to each row
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)
Row number within each partition (group)
RANK() OVER (ORDER BY score DESC)
Rank with gaps for ties (1, 2, 2, 4)
DENSE_RANK() OVER (ORDER BY score DESC)
Rank without gaps for ties (1, 2, 2, 3)
NTILE(4) OVER (ORDER BY score)
Distribute rows into N roughly equal buckets
LAG(col, 1) OVER (ORDER BY date)
Access the value from the previous row
LEAD(col, 1) OVER (ORDER BY date)
Access the value from the next row
FIRST_VALUE(col) OVER (PARTITION BY grp ORDER BY date)
First value in the window frame
LAST_VALUE(col) OVER ( PARTITION BY grp ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
Last value in the window frame (must set frame)
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
Running total (cumulative sum)
AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
7-row moving average (a 7-day average when there is one row per day)
PERCENT_RANK() OVER (ORDER BY score)
Relative rank as a percentage (0 to 1)
Window Function Example
-- Top 3 highest-paid employees per department
SELECT * FROM (
SELECT
name, department, salary,
ROW_NUMBER() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS rank
FROM employees
) ranked
WHERE rank <= 3;
Indexes
Command
Description
CREATE INDEX idx_name ON table(col);
Create a B-tree index (default type)
CREATE UNIQUE INDEX idx ON table(col);
Create a unique index
CREATE INDEX idx ON table(col1, col2);
Composite index on multiple columns
CREATE INDEX idx ON table(col) WHERE active = true;
Partial index — only indexes rows matching condition
CREATE INDEX idx ON table USING GIN(col);
GIN index — for arrays, JSONB, full-text search
CREATE INDEX idx ON table USING GiST(col);
GiST index — for geometric data, range types, full-text
CREATE INDEX idx ON table USING BRIN(col);
BRIN index — for large tables with naturally ordered data
CREATE INDEX idx ON table USING HASH(col);
Hash index — for equality-only comparisons
CREATE INDEX idx ON table(col DESC NULLS LAST);
Index with custom sort order
CREATE INDEX CONCURRENTLY idx ON table(col);
Build the index without blocking writes (slower; cannot run inside a transaction)
CREATE INDEX idx ON table((lower(col)));
Expression index — index on a function result
CREATE INDEX idx ON table USING GIN(col gin_trgm_ops);
Trigram index for fast LIKE/ILIKE queries (needs pg_trgm)
DROP INDEX idx;
Remove an index
DROP INDEX CONCURRENTLY idx;
Remove an index without locking
REINDEX INDEX idx;
Rebuild a corrupted or bloated index
SELECT pg_size_pretty(pg_indexes_size('table'));
Check total index size for a table
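Expression and partial indexes combine well in practice. A sketch, assuming a hypothetical users table with email, active, and last_login columns:
-- Case-insensitive unique emails via an expression index
CREATE UNIQUE INDEX idx_users_email_lower ON users ((lower(email)));
-- Partial index covering only the rows a hot query touches
CREATE INDEX idx_active_users ON users (last_login) WHERE active = true;
-- The planner can use idx_active_users for queries like:
SELECT * FROM users WHERE active = true ORDER BY last_login DESC LIMIT 10;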
JSONB Operations
Query
Description
col->'key'
Get JSON object field as JSON
col->>'key'
Get JSON object field as text
col->0
Get JSON array element by index (as JSON)
col#>'{a,b}'
Get value at nested path as JSON
col#>>'{a,b}'
Get value at nested path as text
col @> '{"key": "value"}'
Contains operator — does left contain right?
col <@ '{"key": "value"}'
Contained by operator — is left contained in right?
col ? 'key'
Does the key exist at the top level?
col ?| array['a','b']
Does any of these keys exist?
col ?& array['a','b']
Do all of these keys exist?
col || '{"new": "data"}'
Concatenate/merge JSONB values
col - 'key'
Delete a key from JSONB object
col #- '{a,b}'
Delete value at a nested path
jsonb_set(col, '{key}', '"value"')
Set a value at a specific path
jsonb_strip_nulls(col)
Remove all null-valued keys
jsonb_each(col)
Expand top-level JSON object to key-value rows
jsonb_array_elements(col)
Expand a JSON array to a set of rows
jsonb_object_keys(col)
Return set of top-level keys
jsonb_pretty(col)
Format JSONB as indented text (for display)
to_jsonb(record)
Convert a row or value to JSONB
JSONB Query Examples
-- Find users where metadata contains a specific key-value
SELECT * FROM users WHERE metadata @> '{"role": "admin"}';
-- Extract nested value
SELECT metadata#>>'{address,city}' AS city FROM users;
-- Update a nested JSONB field
UPDATE users
SET metadata = jsonb_set(metadata, '{theme}', '"dark"')
WHERE id = 1;
-- Index for fast JSONB containment queries
CREATE INDEX idx_meta ON users USING GIN(metadata);
-- Expand array elements and join
SELECT u.name, elem->>'skill' AS skill
FROM users u,
jsonb_array_elements(u.metadata->'skills') AS elem;
Common Table Expressions (CTEs)
Query
Description
WITH cte AS ( SELECT ... FROM table ) SELECT * FROM cte;
Basic CTE — named temporary result set
WITH a AS (...), b AS (...) SELECT * FROM a JOIN b ON ...;
Multiple CTEs in a single query
WITH RECURSIVE tree AS ( SELECT id, name, parent_id, 0 AS depth FROM categories WHERE parent_id IS NULL UNION ALL SELECT c.id, c.name, c.parent_id, t.depth + 1 FROM categories c JOIN tree t ON c.parent_id = t.id ) SELECT * FROM tree;
Recursive CTE — traverse hierarchical data (tree/graph)
WITH deleted AS ( DELETE FROM logs WHERE created_at < NOW() - INTERVAL '90 days' RETURNING * ) SELECT COUNT(*) FROM deleted;
Writable CTE — use DML statements inside CTEs
CTE Example: Running Total
-- Compute per-day totals and a cumulative running total
WITH daily_sales AS (
SELECT date_trunc('day', created_at)::date AS day,
SUM(amount) AS total
FROM orders
GROUP BY 1
)
SELECT day, total,
SUM(total) OVER (ORDER BY day) AS running_total
FROM daily_sales
ORDER BY day;
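The recursive form from the table above, expanded into a readable example; a sketch assuming a hypothetical categories table with id, name, and parent_id columns:
-- Walk a category tree and build a breadcrumb path per row
WITH RECURSIVE tree AS (
    SELECT id, name, name::text AS path
    FROM categories
    WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.name, t.path || ' > ' || c.name
    FROM categories c
    JOIN tree t ON c.parent_id = t.id
)
SELECT id, path FROM tree ORDER BY path;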
Transactions
Command
Description
BEGIN;
Start a transaction
COMMIT;
Save all changes made in the transaction
ROLLBACK;
Undo all changes made in the transaction
SAVEPOINT sp1;
Create a savepoint within a transaction
ROLLBACK TO sp1;
Roll back to a savepoint (partial undo)
RELEASE SAVEPOINT sp1;
Destroy a savepoint (keep changes)
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Set isolation level for the current transaction
SELECT * FROM table FOR UPDATE;
Lock selected rows until transaction ends
SELECT * FROM table FOR UPDATE SKIP LOCKED;
Skip already-locked rows (queue pattern)
SELECT * FROM table FOR SHARE;
Shared lock — prevent updates but allow reads
LOCK TABLE t IN EXCLUSIVE MODE;
Explicitly lock an entire table
Transaction Example
BEGIN;
-- Transfer $100 from account A to account B
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- Verify the transfer
SELECT id, balance FROM accounts WHERE id IN (1, 2);
-- If everything looks correct
COMMIT;
-- If something went wrong
-- ROLLBACK;
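FOR UPDATE SKIP LOCKED from the table above enables a simple job-queue pattern: concurrent workers each claim a different row without blocking one another. A sketch, assuming a hypothetical jobs table with status, payload, and created_at columns:
BEGIN;
-- Claim one pending job; rows locked by other workers are skipped
SELECT id, payload FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... process the job, then mark it done using the claimed id:
-- UPDATE jobs SET status = 'done' WHERE id = :claimed_id;
COMMIT;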
User/Role Management
Command
Description
CREATE ROLE myuser LOGIN PASSWORD 'secret';
Create a new user (role with login)
CREATE ROLE myrole;
Create a role without login ability
CREATE ROLE admin SUPERUSER LOGIN PASSWORD 'pass';
Create a superuser role
ALTER ROLE myuser WITH PASSWORD 'newpass';
Change a user's password
ALTER ROLE myuser CREATEDB;
Grant ability to create databases
DROP ROLE myuser;
Delete a role
GRANT SELECT ON table TO myuser;
Grant read access on a table
GRANT ALL PRIVILEGES ON table TO myuser;
Grant full access on a table
GRANT SELECT ON ALL TABLES IN SCHEMA public TO myuser;
Grant read access to all tables in a schema
GRANT myrole TO myuser;
Add a user to a role (role inheritance)
REVOKE INSERT ON table FROM myuser;
Remove a specific privilege
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO myuser;
Auto-grant privileges on future tables
SELECT * FROM pg_roles;
List all roles and their attributes
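The commands above are typically combined to set up a shared read-only role that login users inherit. A sketch with hypothetical role names:
-- Create a read-only reporting role and a user that inherits it
CREATE ROLE readonly NOLOGIN;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly;
CREATE ROLE report_user LOGIN PASSWORD 'secret' IN ROLE readonly;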
Backup & Restore
Command
Description
pg_dump mydb > backup.sql
Dump a database to a SQL file
pg_dump -Fc mydb > backup.dump
Dump in custom (compressed) format
pg_dump -t tablename mydb > table.sql
Dump a single table
pg_dump --schema-only mydb > schema.sql
Dump only the schema (no data)
pg_dump --data-only mydb > data.sql
Dump only the data (no schema)
pg_dump -Fd mydb -j 4 -f backup_dir
Parallel dump to a directory (4 jobs)
pg_dumpall > all_databases.sql
Dump all databases including roles
psql mydb < backup.sql
Restore from a SQL dump
pg_restore -d mydb backup.dump
Restore from a custom-format dump
pg_restore -d mydb -j 4 backup_dir
Parallel restore from a directory dump
pg_restore -l backup.dump
List contents of a dump file
pg_restore --clean -d mydb backup.dump
Drop objects before restoring
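A typical round trip chains these together: dump in custom format, inspect the archive, then restore into a fresh database. A sketch with hypothetical host, user, and database names:
# Compressed nightly backup
pg_dump -Fc -h db.example.com -U app mydb > mydb_$(date +%F).dump
# Inspect the archive before restoring
pg_restore -l mydb_backup.dump
# Restore into a fresh database with 4 parallel jobs
createdb mydb_restore
pg_restore -d mydb_restore -j 4 mydb_backup.dump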
Performance (EXPLAIN ANALYZE)
Command
Description
EXPLAIN SELECT * FROM table WHERE ...;
Show the query execution plan (estimated costs)
EXPLAIN ANALYZE SELECT * FROM table WHERE ...;
Execute the query and show actual timing and row counts
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
Show buffer usage (cache hits vs disk reads)
EXPLAIN (ANALYZE, FORMAT JSON) SELECT ...;
Output the plan in JSON format (for tools)
EXPLAIN (ANALYZE, VERBOSE) SELECT ...;
Verbose output with column details
ANALYZE table;
Update table statistics for the query planner
VACUUM table;
Reclaim storage from dead rows
VACUUM FULL table;
Full vacuum — rewrites the table (locks table!)
VACUUM ANALYZE table;
Vacuum and update statistics in one command
Reading EXPLAIN Output
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id)
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.active = true
GROUP BY u.name;
-- Key things to look for:
-- Seq Scan = Full table scan (may need an index)
-- Index Scan = Using an index (good)
-- Bitmap Scan = Combining multiple indexes
-- Hash Join = In-memory hash table join
-- Nested Loop = Row-by-row join (watch for large tables)
-- Sort = External sort (check work_mem)
-- actual time = Real execution time in milliseconds
-- rows = Actual vs estimated row counts
-- Buffers: shared hit = Read from cache (fast)
-- Buffers: shared read = Read from disk (slow)
Useful Performance Queries
-- Find the slowest queries (requires pg_stat_statements)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Check table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
-- Find unused indexes
SELECT indexrelid::regclass AS index,
relid::regclass AS table,
idx_scan AS index_scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
-- Check active connections
SELECT pid, usename, application_name, state, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
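When pg_stat_activity reveals a stuck backend, it can be cancelled or terminated with built-in functions (the pid 12345 below is a placeholder taken from that view):
-- Cancel the running query on a backend (gentler)
SELECT pg_cancel_backend(12345);
-- Terminate the whole connection
SELECT pg_terminate_backend(12345);
-- Terminate everything running longer than 5 minutes (use with care)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 minutes'
  AND pid <> pg_backend_pid();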
Frequently Asked Questions
What is the difference between PostgreSQL and MySQL?
PostgreSQL is a fully ACID-compliant, object-relational database that supports advanced features like JSONB, window functions, CTEs, full-text search, and custom types out of the box. MySQL is simpler to set up and historically faster for read-heavy workloads, but PostgreSQL excels at complex queries, data integrity, and extensibility. PostgreSQL also has better standards compliance and supports advanced indexing methods like GIN, GiST, and BRIN.
How do I connect to a PostgreSQL database from the command line?
Use the psql command-line client: psql -h hostname -p 5432 -U username -d database_name. You can also use a connection string: psql postgresql://user:password@host:5432/dbname. Once connected, use backslash commands like \dt to list tables, \d table_name to describe a table, and \q to quit. Set the PGPASSWORD environment variable or use a .pgpass file to avoid typing the password each time.
How do I improve query performance in PostgreSQL?
Start by running EXPLAIN ANALYZE on slow queries to see the execution plan and actual timing. Create indexes on columns used in WHERE, JOIN, and ORDER BY clauses. Use partial indexes for filtered queries and composite indexes for multi-column conditions. Ensure the query planner has up-to-date statistics by running ANALYZE. Avoid SELECT * in favor of specific columns, use connection pooling (PgBouncer), and consider partitioning large tables. Monitor with pg_stat_statements to find your slowest queries.