Elasticsearch: The Complete Guide for 2026

Published February 12, 2026 · 28 min read

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It stores JSON documents, indexes every field automatically, and returns search results in milliseconds — even across billions of documents. From powering site search at Wikipedia and GitHub to processing petabytes of logs at Netflix and Uber, Elasticsearch is the industry standard for full-text search, log analytics, and real-time aggregations.

This guide covers Elasticsearch from core concepts through production deployment: indexes, mappings, every query type you will use, aggregations, analyzers, performance tuning, Kibana integration, security, and how it compares to alternatives.

⚙ Related: Format your Elasticsearch JSON responses with our JSON Formatter and deploy Elasticsearch in containers with Docker Compose.

Table of Contents

  1. What is Elasticsearch
  2. Core Concepts
  3. Setting Up Elasticsearch
  4. CRUD Operations
  5. Search Queries
  6. Aggregations
  7. Mappings and Analyzers
  8. Full-Text Search Best Practices
  9. Index Templates and Lifecycle Management
  10. Performance Tuning
  11. Elasticsearch with Kibana
  12. Security
  13. Common Pitfalls and Troubleshooting
  14. Elasticsearch vs Alternatives
  15. Frequently Asked Questions

1. What is Elasticsearch

Elasticsearch is an open-source, distributed, RESTful search engine. You interact with it entirely through HTTP endpoints that accept and return JSON. Unlike traditional databases optimized for transactional writes, Elasticsearch is optimized for search: it builds inverted indexes on your data so full-text queries, filtering, and aggregations run in milliseconds.
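The speed comes from the inverted index: a map from each token to the set of documents containing it. Here is a toy Python sketch of the idea (illustrative only, not Lucene's actual implementation; real analyzers also handle punctuation, stop words, and stemming):

```python
from collections import defaultdict

# Toy inverted index: analysis here is just lowercase + whitespace split.
def analyze(text):
    return text.lower().split()

def build_inverted_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

docs = {
    1: "Wireless Keyboard",
    2: "Wireless Mouse",
    3: "USB-C Hub",
}
index = build_inverted_index(docs)

# A query for "wireless" is one dictionary lookup, not a scan of every doc:
print(sorted(index["wireless"]))  # [1, 2]
```

Because the token-to-documents map is built at index time, query time does not depend on scanning document bodies, which is why search stays fast as the corpus grows.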

When to use Elasticsearch:

  - Full-text search with relevance scoring, fuzzy matching, and autocomplete
  - Log and metrics analytics (the ELK stack)
  - Real-time aggregations and dashboards over large datasets
  - Geospatial queries

Avoid using it as the primary store for transactional data that needs ACID guarantees or complex joins; the standard pattern is to run it alongside a relational database that remains the source of truth.

2. Core Concepts

A cluster is a group of one or more nodes (server processes) that together hold your data. Data lives in indexes, which are collections of JSON documents. Each index is split into shards so it can be distributed across nodes, and each shard can have replicas for redundancy and read throughput. A mapping is an index's schema: it defines each field's type and how that field is analyzed.

3. Setting Up Elasticsearch

Docker (Recommended for Development)

# Single node for development
docker run -d --name elasticsearch \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -v es-data:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch:8.17.0

# Verify it is running
curl http://localhost:9200

Install on Ubuntu/Debian

# Import the Elastic GPG key and add the repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
  sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] \
  https://artifacts.elastic.co/packages/8.x/apt stable main" | \
  sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt update && sudo apt install elasticsearch
sudo systemctl enable elasticsearch && sudo systemctl start elasticsearch
curl -k -u elastic https://localhost:9200   # use the elastic password printed during install

4. CRUD Operations

Elasticsearch uses a RESTful API. All operations are HTTP requests with JSON bodies.

Create an Index

# Create an index with settings
PUT /products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

Index (Create/Update) a Document

# Index a document with an explicit ID
PUT /products/_doc/1
{
  "name": "Wireless Keyboard",
  "category": "electronics",
  "price": 49.99,
  "in_stock": true,
  "created_at": "2026-02-12"
}

# Auto-generate an ID
POST /products/_doc
{
  "name": "USB-C Hub",
  "category": "electronics",
  "price": 29.99,
  "in_stock": true
}

Read a Document

# Get a document by ID
GET /products/_doc/1

# Get only specific fields
GET /products/_doc/1?_source_includes=name,price

Update a Document

# Partial update (merges fields)
POST /products/_update/1
{
  "doc": { "price": 44.99, "in_stock": false }
}

# Update with script
POST /products/_update/1
{
  "script": {
    "source": "ctx._source.price -= params.discount",
    "params": { "discount": 5 }
  }
}

Delete a Document

# Delete by ID
DELETE /products/_doc/1

# Delete by query
POST /products/_delete_by_query
{
  "query": {
    "term": { "in_stock": false }
  }
}

5. Search Queries

Elasticsearch queries fall into two categories: full-text queries that analyze the search term and score results by relevance, and term-level queries that look for exact values without analysis.

Match Query (Full-Text)

# Search for documents containing "wireless keyboard"
GET /products/_search
{
  "query": {
    "match": {
      "name": "wireless keyboard"
    }
  }
}
# Matches "Wireless Keyboard", "keyboard wireless", "wireless gaming keyboard"

Multi-Match Query

# Search across multiple fields
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless keyboard",
      "fields": ["name^3", "description", "category"],
      "type": "best_fields"
    }
  }
}
# ^3 boosts matches in "name" by 3x

Bool Query (Combine Conditions)

# Combine multiple conditions
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "keyboard" } }
      ],
      "filter": [
        { "term": { "category": "electronics" } },
        { "range": { "price": { "gte": 20, "lte": 100 } } }
      ],
      "should": [
        { "term": { "in_stock": true } }
      ],
      "must_not": [
        { "term": { "category": "refurbished" } }
      ]
    }
  }
}
# must: required, contributes to score
# filter: required, does NOT contribute to score (faster, cached)
# should: optional, boosts score if matched
# must_not: excludes documents

Term Query (Exact Match)

# Exact match on keyword fields (no analysis)
GET /products/_search
{
  "query": {
    "term": { "category": "electronics" }
  }
}

# Multiple exact values
GET /products/_search
{
  "query": {
    "terms": { "category": ["electronics", "accessories"] }
  }
}

Range Query

# Numeric range
GET /products/_search
{
  "query": {
    "range": {
      "price": { "gte": 10, "lte": 50 }
    }
  }
}

# Date range
GET /logs/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "2026-02-01",
        "lte": "2026-02-12",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

Wildcard and Prefix Queries

# Wildcard: * matches any characters, ? matches one
GET /products/_search
{
  "query": {
    "wildcard": { "name": "key*" }
  }
}

# Prefix: faster than wildcard for starts-with
GET /products/_search
{
  "query": {
    "prefix": { "name.keyword": "Wire" }
  }
}

Pagination and Sorting

# Paginate results
GET /products/_search
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 20,
  "sort": [
    { "price": "asc" },
    { "_score": "desc" }
  ],
  "_source": ["name", "price", "category"]
}
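The from/size values map directly to page arithmetic. A small illustrative Python helper (the function name `page_params` is ours, not an API):

```python
# from/size pagination math. Note: by default Elasticsearch rejects requests
# where from + size exceeds 10,000 (index.max_result_window); use
# search_after for deeper paging.
def page_params(page, size=20):
    return {"from": (page - 1) * size, "size": size}

print(page_params(1))  # {'from': 0, 'size': 20}
print(page_params(3))  # {'from': 40, 'size': 20}
```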

6. Aggregations

Aggregations compute analytics over your data — counts, averages, histograms, and nested breakdowns.

Terms Aggregation (Group By)

# Count products per category
GET /products/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 20 }
    }
  }
}
# Returns buckets like: { "key": "electronics", "doc_count": 150 }, ...

Metric Aggregations

# Average, min, max, sum
GET /products/_search
{
  "size": 0,
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "max_price": { "max": { "field": "price" } },
    "total_revenue": { "sum": { "field": "price" } },
    "price_stats": { "stats": { "field": "price" } }
  }
}

Date Histogram

# Orders per day over the last month
GET /orders/_search
{
  "size": 0,
  "query": {
    "range": { "created_at": { "gte": "now-30d" } }
  },
  "aggs": {
    "orders_per_day": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "day"
      },
      "aggs": {
        "daily_revenue": { "sum": { "field": "total" } }
      }
    }
  }
}

Nested Aggregations

# Average price per category
GET /products/_search
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "category", "size": 10 },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}

7. Mappings and Analyzers

Mappings define field types. Getting mappings right is critical — you cannot change a field's type after the index is created without reindexing.

Common Field Types

PUT /products
{
  "mappings": {
    "properties": {
      "name":        { "type": "text", "analyzer": "standard" },
      "category":    { "type": "keyword" },
      "description": { "type": "text" },
      "price":       { "type": "float" },
      "in_stock":    { "type": "boolean" },
      "created_at":  { "type": "date", "format": "yyyy-MM-dd" },
      "tags":        { "type": "keyword" },
      "location":    { "type": "geo_point" }
    }
  }
}
# text: analyzed for full-text search (tokenized and lowercased; stemming requires a language analyzer)
# keyword: exact values for filtering, sorting, aggregations
# Use both when you need search AND filtering on the same field

Multi-Field Mapping

# Map a field as both text and keyword
"name": {
  "type": "text",
  "fields": {
    "keyword": { "type": "keyword", "ignore_above": 256 }
  }
}
# Search: match on "name" (analyzed)
# Sort/aggregate: use "name.keyword" (exact)

Custom Analyzers

PUT /articles
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_english": {
          "type": "custom", "tokenizer": "standard",
          "filter": ["lowercase", "english_stop", "english_stemmer"]
        }
      },
      "filter": {
        "english_stop":    { "type": "stop", "stopwords": "_english_" },
        "english_stemmer": { "type": "stemmer", "language": "english" }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "custom_english" },
      "body":  { "type": "text", "analyzer": "custom_english" }
    }
  }
}

# Test your analyzer
POST /articles/_analyze
{ "analyzer": "custom_english", "text": "The quick brown foxes are jumping" }
# Tokens: ["quick", "brown", "fox", "jump"]

8. Full-Text Search Best Practices

Use match and multi_match for relevance-scored search, and reserve term-level queries for keyword fields. Put exact-value conditions (category, status, date ranges) in a bool query's filter clause, where they are cached and skip scoring. Map fields you need to both search and sort on as text with a keyword sub-field. Boost your most important fields in multi_match (for example, "name^3"), and test every custom analyzer with the _analyze API before indexing real data.

9. Index Templates and Lifecycle Management

Index Templates

# Create a template for all log indexes
PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs_policy"
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message":    { "type": "text" },
        "level":      { "type": "keyword" },
        "service":    { "type": "keyword" }
      }
    }
  },
  "priority": 100
}
# Any new index matching "logs-*" inherits these settings

Index Lifecycle Management (ILM)

# Define a lifecycle policy: hot -> warm -> cold -> delete
PUT /_ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot":  { "actions": { "rollover": { "max_size": "50gb", "max_age": "7d" } } },
      "warm": { "min_age": "7d", "actions": { "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 } } },
      "cold": { "min_age": "30d", "actions": { "freeze": {} } },
      "delete": { "min_age": "90d", "actions": { "delete": {} } }
    }
  }
}

Reindexing

# Reindex data from one index to another (useful for mapping changes)
POST /_reindex
{
  "source": { "index": "products_v1" },
  "dest":   { "index": "products_v2" }
}

# Use aliases for zero-downtime reindexing
POST /_aliases
{
  "actions": [
    { "remove": { "index": "products_v1", "alias": "products" } },
    { "add":    { "index": "products_v2", "alias": "products" } }
  ]
}

10. Performance Tuning

Bulk Operations

# Bulk API: index, update, or delete many documents in one request
POST /_bulk
{"index": {"_index": "products", "_id": "1"}}
{"name": "Keyboard", "price": 49.99, "category": "electronics"}
{"index": {"_index": "products", "_id": "2"}}
{"name": "Mouse", "price": 29.99, "category": "electronics"}
{"delete": {"_index": "products", "_id": "3"}}

# Always use bulk for batch operations
# Optimal batch size: 5-15 MB per request, or 1000-5000 documents
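To make the wire format concrete, here is a Python sketch that builds a bulk body by hand (illustrative only; real client libraries, such as elasticsearch-py's bulk helpers, construct and chunk this for you):

```python
import json

# The _bulk body is NDJSON: each action line is followed (for index
# operations) by its source line, and the whole body must end with a newline.
def bulk_index_body(index_name, docs):
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

body = bulk_index_body("products", [
    ("1", {"name": "Keyboard", "price": 49.99}),
    ("2", {"name": "Mouse", "price": 29.99}),
])
print(body)
```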

Refresh Interval

# Documents are not searchable until a refresh (default: 1 second)
# For heavy indexing, increase the interval
PUT /products/_settings
{
  "index.refresh_interval": "30s"
}

# Disable refresh during bulk loading, re-enable after
PUT /products/_settings
{ "index.refresh_interval": "-1" }
# ... bulk index millions of documents ...
PUT /products/_settings
{ "index.refresh_interval": "1s" }
POST /products/_refresh

JVM Heap and System Settings

# Set heap to 50% of available RAM, max 31 GB
# In jvm.options or ES_JAVA_OPTS:
-Xms16g
-Xmx16g
# Always set Xms and Xmx to the same value (avoid resizing)

# System settings for production (in /etc/sysctl.conf):
vm.max_map_count=262144
vm.swappiness=1

# File descriptor limit (in /etc/security/limits.conf):
elasticsearch  -  nofile  65535

Performance Checklist

  - Batch writes through the _bulk API (5-15 MB per request)
  - Raise or disable index.refresh_interval during heavy ingestion
  - Keep shards in the 10-50 GB range; avoid thousands of tiny shards
  - Set Xms and Xmx equal, at 50% of RAM and no more than 31 GB
  - Use filter context instead of must for clauses that do not need scoring
  - Enable the slow log to catch expensive queries early

11. Elasticsearch with Kibana

Kibana is the official visualization platform for Elasticsearch. It provides a web UI for querying, building dashboards, and managing your cluster.

# Create a data view (formerly index pattern) in Kibana:
# 1. Go to Stack Management > Data Views
# 2. Create data view: "logs-*"
# 3. Set @timestamp as the time field
# 4. Go to Discover to explore your data

# Kibana also supports Canvas (pixel-perfect reports),
# Lens (drag-and-drop visualizations), and Maps (geospatial data)

12. Security

Elasticsearch 8.x enables security by default. Always configure authentication, TLS, and role-based access in production.

Authentication

# Reset the elastic superuser password
bin/elasticsearch-reset-password -u elastic

# Create a user with a specific role
POST /_security/user/app_reader
{
  "password": "secure_password_here",
  "roles": ["reader_role"],
  "full_name": "Application Reader"
}

# Connect with credentials
curl -u elastic:your_password https://localhost:9200

TLS/SSL and Role-Based Access

# Generate certificates (built-in tool)
bin/elasticsearch-certutil ca
bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

# elasticsearch.yml — enable transport and HTTP TLS
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: http.p12

# Create a read-only role
POST /_security/role/reader_role
{
  "indices": [{
    "names": ["products*", "logs-*"],
    "privileges": ["read", "view_index_metadata"]
  }]
}

# Create a writer role
POST /_security/role/writer_role
{
  "indices": [{
    "names": ["products"],
    "privileges": ["write", "create_index", "read"]
  }]
}

13. Common Pitfalls and Troubleshooting

# Check cluster health
GET /_cluster/health

# See unassigned shards
GET /_cat/shards?v&s=state&h=index,shard,state,unassigned.reason

# Check node resource usage
GET /_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,disk.used_percent

# Find slow queries in the slow log
PUT /products/_settings
{
  "index.search.slowlog.threshold.query.warn": "5s",
  "index.search.slowlog.threshold.query.info": "2s"
}

14. Elasticsearch vs Alternatives

When to choose Elasticsearch: you need full-text search at scale, complex aggregations, log analytics, the ELK ecosystem, or geospatial queries. When to choose an alternative: you need a simple search box for a small dataset (Meilisearch or Typesense) or prefer the Apache 2.0-licensed fork of Elasticsearch (OpenSearch).

15. Frequently Asked Questions

What is the difference between an Elasticsearch index and a database table?

An Elasticsearch index is roughly analogous to a database table, but stores JSON documents instead of rows, uses mappings instead of a fixed schema, and automatically indexes every field for full-text search. Indexes are distributed across shards for horizontal scaling, and documents do not need identical fields, giving you schema flexibility.

How many shards should I configure for an Elasticsearch index?

A good starting point is one shard per 10-50 GB of data. Each shard consumes memory and file descriptors, so too many small shards waste resources. For indexes under 10 GB, a single shard is usually sufficient. Keep total shards under 20 per GB of heap memory across the cluster.
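That rule of thumb can be sketched as a quick calculation (illustrative only; `suggest_shard_count` and the 30 GB midpoint are our assumptions, not an official formula):

```python
import math

# Back-of-the-envelope shard count: target roughly 10-50 GB per shard.
# Treat the result as a starting point, not a prescription.
def suggest_shard_count(index_size_gb, target_gb_per_shard=30):
    if index_size_gb <= 10:
        return 1  # small indexes rarely benefit from more than one shard
    return math.ceil(index_size_gb / target_gb_per_shard)

print(suggest_shard_count(8))    # 1
print(suggest_shard_count(120))  # 4
print(suggest_shard_count(500))  # 17
```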

When should I use Elasticsearch instead of a relational database?

Use Elasticsearch when you need full-text search with relevance scoring, fuzzy matching, or autocomplete. It excels at log analytics, searching millions of documents, real-time aggregations, and geospatial queries. Do not use it as a primary database for transactional data requiring ACID guarantees or complex joins. The most common pattern is running Elasticsearch alongside a relational database.

How does Elasticsearch handle full-text search differently from SQL LIKE?

SQL LIKE performs pattern matching on raw text and cannot use indexes efficiently. Elasticsearch uses inverted indexes: text is tokenized, lowercased, and stemmed during ingestion, so a search for "running" matches "run", "runs", and "running" automatically. Results are scored by relevance using BM25, making full-text search orders of magnitude faster and more useful than LIKE queries.
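For intuition, here is a simplified single-term BM25 in Python (a textbook sketch; Lucene's production implementation differs in minor details, but the shape is the same):

```python
import math

# Single-term BM25: k1 and b are the common defaults. Rare terms (low
# doc_freq) and dense matches (high tf relative to doc length) score higher.
def bm25(tf, doc_len, avg_len, n_docs, doc_freq, k1=1.2, b=0.75):
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    norm = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return idf * norm

# A rare term (doc_freq=5) outscores a common one (doc_freq=500)
# for the same term frequency:
rare = bm25(tf=2, doc_len=100, avg_len=100, n_docs=1000, doc_freq=5)
common = bm25(tf=2, doc_len=100, avg_len=100, n_docs=1000, doc_freq=500)
print(rare > common)  # True
```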

What is the ELK Stack and how do the components work together?

The ELK Stack consists of Elasticsearch (search and storage), Logstash (data ingestion and transformation), and Kibana (visualization and dashboards). Beats are lightweight data shippers that send logs, metrics, and network data to the stack. Together, they form a complete observability and search platform used by thousands of organizations for log management and analytics.

Conclusion

Elasticsearch is one of the most widely deployed search and analytics engines available. Start with a single-node Docker setup, define explicit mappings for your indexes, and use bool queries with filter context for fast, relevant search. As your data grows, leverage index lifecycle management to automate data retention, bulk operations for efficient indexing, and the Kibana ecosystem for visualization.

For production, always enable security (TLS + authentication), right-size your shards (10-50 GB each), set JVM heap to half your RAM (max 31 GB), and use aliases with reindexing for zero-downtime schema changes.


Related Resources

JSON Formatter
Format and validate Elasticsearch JSON queries and responses
Docker Compose Guide
Deploy Elasticsearch and Kibana in multi-container stacks
PostgreSQL Complete Guide
Relational database as your source of truth alongside Elasticsearch
Redis Complete Guide
Caching layer to complement Elasticsearch for fast data access