Elasticsearch: The Complete Guide for 2026

Published February 12, 2026 · 28 min read

Elasticsearch is a distributed search and analytics engine built on Apache Lucene. It stores JSON documents, indexes every field automatically, and returns search results in milliseconds — even across billions of documents. From powering site search at Wikipedia and GitHub to processing petabytes of logs at Netflix and Uber, Elasticsearch is the industry standard for full-text search, log analytics, and real-time aggregations.

This guide covers Elasticsearch from core concepts through production deployment: indexes, mappings, every query type you will use, aggregations, analyzers, performance tuning, Kibana integration, security, and how it compares to alternatives.

What is Elasticsearch
Core Concepts
Setting Up Elasticsearch
CRUD Operations
Search Queries
Aggregations
Mappings and Analyzers
Full-Text Search Best Practices
Index Templates and Lifecycle
Performance Tuning
Elasticsearch with Kibana
Security
Common Pitfalls and Troubleshooting
Elasticsearch vs Alternatives
Frequently Asked Questions

1. What is Elasticsearch

Elasticsearch is an open-source, distributed, RESTful search engine. You interact with it entirely through HTTP endpoints that accept and return JSON. Unlike traditional databases optimized for transactional writes, Elasticsearch is optimized for search: it builds inverted indexes on your data so full-text queries, filtering, and aggregations run in milliseconds.
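
To make the inverted-index idea concrete, here is a minimal Python sketch. It is a toy model, not how Lucene stores postings: the tokenize, build_inverted_index, and search helpers are hypothetical names, and the whitespace tokenizer is only a stand-in for a real analyzer.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return IDs of documents that contain every query term."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()

docs = {
    1: "Wireless Keyboard",
    2: "USB-C Hub",
    3: "Wireless gaming mouse",
}
index = build_inverted_index(docs)
print(sorted(search(index, "wireless")))           # [1, 3]
print(sorted(search(index, "wireless keyboard")))  # [1]
```

A lookup touches only the postings for the query terms instead of scanning every document, which is why this structure suits search workloads.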

When to use Elasticsearch:

Full-text search — product search, site search, document search with relevance ranking, autocomplete, and fuzzy matching
Log and event analytics — centralized logging with the ELK Stack (Elasticsearch, Logstash, Kibana) for real-time monitoring
Real-time aggregations — dashboards showing counts, averages, histograms, and trends across millions of records
Geospatial queries — find nearby locations, calculate distances, filter by bounding box
Application performance monitoring — store and query traces, metrics, and spans

2. Core Concepts

Index — a collection of related documents, similar to a database table. An index has a mapping that defines field types and analyzers.
Document — a JSON object stored in an index. Each document has a unique _id and is the basic unit of data.
Mapping — defines how fields are stored and indexed. Specifies field types (text, keyword, integer, date), analyzers, and whether fields are searchable.
Shard — an index is split into shards distributed across nodes. Each shard is a self-contained Lucene index. Sharding enables horizontal scaling.
Replica — a copy of a primary shard on a different node. Replicas provide fault tolerance and increase read throughput.
Node — a single Elasticsearch server instance. Nodes join a cluster and can hold data, coordinate queries, or serve as the master.
Cluster — one or more nodes working together, sharing data and load. A cluster has one elected master node at a time, which manages cluster state such as index metadata and shard allocation.
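
The relationship between documents and shards can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: Elasticsearch hashes the routing value (the document _id by default) with Murmur3 and applies an extra routing factor, while this sketch substitutes zlib.crc32 and a plain modulo. The pick_shard name is hypothetical.

```python
import zlib

def pick_shard(routing_value, number_of_primary_shards):
    """Choose a primary shard for a document from its routing value.

    crc32 is a stand-in hash used only for illustration; it is not the
    hash function Elasticsearch uses.
    """
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % number_of_primary_shards

# The same ID always lands on the same shard for a fixed shard count.
for doc_id in ["1", "2", "3", "order-42"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

Because the shard is derived from the shard count, changing the number of primary shards would send existing IDs to different shards. That is the reason the primary shard count is fixed at index creation and changing it means reindexing (or using the split and shrink APIs).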

3. Setting Up Elasticsearch

Docker (Recommended for Development)

# Single node for development
docker run -d --name elasticsearch \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -v es-data:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch:8.17.0

# Verify it is running
curl http://localhost:9200

Install on Ubuntu/Debian

# Import the Elastic GPG key and add the repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | \
  sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] \
  https://artifacts.elastic.co/packages/8.x/apt stable main" | \
  sudo tee /etc/apt/sources.list.d/elastic-8.x.list

sudo apt update && sudo apt install elasticsearch
sudo systemctl enable elasticsearch && sudo systemctl start elasticsearch
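# Security is enabled by default for package installs, so authenticate as the elastic user
# (-k skips certificate verification; use it for local testing only)
curl -k -u elastic https://localhost:9200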

4. CRUD Operations

Elasticsearch uses a RESTful API. All operations are HTTP requests with JSON bodies.

Create an Index

# Create an index with settings
PUT /products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

Index (Create/Update) a Document

# Index a document with an explicit ID
PUT /products/_doc/1
{
  "name": "Wireless Keyboard",
  "category": "electronics",
  "price": 49.99,
  "in_stock": true,
  "created_at": "2026-02-12"
}

# Auto-generate an ID
POST /products/_doc
{
  "name": "USB-C Hub",
  "category": "electronics",
  "price": 29.99,
  "in_stock": true
}

Read a Document

# Get a document by ID
GET /products/_doc/1

# Get only specific fields
GET /products/_doc/1?_source_includes=name,price

Update a Document

# Partial update (merges fields)
POST /products/_update/1
{
  "doc": { "price": 44.99, "in_stock": false }
}

# Update with script
POST /products/_update/1
{
  "script": {
    "source": "ctx._source.price -= params.discount",
    "params": { "discount": 5 }
  }
}

Delete a Document

# Delete by ID
DELETE /products/_doc/1

# Delete by query
POST /products/_delete_by_query
{
  "query": {
    "term": { "in_stock": false }
  }
}

5. Search Queries

Elasticsearch queries fall into two categories: full-text queries that analyze the search term and score results by relevance, and term-level queries that look for exact values without analysis.

Match Query (Full-Text)

# Search for documents containing "wireless keyboard"
GET /products/_search
{
  "query": {
    "match": {
      "name": "wireless keyboard"
    }
  }
}
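# Matches "Wireless Keyboard", "keyboard wireless", "wireless gaming keyboard"
# The default operator is OR, so documents containing only "wireless" or only "keyboard" also match, usually with lower scores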

Multi-Match Query

# Search across multiple fields
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless keyboard",
      "fields": ["name^3", "description", "category"],
      "type": "best_fields"
    }
  }
}
# ^3 boosts matches in "name" by 3x

Bool Query (Combine Conditions)

# Combine multiple conditions
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "keyboard" } }
      ],
      "filter": [
        { "term": { "category": "electronics" } },
        { "range": { "price": { "gte": 20, "lte": 100 } } }
      ],
      "should": [
        { "term": { "in_stock": true } }
      ],
      "must_not": [
        { "term": { "category": "refurbished" } }
      ]
    }
  }
}
# must: required, contributes to score
# filter: required, does NOT contribute to score (faster, cached)
# should: optional, boosts score if matched
# must_not: excludes documents
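
The four clause types can be modeled in plain Python to show how they interact. This is a sketch of the semantics only: each clause is a predicate over a dict, and the score simply adds 1.0 per matching must or should clause as a stand-in for real relevance scoring. The bool_query function and the sample products are hypothetical.

```python
def bool_query(docs, must=(), filters=(), should=(), must_not=()):
    """Return (score, doc) pairs, highest score first.

    must and filters are required; only must and should affect the score;
    must_not removes a document regardless of the other clauses.
    """
    hits = []
    for doc in docs:
        required = all(c(doc) for c in must) and all(c(doc) for c in filters)
        excluded = any(c(doc) for c in must_not)
        if not required or excluded:
            continue
        # Toy scoring: 1.0 per must clause plus 1.0 per matching should clause.
        score = float(len(must)) + sum(1.0 for c in should if c(doc))
        hits.append((score, doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits

products = [
    {"name": "wireless keyboard", "category": "electronics", "price": 49.99, "in_stock": True},
    {"name": "mechanical keyboard", "category": "electronics", "price": 89.00, "in_stock": False},
    {"name": "keyboard tray", "category": "furniture", "price": 35.00, "in_stock": True},
    {"name": "luxury keyboard", "category": "electronics", "price": 250.00, "in_stock": True},
]

hits = bool_query(
    products,
    must=[lambda d: "keyboard" in d["name"].split()],
    filters=[lambda d: d["category"] == "electronics",
             lambda d: 20 <= d["price"] <= 100],
    should=[lambda d: d["in_stock"]],
    must_not=[lambda d: d["category"] == "refurbished"],
)
for score, doc in hits:
    print(score, doc["name"])
```

One difference from this sketch: in Elasticsearch, when a bool query has no must or filter clause, at least one should clause has to match by default.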

Term Query (Exact Match)

# Exact match on keyword fields (no analysis)
GET /products/_search
{
  "query": {
    "term": { "category": "electronics" }
  }
}

# Multiple exact values
GET /products/_search
{
  "query": {
    "terms": { "category": ["electronics", "accessories"] }
  }
}

Range Query

# Numeric range
GET /products/_search
{
  "query": {
    "range": {
      "price": { "gte": 10, "lte": 50 }
    }
  }
}

# Date range
GET /logs/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "2026-02-01",
        "lte": "2026-02-12",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

Wildcard and Prefix Queries

# Wildcard: * matches any characters, ? matches one
GET /products/_search
{
  "query": {
    "wildcard": { "name": "key*" }
  }
}

# Prefix: faster than wildcard for starts-with
GET /products/_search
{
  "query": {
    "prefix": { "name.keyword": "Wire" }
  }
}

Pagination and Sorting

# Paginate results
GET /products/_search
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 20,
  "sort": [
    { "price": "asc" },
    { "_score": "desc" }
  ],
  "_source": ["name", "price", "category"]
}
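
The from and size values are usually computed from a page number. The Python sketch below builds the request body and rejects pages that would exceed the default index.max_result_window of 10,000 hits (from + size). The page_request helper is a hypothetical name, and the sort simply mirrors the example above.

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window

def page_request(page, size=20):
    """Build a search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError(
            "from + size exceeds the result window; use search_after for deep paging"
        )
    return {
        "query": {"match_all": {}},
        "from": offset,
        "size": size,
        "sort": [{"price": "asc"}, {"_score": "desc"}],
    }

print(page_request(3)["from"])  # 40
```

For deep pagination, search_after with a stable sort avoids the cost of collecting and discarding every earlier hit on each shard.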

6. Aggregations

Aggregations compute analytics over your data — counts, averages, histograms, and nested breakdowns.

Terms Aggregation (Group By)

# Count products per category
GET /products/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category", "size": 20 }
    }
  }
}
# Returns buckets such as { "key": "electronics", "doc_count": 150 }, { "key": "accessories", "doc_count": 89 }, ...

Metric Aggregations

# Average, min, max, sum
GET /products/_search
{
  "size": 0,
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "max_price": { "max": { "field": "price" } },
    "total_revenue": { "sum": { "field": "price" } },
    "price_stats": { "stats": { "field": "price" } }
  }
}

Date Histogram

# Orders per day over the last month
GET /orders/_search
{
  "size": 0,
  "query": {
    "range": { "created_at": { "gte": "now-30d" } }
  },
  "aggs": {
    "orders_per_day": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "day"
      },
      "aggs": {
        "daily_revenue": { "sum": { "field": "total" } }
      }
    }
  }
}

Nested Aggregations

# Average price per category (a metric aggregation nested inside a terms aggregation)
GET /products/_search
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": { "field": "category", "size": 10 },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
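
For readers who think in terms of GROUP BY, this Python sketch computes the same result as the nested aggregation above over an in-memory list. It only imitates the shape of the response buckets; the avg_price_by_category function and the sample data are hypothetical.

```python
from collections import defaultdict

def avg_price_by_category(products):
    """Group by category, then compute doc_count and average price per bucket."""
    groups = defaultdict(list)
    for product in products:
        groups[product["category"]].append(product["price"])
    buckets = [
        {"key": key, "doc_count": len(prices),
         "avg_price": {"value": sum(prices) / len(prices)}}
        for key, prices in groups.items()
    ]
    # Largest buckets first, ties broken by key, as a terms aggregation does by default.
    buckets.sort(key=lambda bucket: (-bucket["doc_count"], bucket["key"]))
    return buckets

sample = [
    {"category": "electronics", "price": 50.0},
    {"category": "accessories", "price": 10.0},
    {"category": "electronics", "price": 30.0},
]
for bucket in avg_price_by_category(sample):
    print(bucket)
```

Elasticsearch runs the equivalent computation per shard and merges the partial results, which is why terms aggregation counts can be approximate on high-cardinality fields.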

7. Mappings and Analyzers

Mappings define field types. Getting mappings right is critical — you cannot change a field's type after the index is created without reindexing.

Common Field Types

PUT /products
{
  "mappings": {
    "properties": {
      "name":        { "type": "text", "analyzer": "standard" },
      "category":    { "type": "keyword" },
      "description": { "type": "text" },
      "price":       { "type": "float" },
      "in_stock":    { "type": "boolean" },
      "created_at":  { "type": "date", "format": "yyyy-MM-dd" },
      "tags":        { "type": "keyword" },
      "location":    { "type": "geo_point" }
    }
  }
}
# text: analyzed for full-text search (the standard analyzer tokenizes and lowercases; stemming needs a language analyzer such as "english")
# keyword: exact values for filtering, sorting, aggregations
# Use both when you need search AND filtering on the same field

Multi-Field Mapping

# Map a field as both text and keyword
"name": {
  "type": "text",
  "fields": {
    "keyword": { "type": "keyword", "ignore_above": 256 }
  }
}
# Search: match on "name" (analyzed)
# Sort/aggregate: use "name.keyword" (exact)

Custom Analyzers

PUT /articles
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_english": {
          "type": "custom", "tokenizer": "standard",
          "filter": ["lowercase", "english_stop", "english_stemmer"]
        }
      },
      "filter": {
        "english_stop":    { "type": "stop", "stopwords": "_english_" },
        "english_stemmer": { "type": "stemmer", "language": "english" }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "custom_english" },
      "body":  { "type": "text", "analyzer": "custom_english" }
    }
  }
}

# Test your analyzer
POST /articles/_analyze
{ "analyzer": "custom_english", "text": "The quick brown foxes are jumping" }
# Tokens: ["quick", "brown", "fox", "jump"]
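
The analyzer above is a pipeline: tokenizer, then token filters in order. The Python sketch below mimics that chain so the stages are easy to see. It is only an approximation: the stop list is a small subset, and toy_stem is a naive suffix stripper standing in for the real English stemmer. All function names are hypothetical.

```python
import re

# A small subset of English stop words, for illustration only.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to"}

def toy_stem(token):
    """Strip a few common suffixes; far cruder than a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text):
    """Tokenize, lowercase, drop stop words, then stem."""
    tokens = re.findall(r"\w+", text)                     # tokenizer
    tokens = [t.lower() for t in tokens]                  # lowercase filter
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop filter
    return [toy_stem(t) for t in tokens]                  # stemmer filter

print(analyze("The quick brown foxes are jumping"))
# ['quick', 'brown', 'fox', 'jump']
```

Filter order matters: stop words are removed after lowercasing so that "The" is caught, and stemming runs last so it never alters a word before the stop list sees it.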

8. Full-Text Search Best Practices

Use text for searchable fields, keyword for exact match — searching on keyword fields requires exact case-sensitive matches, which is rarely what users expect.
Boost important fields — use "fields": ["title^3", "body"] in multi_match to weight title matches higher.
Use filter context for non-scoring conditions — filter clauses skip scoring and frequently used filters are cached, which usually makes them faster than equivalent must clauses.
Choose the right analyzer — the standard analyzer works for most cases. Use language-specific analyzers for stemming (e.g., "english" turns "running" into "run").
Add synonyms — use a synonym filter so "laptop", "notebook", and "portable computer" all match.
Use match_phrase for exact phrase search — "quick brown fox" matches only when those words appear together in that order.
Implement autocomplete with edge_ngram — tokenize "elasticsearch" into "e", "el", "ela", ... for prefix-based suggestions.
Set index: false on fields you never search — saves disk space and speeds up indexing for fields used only for display.
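
The edge_ngram idea from the list above can be illustrated with a short Python function that emits the prefixes stored in the index. This is a sketch of the tokenization only; the edge_ngrams name and the max_gram default of 10 are assumptions chosen for the example, not Elasticsearch defaults.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the prefixes of a token from min_gram to max_gram characters."""
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("need 1 <= min_gram <= max_gram")
    token = token.lower()
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]

print(edge_ngrams("elasticsearch", max_gram=5))
# ['e', 'el', 'ela', 'elas', 'elast']
```

At query time the user's partial input is normally analyzed without edge n-grams (a separate search_analyzer), so typing "elas" is matched as one term against the stored prefixes rather than being split into prefixes itself.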

9. Index Templates and Lifecycle Management

Index Templates

# Create a template for all log indexes
PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs_policy"
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message":    { "type": "text" },
        "level":      { "type": "keyword" },
        "service":    { "type": "keyword" }
      }
    }
  },
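  "priority": 500
}
# Any new index matching "logs-*" inherits these settings
# Priority 500 keeps this template ahead of the built-in "logs-*-*" template (priority 100);
# an overlapping pattern with the same priority is rejected
# The rollover action in the ILM policy also needs a data stream or index.lifecycle.rollover_alias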

Index Lifecycle Management (ILM)

# Define a lifecycle policy: hot -> warm -> cold -> delete
# min_age is measured from rollover; the cold phase here only lowers recovery priority
PUT /_ilm/policy/logs_policy
{
  "policy": {
    "phases": {
      "hot":  { "actions": { "rollover": { "max_size": "50gb", "max_age": "7d" } } },
      "warm": { "min_age": "7d", "actions": { "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 } } },
      "cold": { "min_age": "30d", "actions": { "set_priority": { "priority": 0 } } },
      "delete": { "min_age": "90d", "actions": { "delete": {} } }
    }
  }
}
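
To see how the min_age thresholds in the policy above partition an index's life, here is a small Python sketch that maps an age in days to a phase. It is not how ILM is implemented: ILM evaluates policies periodically (indices.lifecycle.poll_interval, 10 minutes by default), so real transitions lag slightly. The phase_for_age helper is hypothetical.

```python
# (phase, min_age in days), checked from the latest phase backwards.
PHASE_MIN_AGE_DAYS = [("delete", 90), ("cold", 30), ("warm", 7), ("hot", 0)]

def phase_for_age(days_since_rollover):
    """Return the phase an index would be in, given its age since rollover."""
    if days_since_rollover < 0:
        raise ValueError("age cannot be negative")
    for phase, min_age in PHASE_MIN_AGE_DAYS:
        if days_since_rollover >= min_age:
            return phase
    return "hot"

for age in (0, 7, 45, 90):
    print(age, "days ->", phase_for_age(age))
```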

Reindexing

# Reindex data from one index to another (useful for mapping changes)
POST /_reindex
{
  "source": { "index": "products_v1" },
  "dest":   { "index": "products_v2" }
}

# Use aliases for zero-downtime reindexing
POST /_aliases
{
  "actions": [
    { "remove": { "index": "products_v1", "alias": "products" } },
    { "add":    { "index": "products_v2", "alias": "products" } }
  ]
}

10. Performance Tuning

Bulk Operations

# Bulk API: index, update, or delete many documents in one request
POST /_bulk
{"index": {"_index": "products", "_id": "1"}}
{"name": "Keyboard", "price": 49.99, "category": "electronics"}
{"index": {"_index": "products", "_id": "2"}}
{"name": "Mouse", "price": 29.99, "category": "electronics"}
{"delete": {"_index": "products", "_id": "3"}}

# Always use bulk for batch operations
# Optimal batch size: 5-15 MB per request, or 1000-5000 documents
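
Client code usually has to assemble the bulk body itself. The Python sketch below builds the newline-delimited JSON for index actions using only the standard library. The bulk_body name is hypothetical; sending the request (with Content-Type: application/x-ndjson) is left to whichever HTTP client you use.

```python
import json

def bulk_body(index_name, docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs.

    Each document becomes two lines: an action line and a source line.
    The body must end with a newline or Elasticsearch rejects it.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

docs = [
    ("1", {"name": "Keyboard", "price": 49.99, "category": "electronics"}),
    ("2", {"name": "Mouse", "price": 29.99, "category": "electronics"}),
]
print(bulk_body("products", docs), end="")
```

json.dumps never emits raw newlines inside a document, which keeps every action and source on its own line as the bulk format requires.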

Refresh Interval

# Documents are not searchable until a refresh (default: 1 second)
# For heavy indexing, increase the interval
PUT /products/_settings
{
  "index.refresh_interval": "30s"
}

# Disable refresh during bulk loading, re-enable after
PUT /products/_settings
{ "index.refresh_interval": "-1" }
# ... bulk index millions of documents ...
PUT /products/_settings
{ "index.refresh_interval": "1s" }
POST /products/_refresh

JVM Heap and System Settings

# Set heap to 50% of available RAM, max 31 GB
# In jvm.options or ES_JAVA_OPTS:
-Xms16g
-Xmx16g
# Always set Xms and Xmx to the same value (avoid resizing)

# System settings for production (in /etc/sysctl.conf):
vm.max_map_count=262144
vm.swappiness=1

# File descriptor limit (in /etc/security/limits.conf):
elasticsearch  -  nofile  65535
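
The heap rule above (half of RAM, capped at 31 GB, Xms equal to Xmx) is simple arithmetic, sketched here in Python. The heap_jvm_options helper is a hypothetical name, and the result is a starting point to validate against your own workload rather than a guarantee.

```python
MAX_HEAP_MB = 31 * 1024  # stay below ~32 GB so the JVM keeps compressed object pointers

def heap_jvm_options(total_ram_mb):
    """Return matching -Xms/-Xmx flags: half of RAM, capped at 31 GB."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    heap_mb = min(total_ram_mb // 2, MAX_HEAP_MB)
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]

print(heap_jvm_options(32 * 1024))   # 32 GB host  -> 16 GB heap
print(heap_jvm_options(128 * 1024))  # 128 GB host -> capped at 31 GB
```

The memory left over is not wasted: the operating system uses it for the file cache that Lucene relies on for fast segment reads.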

Performance Checklist

Use bulk API for batch indexing — individual document writes are 10-100x slower
Use filter context in bool queries — filters are cached and skip scoring
Avoid wildcard queries with leading wildcards — *board scans every term in the index
Use keyword type for sorting and aggregations — sorting on text fields requires fielddata (very memory-intensive)
Right-size shards — aim for 10-50 GB per shard, avoid thousands of tiny shards
Use SSD storage — Elasticsearch is I/O intensive; SSDs improve performance dramatically
Force merge read-only indexes — merging segments improves query speed on indexes that are no longer written to

11. Elasticsearch with Kibana

Kibana is the official visualization platform for Elasticsearch. It provides a web UI for querying, building dashboards, and managing your cluster.

Discover — explore and filter your data interactively. Search logs, inspect documents, and see field distributions.
Dashboard — combine visualizations into interactive dashboards. Share with your team for monitoring and analysis.
Dev Tools Console — write and test Elasticsearch queries directly in the browser with autocomplete and formatting.
Index Management — view index health, manage ILM policies, and configure index templates.
Alerting — set up rules to notify you when query conditions are met (e.g., error rate spikes above a threshold).

# Create a data view (formerly index pattern) in Kibana:
# 1. Go to Stack Management > Data Views
# 2. Create data view: "logs-*"
# 3. Set @timestamp as the time field
# 4. Go to Discover to explore your data

# Kibana also supports Canvas (pixel-perfect reports),
# Lens (drag-and-drop visualizations), and Maps (geospatial data)

12. Security

Elasticsearch 8.x enables security by default. Always configure authentication, TLS, and role-based access in production.

Authentication

# Reset the elastic superuser password
bin/elasticsearch-reset-password -u elastic

# Create a user with a specific role
POST /_security/user/app_reader
{
  "password": "secure_password_here",
  "roles": ["reader_role"],
  "full_name": "Application Reader"
}

# Connect with credentials
curl -u elastic:your_password https://localhost:9200

TLS/SSL and Role-Based Access

# Generate certificates (built-in tool)
bin/elasticsearch-certutil ca
bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
# Generate the HTTP-layer keystore (http.p12) interactively
bin/elasticsearch-certutil http

# elasticsearch.yml — enable transport and HTTP TLS
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: http.p12

# Create a read-only role
POST /_security/role/reader_role
{
  "indices": [{
    "names": ["products*", "logs-*"],
    "privileges": ["read", "view_index_metadata"]
  }]
}

# Create a writer role
POST /_security/role/writer_role
{
  "indices": [{
    "names": ["products"],
    "privileges": ["write", "create_index", "read"]
  }]
}

13. Common Pitfalls and Troubleshooting

Mapping explosion — dynamic mapping creates a field for every new JSON key. Set "dynamic": "strict" to reject unexpected fields, or "dynamic": "false" to ignore them.
Yellow cluster status — means replicas cannot be allocated. On a single-node cluster, set number_of_replicas: 0. On multi-node, ensure you have enough nodes for replica placement.
Max shards per node exceeded — the default limit is 1000 shards per node. Delete old indexes, use ILM to manage lifecycle, and right-size your shard count.
Slow queries on text fields — avoid sorting or aggregating on text fields. Use keyword sub-fields instead. Enable fielddata only as a last resort.
Out of memory (OOM) — Elasticsearch heap should be max 50% of RAM, never more than 31 GB. Leave the rest for the OS file cache, which Lucene relies on heavily.
Circuit breaker exceptions — queries requiring too much memory are rejected. Reduce aggregation cardinality, add filters, or increase the circuit breaker limit carefully.

# Check cluster health
GET /_cluster/health

# See unassigned shards
GET /_cat/shards?v&s=state&h=index,shard,state,unassigned.reason

# Check node resource usage
GET /_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,disk.used_percent

# Enable the search slow log: queries slower than these thresholds are written to the slow log
PUT /products/_settings
{
  "index.search.slowlog.threshold.query.warn": "5s",
  "index.search.slowlog.threshold.query.info": "2s"
}

14. Elasticsearch vs Alternatives

Apache Solr — also built on Lucene, similar full-text search capabilities. Solr has excellent XML/schema support and is mature. Elasticsearch wins on ease of setup, REST API design, real-time analytics, and the Kibana ecosystem. Choose Solr if you already have Solr expertise or need advanced XML handling.
Meilisearch — a lightweight, fast search engine optimized for front-end search. Instant results, typo tolerance, and faceting out of the box. Ideal for small-to-medium datasets (under 10M documents) where developer experience matters. Not suited for log analytics or complex aggregations.
Typesense — similar to Meilisearch: simple API, fast typo-tolerant search, easy to operate. Better hardware efficiency than Elasticsearch for simple search use cases. Lacks the aggregation depth and ecosystem of Elasticsearch.
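OpenSearch — an open-source fork of Elasticsearch 7.10, started by AWS and now governed by the OpenSearch Software Foundation under the Linux Foundation. Largely API-compatible with Elasticsearch 7.10, although the two projects have diverged since the fork. Choose OpenSearch if you want a permissive Apache 2.0 license or run on AWS.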

When to choose Elasticsearch: you need full-text search at scale, complex aggregations, log analytics, the ELK ecosystem, or geospatial queries. When to choose an alternative: you need a simple search box for a small dataset (Meilisearch/Typesense) or want a permissive Apache 2.0 license (OpenSearch).

Frequently Asked Questions

What is the difference between an Elasticsearch index and a database table?

An Elasticsearch index is roughly analogous to a database table, but it stores JSON documents instead of rows, uses mappings instead of a fixed schema, and indexes every field by default (text fields are analyzed for full-text search). Indexes are distributed across shards for horizontal scaling, and documents do not need identical fields, giving you schema flexibility.

How many shards should I configure for an Elasticsearch index?

A good starting point is one shard per 10-50 GB of data. Each shard consumes memory and file descriptors, so too many small shards waste resources. For indexes under 10 GB, a single shard is usually sufficient. A long-standing rule of thumb is to keep total shards under 20 per GB of heap memory; recent 8.x releases reduced per-shard overhead, so treat that figure as a conservative ceiling rather than a hard limit.
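
The sizing guidance above reduces to a one-line calculation, sketched here in Python. The 30 GB target is an assumption picked from the middle of the 10-50 GB range, and primary_shard_count is a hypothetical helper; measure real indexing and query load before settling on a number.

```python
import math

def primary_shard_count(expected_size_gb, target_shard_gb=30):
    """Estimate primary shards so each holds roughly target_shard_gb of data."""
    if expected_size_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(expected_size_gb / target_shard_gb))

print(primary_shard_count(8))    # 1
print(primary_shard_count(300))  # 10
```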

When should I use Elasticsearch instead of a relational database?

Use Elasticsearch when you need full-text search with relevance scoring, fuzzy matching, or autocomplete. It excels at log analytics, searching millions of documents, real-time aggregations, and geospatial queries. Do not use it as a primary database for transactional data requiring ACID guarantees or complex joins. The most common pattern is running Elasticsearch alongside a relational database.

How does Elasticsearch handle full-text search differently from SQL LIKE?

SQL LIKE performs pattern matching on raw text, and patterns with a leading wildcard (LIKE '%term%') cannot use a standard B-tree index, so they scan every row. Elasticsearch uses inverted indexes: text is tokenized, lowercased, and (when a language analyzer such as english is configured) stemmed at index time, so a search for "running" can match "run", "runs", and "running". Results are scored by relevance using BM25, which makes full-text search far faster on large datasets and more useful than LIKE queries.
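
For intuition about BM25, the Python sketch below scores a single term in a single document using the textbook formula with the Elasticsearch default parameters k1 = 1.2 and b = 0.75. It is an illustration under simplifying assumptions (one term, one field, no boosts), and bm25_term_score is a hypothetical name.

```python
import math

def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Score one term in one document with textbook BM25.

    freq:           occurrences of the term in the document
    doc_len:        number of terms in the document's field
    avg_doc_len:    average field length across the index
    doc_count:      total documents in the index
    docs_with_term: documents containing the term
    """
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * freq * (k1 + 1) / (freq + length_norm)

rare = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100,
                       doc_count=1000, docs_with_term=5)
common = bm25_term_score(freq=1, doc_len=100, avg_doc_len=100,
                         doc_count=1000, docs_with_term=500)
print(rare > common)  # True: rarer terms contribute more
```

Three effects fall out of the formula: rare terms outweigh common ones, repeated occurrences help with diminishing returns, and a match in a short field counts for more than the same match in a long one.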

What is the ELK Stack and how do the components work together?

The ELK Stack consists of Elasticsearch (search and storage), Logstash (data ingestion and transformation), and Kibana (visualization and dashboards). Beats are lightweight data shippers that send logs, metrics, and network data to the stack. Together, they form a complete observability and search platform used by thousands of organizations for log management and analytics.

Conclusion

Elasticsearch is one of the most capable and widely deployed search and analytics engines available today. Start with a single-node Docker setup, define explicit mappings for your indexes, and use bool queries with filter context for fast, relevant search. As your data grows, leverage index lifecycle management to automate data retention, bulk operations for efficient indexing, and the Kibana ecosystem for visualization.

For production, always enable security (TLS + authentication), right-size your shards (10-50 GB each), set JVM heap to half your RAM (max 31 GB), and use aliases with reindexing for zero-downtime schema changes.

Learn More

JSON Formatter — format and validate Elasticsearch JSON queries and responses
Docker Compose: The Complete Guide — deploy Elasticsearch and Kibana with a single YAML file
PostgreSQL Complete Guide — relational database to pair with Elasticsearch as your source of truth
Redis Complete Guide — caching layer to complement Elasticsearch for frequently accessed data