Dockerfile: The Complete Guide for 2026
A Dockerfile is a plain text file containing instructions that Docker reads to build a container image. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer to the image; the others record metadata. The result is a reproducible, portable artifact that runs the same way on your laptop, in CI, and in production. Writing efficient Dockerfiles is one of the most useful Docker skills you can develop.
This guide covers every Dockerfile instruction, explains the tradeoffs behind common decisions (CMD vs ENTRYPOINT, COPY vs ADD), and shows you how to write production-quality Dockerfiles for Python, Node.js, and Go applications.
Table of Contents
- Every Dockerfile Instruction Explained
- CMD vs ENTRYPOINT
- COPY vs ADD
- Layer Caching and Build Optimization
- The .dockerignore File
- Multi-Stage Builds
- Best Practices
- Complete Examples: Python, Node.js, Go
- Security: Scanning, Minimal Images, Non-Root
- Debugging Dockerfile Builds
- Frequently Asked Questions
1. Every Dockerfile Instruction Explained
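The Dockerfile format defines 18 instructions. This section covers the 16 you are likely to write yourself, each with a practical explanation; it leaves out ONBUILD and the deprecated MAINTAINER (superseded by LABEL).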
FROM — Set the Base Image
Every Dockerfile starts with FROM. It sets the base image that all subsequent instructions build upon. You can use official images from Docker Hub or your own custom images.
# Comments must sit on their own line; a trailing # is parsed as an argument
# Debian-based, smaller than the full image
FROM python:3.12-slim
# Alpine-based, minimal footprint
FROM node:22-alpine
# Named stage for multi-stage builds
FROM golang:1.22-alpine AS builder
# Empty image, for static binaries
FROM scratch
RUN — Execute Commands During Build
RUN executes a command inside the image during the build process. Each RUN creates a new layer.
# Install system packages
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Alpine equivalent
RUN apk add --no-cache curl ca-certificates
COPY — Copy Files from Build Context
COPY takes files from your local build context and places them into the image. It is the recommended way to get files into your image.
COPY package.json package-lock.json ./
COPY src/ ./src/
# Set ownership while copying
COPY --chown=appuser:appgroup . /app/
ADD — Copy with Extra Features
ADD does everything COPY does, plus it can extract tar archives and download URLs. Use COPY unless you specifically need tar extraction.
# Extracts the local archive automatically
ADD app.tar.gz /app/
# Downloads from a URL (not cached well)
ADD https://example.com/file /app/file
WORKDIR — Set the Working Directory
WORKDIR sets the working directory for all subsequent RUN, CMD, ENTRYPOINT, COPY, and ADD instructions. It creates the directory if it does not exist.
# All subsequent commands run from /app
WORKDIR /app
# Copies into /app
COPY . .
# Runs in /app
RUN npm install
CMD — Default Container Command
CMD specifies the default command when a container starts. It can be overridden at runtime.
# Exec form (preferred) - runs process directly
CMD ["python", "app.py"]
# Shell form - wraps in /bin/sh -c
CMD python app.py
ENTRYPOINT — Container Executable
ENTRYPOINT sets the container's main executable. Unlike CMD, it is not replaced by arguments passed to docker run; those arguments are appended to it. Replacing it requires the explicit --entrypoint flag.
ENTRYPOINT ["python", "app.py"]
# docker run myapp --verbose => python app.py --verbose
ENV — Set Environment Variables
ENV sets environment variables that persist in the built image and in running containers.
ENV NODE_ENV=production
ENV APP_PORT=3000
ENV PATH="/app/bin:$PATH"
ARG — Build-Time Variables
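ARG defines variables available only during the build process. They are not set as environment variables in running containers, but their values can still show up in docker history, so do not pass secrets through ARG.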
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim
ARG BUILD_DATE
LABEL org.opencontainers.image.created=$BUILD_DATE
# Pass at build time: docker build --build-arg PYTHON_VERSION=3.11 .
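One scoping detail is easy to miss: an ARG declared before FROM is usable only in FROM lines. To read it inside a stage it has to be redeclared without a value, and to make a value visible at runtime it has to be copied into an ENV. A minimal sketch, where APP_VERSION is a hypothetical variable name chosen for illustration:

```dockerfile
# Declared before FROM: usable only in FROM lines
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim

# Redeclare (no value) to bring the pre-FROM default into this stage
ARG PYTHON_VERSION
RUN echo "Building on Python ${PYTHON_VERSION}"

# APP_VERSION is a hypothetical build argument used for illustration
ARG APP_VERSION=dev
# ARG values are not set in the running container; ENV values are
ENV APP_VERSION=${APP_VERSION}

# docker build --build-arg APP_VERSION=1.2.3 .
```

Copying an ARG into an ENV is a deliberate choice to persist it, so it should be reserved for non-sensitive values such as version strings.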
EXPOSE — Document Container Ports
EXPOSE documents which ports the container listens on. It does not publish the port — that requires -p at runtime or ports: in Compose.
EXPOSE 8080
EXPOSE 8080/tcp 9090/udp
VOLUME — Declare Mount Points
VOLUME creates a mount point and marks it as holding externally mounted volumes. Data written here is preserved outside the container's writable layer.
VOLUME /data
VOLUME ["/var/log", "/app/uploads"]
USER — Set the Runtime User
USER switches the user for all subsequent RUN, CMD, and ENTRYPOINT instructions. Critical for security.
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser
USER appuser
LABEL — Add Metadata
LABEL adds key-value metadata to the image. Useful for versioning, maintainer info, and tooling.
LABEL maintainer="team@example.com"
LABEL org.opencontainers.image.version="1.2.3"
LABEL org.opencontainers.image.description="My application"
HEALTHCHECK — Container Health Monitoring
HEALTHCHECK tells Docker how to verify the container is working correctly.
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=5s \
    CMD curl -f http://localhost:8080/health || exit 1
# Disable a health check inherited from the base image
HEALTHCHECK NONE
SHELL — Change the Default Shell
SHELL changes the default shell used for shell-form RUN, CMD, and ENTRYPOINT instructions.
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN curl -fsSL https://example.com/setup.sh | bash
STOPSIGNAL — Set the Stop Signal
STOPSIGNAL sets the signal that is sent to the container's main process when the container is stopped. Defaults to SIGTERM.
# Nginx graceful shutdown
STOPSIGNAL SIGQUIT
2. CMD vs ENTRYPOINT
This is the most commonly confused pair of Dockerfile instructions. The three cases below show how each behaves alone and how they combine.
# CMD alone: provides a default, easily overridden
FROM python:3.12-slim
CMD ["python", "app.py"]
# docker run myimage => python app.py
# docker run myimage python test.py => python test.py (CMD replaced)
# ENTRYPOINT alone: fixed executable, args appended
FROM python:3.12-slim
ENTRYPOINT ["python", "app.py"]
# docker run myimage => python app.py
# docker run myimage --verbose => python app.py --verbose
# ENTRYPOINT + CMD: fixed executable with default args
FROM python:3.12-slim
ENTRYPOINT ["python"]
CMD ["app.py"]
# docker run myimage => python app.py
# docker run myimage test.py => python test.py (CMD replaced)
Exec Form vs Shell Form
# Exec form (JSON array) - PREFERRED
# Process runs as PID 1, receives signals correctly
CMD ["node", "server.js"]
ENTRYPOINT ["python", "-u", "app.py"]
# Shell form (plain string) - AVOID in production
# Wraps the command in /bin/sh -c, so the shell is PID 1, not your process
# The shell does not forward signals, so the app may never receive SIGTERM
CMD node server.js
ENTRYPOINT python -u app.py
Rule of thumb: Use CMD for general-purpose images where users might want to run different commands. Use ENTRYPOINT for images that wrap a specific tool or application. Always use exec form in production.
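The rule of thumb leaves one practical question open: how to get a shell in an image whose ENTRYPOINT is fixed. The sketch below assumes an image tagged myimage, as in the examples above, and an app.py that accepts a hypothetical --port argument:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY app.py .

# Fixed executable (exec form) with overridable default arguments
ENTRYPOINT ["python", "-u", "app.py"]
CMD ["--port", "8000"]

# docker run myimage              => python -u app.py --port 8000
# docker run myimage --port 9000  => python -u app.py --port 9000
# The ENTRYPOINT itself is replaced only with an explicit flag:
# docker run --rm -it --entrypoint /bin/sh myimage
```

Keeping the defaults in CMD rather than in ENTRYPOINT means users can change arguments without having to know the executable path.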
3. COPY vs ADD
Docker's official best practices are clear: use COPY unless you specifically need ADD's tar extraction.
# COPY - transparent, predictable
# Copies the file, nothing more
COPY requirements.txt .
# Copies the directory
COPY src/ ./src/
# ADD - has hidden behavior
# Extracts the tar archive automatically (useful!)
ADD app.tar.gz /app/
# Downloads the URL (poorly cached, use curl instead)
ADD https://example.com/f /f
# Extracts into /etc/
ADD config.tar.gz /etc/
The problem with ADD is that readers of your Dockerfile cannot immediately tell whether ADD myfile /app/ is doing a simple copy or extracting an archive. COPY is always a simple copy, making the intent obvious.
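For remote files, one alternative to ADD with a URL is to download with curl in a RUN step, verify a checksum, and clean up in the same layer. The sketch below is illustrative only: the URL, the archive name, the /opt/tool path, and the TOOL_SHA256 build argument are placeholders, not real values.

```dockerfile
FROM debian:bookworm-slim

# Placeholder: pass the real digest with --build-arg TOOL_SHA256=<sha256>
ARG TOOL_SHA256

RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Download, verify, extract, and delete the archive in a single layer.
# https://example.com/tool.tar.gz is a hypothetical URL.
RUN curl -fsSL -o /tmp/tool.tar.gz https://example.com/tool.tar.gz \
    && echo "${TOOL_SHA256}  /tmp/tool.tar.gz" | sha256sum -c - \
    && mkdir -p /opt/tool \
    && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
    && rm /tmp/tool.tar.gz
```

Because the download, the check, and the cleanup share one RUN, the archive never lands in a layer of its own, and a changed file fails the build instead of being silently accepted.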
4. Layer Caching and Build Optimization
Docker processes a Dockerfile top to bottom. Each instruction is a cacheable step, and RUN, COPY, and ADD each produce a filesystem layer. Docker reuses the cached result of a step when the instruction and its inputs have not changed. Once a step is invalidated (cache miss), every step after it must be rebuilt.
# BAD: Changing any source file invalidates the npm install cache
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
# GOOD: Dependencies cached separately from source code
WORKDIR /app
COPY package.json package-lock.json ./
# Full install here: the build step below needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
The key insight: order instructions from least-frequently-changed to most-frequently-changed. System package installation changes rarely. Dependency files change occasionally. Source code changes constantly.
# Optimal layer ordering
# 1. Base image (changes rarely)
FROM node:22-alpine
# 2. System packages (change rarely)
RUN apk add --no-cache dumb-init
# 3. Working directory (never changes)
WORKDIR /app
# 4. Dependency files (change sometimes)
COPY package.json package-lock.json ./
# 5. Install dependencies (cached if step 4 is unchanged)
RUN npm ci --omit=dev
# 6. Source code (changes frequently)
COPY . .
# 7. Run command (changes rarely)
CMD ["dumb-init", "node", "server.js"]
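Layer ordering decides whether the install step reruns; a BuildKit cache mount can make the rerun itself cheaper by keeping the package manager's download cache between builds. A sketch, assuming BuildKit is the active builder and that npm runs as root, so its default cache directory is /root/.npm:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# The cache directory is mounted only for this step and is not part of
# the image, so it adds nothing to the final size
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```

The cache mount complements layer caching rather than replacing it: when package-lock.json changes, the step still reruns, but packages already downloaded do not need to be fetched again.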
5. The .dockerignore File
The .dockerignore file prevents files from being sent to the Docker daemon as part of the build context. This speeds up builds and prevents sensitive or unnecessary files from ending up in your image.
# .dockerignore
# Version control
.git
.gitignore
# Dependencies (will be installed in the image)
node_modules
__pycache__
*.pyc
.venv
vendor/
# Build artifacts
dist/
build/
*.egg-info/
# IDE and editor files
.vscode/
.idea/
*.swp
*.swo
# Docker files (not needed inside the image)
Dockerfile*
docker-compose*.yml
.dockerignore
# Environment and secrets
.env
.env.*
*.pem
*.key
# Documentation and tests
README.md
docs/
tests/
*.test.js
*.spec.js
# OS files
.DS_Store
Thumbs.db
Without a .dockerignore, your entire project directory (including .git/, node_modules/, and any secrets) is sent to the Docker daemon and potentially copied into the image. A .git directory alone can be hundreds of megabytes.
6. Multi-Stage Builds
Multi-stage builds use multiple FROM statements. Each FROM starts a new stage. You can copy artifacts between stages, keeping only what you need in the final image.
# Stage 1: Build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: Production (only the built output)
FROM nginx:1.25-alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
# Result: final image has nginx + built files only
# No Node.js, no node_modules, no source code
Multi-stage builds are especially powerful for compiled languages where the build toolchain is large but the output is small:
# Go: build image of several hundred MB => final image of roughly 10MB
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app/server .
FROM scratch
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
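Stages do not have to form a single chain. The sketch below lets one Dockerfile serve both a test run and the production image, selected with --target. The stage names and the go test step are illustrative assumptions, not part of the example above.

```dockerfile
FROM golang:1.22-alpine AS base
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .

# Built only on request: docker build --target test .
FROM base AS test
RUN CGO_ENABLED=0 go test ./...

FROM base AS builder
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app/server .

# Default target: the last stage in the file
FROM scratch
COPY --from=builder /app/server /server
ENTRYPOINT ["/server"]
```

With BuildKit, a plain docker build skips the test stage because the final stage does not depend on it, so the extra stage costs nothing in normal builds.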
7. Best Practices
Minimize the Number of Layers
# BAD: 4 layers for package installation
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y ca-certificates
RUN rm -rf /var/lib/apt/lists/*
# GOOD: 1 layer, cleanup in the same RUN
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
Use Specific Base Image Tags
# BAD: mutable tags, builds are not reproducible
FROM python:latest
FROM node:lts
# GOOD: pinned versions
FROM python:3.12.2-slim-bookworm
FROM node:22.2-alpine3.19
Run as Non-Root User
FROM python:3.12-slim
WORKDIR /app
# Install dependencies as root
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Create non-root user
RUN addgroup --system appgroup && \
adduser --system --ingroup appgroup appuser
# Copy app files and set ownership
COPY --chown=appuser:appgroup . .
# Switch to non-root user BEFORE CMD
USER appuser
CMD ["python", "app.py"]
Use .dockerignore (Always)
Every project with a Dockerfile should have a .dockerignore. See section 5 above.
Pin Package Versions
# BAD: installs whatever version is current
RUN pip install flask requests
# GOOD: pinned, reproducible
RUN pip install flask==3.0.2 requests==2.31.0
# BEST: install from a fully pinned requirements file
# (for example, one generated with pip freeze or pip-compile)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
8. Complete Examples: Python, Node.js, Go
Python (Flask/FastAPI)
FROM python:3.12-slim AS base
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies (cached separately)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Create non-root user
RUN addgroup --system app && adduser --system --ingroup app app
# Copy application code
COPY --chown=app:app . .
USER app
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
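If curl is installed only for the health check, the Python standard library can do the same job and the apt-get step can be dropped. A sketch under the same assumptions as the example above (a /health endpoint on port 8000); the 3-second timeout is an arbitrary choice:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
# urlopen raises on connection errors and on 4xx/5xx responses, so the
# interpreter exits non-zero and Docker counts the check as failed
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=3)"]
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

The exec form is used for the check so that no shell is involved and the quoting stays predictable.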
Node.js (Express/Fastify)
FROM node:22-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev
FROM node:22-alpine
RUN apk add --no-cache dumb-init
WORKDIR /app
# Create non-root user
RUN addgroup -S app && adduser -S app -G app
# Copy only production artifacts
COPY --from=builder --chown=app:app /app/dist ./dist
COPY --from=builder --chown=app:app /app/node_modules ./node_modules
COPY --from=builder --chown=app:app /app/package.json ./
USER app
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
CMD ["dumb-init", "node", "dist/server.js"]
Go (Static Binary)
FROM golang:1.22-alpine AS builder
WORKDIR /app
# Cache module downloads
COPY go.mod go.sum ./
RUN go mod download && go mod verify
# Build static binary
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build \
-ldflags="-s -w" \
-o /app/server ./cmd/server
# Final image: scratch (0 bytes base)
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
9. Security: Scanning, Minimal Images, Non-Root
Minimal Base Images
Fewer packages mean fewer vulnerabilities. Choose the smallest base image that works for your application:
# Sizes are approximate on-disk figures and vary by tag and architecture;
# check your own with: docker image ls
# Full Debian (roughly 1GB) - has everything, most vulnerabilities
FROM python:3.12
# Slim Debian (roughly 130MB) - removes docs, man pages, extra packages
FROM python:3.12-slim
# Alpine (roughly 50MB) - musl libc, minimal userland
FROM python:3.12-alpine
# Distroless (tens of MB) - Google's minimal images, no shell
FROM gcr.io/distroless/python3-debian12
# Scratch (0MB) - empty image, for static binaries only
FROM scratch
Scan Images for Vulnerabilities
# Docker Scout (built into Docker Desktop and CLI)
docker scout cves myimage:latest
docker scout quickview myimage:latest
# Trivy (open source, popular in CI)
trivy image myimage:latest
# Grype (from Anchore)
grype myimage:latest
# Snyk
snyk container test myimage:latest
Never Store Secrets in Images
# BAD: Secret is baked into a layer (visible with docker history)
ENV API_KEY=sk-secret-12345
COPY .env /app/.env
# GOOD: Pass secrets at runtime
# docker run -e API_KEY=sk-secret-12345 myimage
# Or use Docker secrets / mount a secrets file
Use BuildKit Secrets for Build-Time Credentials
# Dockerfile
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
# Build command
docker build --secret id=npmrc,src=.npmrc .
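When target is omitted, BuildKit mounts the secret at /run/secrets/<id> for that single RUN step. A sketch that reads a token from the default location; the secret id api_token, the source file token.txt, the URL, and the output path are all hypothetical:

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
RUN apk add --no-cache curl
# The token is readable only during this step and is not written to a layer
RUN --mount=type=secret,id=api_token \
    curl -fsSL -H "Authorization: Bearer $(cat /run/secrets/api_token)" \
        -o /opt/private.tar.gz https://example.com/private.tar.gz

# docker build --secret id=api_token,src=token.txt .
```

The command substitution runs inside the build step, so the token value does not appear in the Dockerfile or in docker history.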
10. Debugging Dockerfile Builds
Force a Clean Build
# Rebuild without using cache
docker build --no-cache -t myapp .
# Rebuild up to and including a specific stage
docker build --no-cache --target builder -t myapp-builder .
Inspect Image Layers
# View all layers and their sizes
docker history myapp:latest
# Detailed inspection (JSON output)
docker inspect myapp:latest
# Use dive for interactive layer exploration
dive myapp:latest
Debug a Failing Build
# Build with BuildKit progress output
DOCKER_BUILDKIT=1 docker build --progress=plain -t myapp .
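# Run a shell in a partially built image (before the failing step)
# Legacy builder only: BuildKit does not print intermediate image IDs.
# Where the legacy builder is still available, build with DOCKER_BUILDKIT=0,
# take the last successful layer hash from the build output, then:
docker run --rm -it <layer-hash> /bin/sh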
# Or add a debug target to your multi-stage build
FROM builder AS debug
CMD ["/bin/sh"]
# docker build --target debug -t myapp-debug .
# docker run --rm -it myapp-debug
Common Build Errors
# "COPY failed: file not found in build context"
# => File is excluded by .dockerignore or path is wrong
# => Check: docker build context is correct directory
# "RUN returned a non-zero code"
# => Command failed inside the container
# => Try running it manually: docker run --rm -it base-image sh
# "failed to solve: process '/bin/sh -c ...' did not complete successfully"
# => BuildKit error format. Read the full output above the error
# Image is unexpectedly large
# => Check: docker history myimage:latest
# => Look for large layers (apt cache, build deps not cleaned up)
Frequently Asked Questions
What is the difference between CMD and ENTRYPOINT in a Dockerfile?
CMD sets the default command that runs when a container starts, but it can be completely overridden by passing arguments to docker run. ENTRYPOINT sets the main executable for the container and is not replaced by those arguments; they are appended to ENTRYPOINT instead, and replacing it requires the --entrypoint flag. Use ENTRYPOINT when your container is meant to behave like a specific executable, and CMD when you want to provide a default that users can easily replace. Both support exec form (JSON array) and shell form (plain string), but exec form is preferred because it runs the process directly without a shell wrapper.
Should I use COPY or ADD in my Dockerfile?
Use COPY in almost all cases. COPY does exactly one thing: it copies files and directories from the build context into the image. ADD does the same but has two extra features: it can extract tar archives automatically and it can download files from URLs. These extra features make ADD less predictable and harder to reason about. Docker's official best practices recommend COPY for transparency. The only time ADD is genuinely useful is when you need to extract a local tar archive into the image in a single layer.
How do I optimize Docker image size?
Start with a minimal base image like alpine, slim, or distroless variants. Use multi-stage builds to separate build dependencies from the final runtime image. Combine RUN commands with && to reduce layers. Clean up package manager caches in the same RUN instruction. Use a .dockerignore file to exclude node_modules, .git, and other unnecessary files. For compiled languages like Go, consider building a static binary and using scratch as the final base image.
What is a multi-stage Docker build?
A multi-stage build uses multiple FROM statements in a single Dockerfile. Each FROM starts a new build stage with its own base image. You can copy artifacts from one stage to another using COPY --from=stagename. This lets you use a full build environment (with compilers, dev dependencies, build tools) in one stage and copy only the compiled output into a minimal runtime image in the final stage. The result is a much smaller production image that contains only what is needed to run the application.
How do Docker layers and caching work?
Instructions that change the filesystem (RUN, COPY, ADD) each create a new read-only layer; the other instructions only record metadata. Docker caches the result of every step and reuses it if nothing has changed. When you rebuild, Docker checks each instruction from top to bottom: if an instruction and its inputs are identical to a cached step, Docker reuses the cache. Once a cache miss occurs, all subsequent steps are rebuilt. To optimize, put instructions that change rarely (installing system packages) before those that change often (copying application code). Copy dependency files before source code so dependency installation is cached separately.
How do I run a container as a non-root user?
Add a USER instruction in your Dockerfile. First, create the user with RUN (e.g., RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser), then switch to it with USER appuser. Place the USER instruction after installing packages and copying files but before CMD or ENTRYPOINT. Make sure application directories have correct ownership with --chown on COPY. Running as non-root does not prevent container escapes, but it limits what a compromised process can do inside the container and reduces the impact if an escape vulnerability is exploited.
Conclusion
A well-written Dockerfile is the foundation of a secure, efficient, and reproducible container workflow. Start with a minimal base image, order your instructions for optimal caching, use multi-stage builds to keep production images small, and always run as a non-root user. The patterns in this guide apply whether you are containerizing a Python microservice, a Node.js API, or a Go binary.
Master these fundamentals and you will spend less time debugging builds and more time shipping code.