Pydantic: The Complete Guide for 2026
Pydantic is the standard for data validation in Python. It uses type hints to validate, coerce, and serialize data at runtime — catching bad data at the boundary instead of letting it silently corrupt your application. With its v2 rewrite powered by a Rust core, Pydantic is now 5–50x faster than v1 and is used by FastAPI, LangChain, SQLModel, Prefect, and thousands of production systems.
This guide covers everything from basic models to advanced patterns: field constraints, custom validators, nested models, serialization, generics, computed fields, settings management, and FastAPI integration. All examples use Pydantic v2 syntax.
1. What Is Pydantic
Pydantic is a data validation library that enforces type hints at runtime. You define a model class with annotated fields, and Pydantic validates every value when you create an instance. Invalid data raises a clear ValidationError with the exact field and reason. Valid data is coerced to the correct type and stored as a model instance with attribute access, serialization, and JSON Schema generation built in.
```bash
pip install pydantic
```

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int
    email: str

# Valid data - coerces age from string to int
user = User(name="Alice", age="30", email="alice@example.com")
print(user.age)        # 30 (int, not str)
print(type(user.age))  # <class 'int'>

# Invalid data - raises ValidationError
try:
    User(name="Bob", age="not a number", email="bob@example.com")
except ValidationError as e:
    print(e.error_count())       # 1
    print(e.errors()[0]["msg"])  # "Input should be a valid integer..."
```
Pydantic validates at the boundary — where external data enters your system. Once you have a model instance, every field is guaranteed to have the correct type. This eliminates entire categories of runtime bugs.
2. BaseModel Basics
Every Pydantic model inherits from BaseModel. Define fields with type annotations. Use = for defaults, Optional for nullable fields, and list/dict for collection types:
```python
from pydantic import BaseModel, Field
from datetime import datetime
from typing import Optional

class Product(BaseModel):
    name: str
    price: float
    quantity: int = 0                  # Default value
    tags: list[str] = []               # Default empty list (safe: Pydantic deep-copies mutable defaults per instance)
    metadata: dict[str, str] = {}      # Default empty dict
    description: Optional[str] = None  # Nullable field
    created_at: datetime = Field(default_factory=datetime.now)  # Callable default

# Create from keyword arguments
p = Product(name="Widget", price=9.99, tags=["sale"])

# Create from a dictionary
data = {"name": "Gadget", "price": 19.99, "quantity": 5}
p2 = Product(**data)

# Access fields as attributes
print(p.name)  # "Widget"
print(p.tags)  # ["sale"]
```
Models are mutable by default in v2. To make a model immutable, set frozen=True in model_config; assignment then raises a ValidationError. (For mutable models, set validate_assignment=True if you want assignments re-validated.)

```python
from pydantic import BaseModel, ConfigDict

class FrozenProduct(BaseModel):
    model_config = ConfigDict(frozen=True)
    name: str
    price: float

p = FrozenProduct(name="Widget", price=9.99)
# p.price = 12.99  # raises ValidationError: instance is frozen
```
3. Field Validation
The Field() function adds constraints, metadata, and documentation to individual fields. It replaces raw default values with rich validation rules:
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    age: int = Field(ge=0, le=150)        # >= 0, <= 150
    email: str = Field(pattern=r'^[\w.+-]+@[\w-]+\.[\w.]+$')
    score: float = Field(gt=0, lt=100.0)  # exclusive bounds
    tags: list[str] = Field(default_factory=list, max_length=10)

# Numeric constraints
class Order(BaseModel):
    quantity: int = Field(gt=0, description="Must order at least 1")
    price: float = Field(ge=0.01, le=999999.99)
    discount: float = Field(default=0, ge=0, le=1)  # Fraction from 0 to 1

# String constraints
class Article(BaseModel):
    title: str = Field(min_length=5, max_length=200)
    slug: str = Field(pattern=r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
    body: str = Field(min_length=50)

# Field with alias (useful for JSON keys that aren't valid Python names)
class ApiResponse(BaseModel):
    status_code: int = Field(alias="statusCode")
    error_message: str = Field(alias="errorMessage", default="")

resp = ApiResponse(**{"statusCode": 200, "errorMessage": ""})
print(resp.status_code)  # 200
```
Field constraints are enforced during validation. If any constraint fails, Pydantic raises a ValidationError with the field name, the value that failed, and which constraint was violated.
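For example, a failed `le` constraint reports the field location and the constraint type in the error details (the `LimitedUser` model here is illustrative):

```python
from pydantic import BaseModel, Field, ValidationError

class LimitedUser(BaseModel):
    age: int = Field(ge=0, le=150)

try:
    LimitedUser(age=200)
except ValidationError as e:
    err = e.errors()[0]
    print(err["loc"])   # ('age',)
    print(err["type"])  # 'less_than_equal'
    print(err["msg"])   # 'Input should be less than or equal to 150'
```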
4. Custom Validators
When built-in constraints are not enough, use @field_validator for single fields and @model_validator for cross-field logic:
```python
from pydantic import BaseModel, field_validator, model_validator

class Signup(BaseModel):
    username: str
    password: str
    password_confirm: str
    email: str

    @field_validator("username")
    @classmethod
    def username_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must be alphanumeric")
        return v.lower()  # Transform: normalize to lowercase

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain an uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a digit")
        return v

    @model_validator(mode="after")
    def passwords_match(self) -> "Signup":
        if self.password != self.password_confirm:
            raise ValueError("Passwords do not match")
        return self
```
Before, After, and Wrap Modes
Validators run at different stages. mode="before" runs before type coercion (you get the raw input). mode="after" runs after coercion (you get the typed value). mode="wrap" lets you control whether inner validation runs at all:
```python
from pydantic import BaseModel, field_validator

class FlexibleDate(BaseModel):
    date: str

    @field_validator("date", mode="before")
    @classmethod
    def normalize_date(cls, v):
        """Runs before type validation - can transform raw input."""
        if isinstance(v, int):
            # Convert Unix timestamp to ISO string
            from datetime import datetime, timezone
            return datetime.fromtimestamp(v, tz=timezone.utc).isoformat()
        return v

class Temperature(BaseModel):
    celsius: float

    @field_validator("celsius", mode="after")
    @classmethod
    def reasonable_temp(cls, v: float) -> float:
        """Runs after coercion - v is already a float."""
        if v < -273.15:
            raise ValueError("Temperature below absolute zero")
        return round(v, 2)
```
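A wrap validator calls a handler to run the inner (standard) validation and can catch its failure. A minimal sketch, assuming you want a fallback value instead of a hard error (the `Score` model is illustrative):

```python
from pydantic import BaseModel, ValidationError, field_validator

class Score(BaseModel):
    value: int

    @field_validator("value", mode="wrap")
    @classmethod
    def default_on_failure(cls, v, handler):
        # handler(v) runs the standard int validation; catching its
        # ValidationError lets us substitute a fallback instead of failing
        try:
            return handler(v)
        except ValidationError:
            return 0

print(Score(value="42").value)    # 42 (inner validation coerced the string)
print(Score(value="oops").value)  # 0 (fallback)
```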
5. Nested Models and Complex Types
Pydantic models compose naturally. Use one model as a field type in another, and Pydantic validates the entire tree recursively:
```python
from pydantic import BaseModel, Field
from typing import Optional
from enum import Enum

class AddressType(str, Enum):
    HOME = "home"
    WORK = "work"
    BILLING = "billing"

class Address(BaseModel):
    street: str
    city: str
    state: str = Field(min_length=2, max_length=2)
    zip_code: str = Field(pattern=r'^\d{5}(-\d{4})?$')
    type: AddressType = AddressType.HOME

class Company(BaseModel):
    name: str
    address: Address  # Nested model

class Employee(BaseModel):
    name: str
    email: str
    company: Company                      # Nested 2 levels deep
    addresses: list[Address] = []         # List of nested models
    manager: Optional["Employee"] = None  # Self-referencing model

# Pydantic validates the entire nested structure
emp = Employee(
    name="Alice",
    email="alice@corp.com",
    company={
        "name": "Acme",
        "address": {"street": "123 Main", "city": "NY", "state": "NY", "zip_code": "10001"}
    },
    addresses=[
        {"street": "456 Oak", "city": "LA", "state": "CA", "zip_code": "90001", "type": "home"}
    ]
)
```
Use typing.Union for discriminated unions and typing.Literal for fixed values:
```python
from pydantic import BaseModel, Field
from typing import Literal, Union

class CreditCard(BaseModel):
    type: Literal["credit_card"]
    card_number: str
    expiry: str

class BankTransfer(BaseModel):
    type: Literal["bank_transfer"]
    account_number: str
    routing_number: str

class Payment(BaseModel):
    amount: float
    method: Union[CreditCard, BankTransfer] = Field(discriminator="type")
```
6. Serialization
Pydantic v2 provides model_dump() for dictionaries, model_dump_json() for JSON strings, and model_validate() to parse data back into models:
```python
from pydantic import BaseModel
from datetime import datetime

class Event(BaseModel):
    name: str
    start: datetime
    tags: list[str] = []
    internal_id: int = 0

event = Event(name="Launch", start="2026-03-01T10:00:00", tags=["product"])

# To dictionary
d = event.model_dump()
# {'name': 'Launch', 'start': datetime(...), 'tags': ['product'], 'internal_id': 0}

# Exclude fields
d = event.model_dump(exclude={"internal_id"})

# Exclude unset fields (only include fields explicitly passed)
d = event.model_dump(exclude_unset=True)
# {'name': 'Launch', 'start': datetime(...), 'tags': ['product']}

# Include only specific fields
d = event.model_dump(include={"name", "start"})

# To JSON string (uses Rust serializer - very fast)
json_str = event.model_dump_json(indent=2)

# Parse back from dict or JSON
event2 = Event.model_validate({"name": "Demo", "start": "2026-04-01T14:00:00"})
event3 = Event.model_validate_json('{"name":"Demo","start":"2026-04-01T14:00:00"}')

# Generate JSON Schema
schema = Event.model_json_schema()
print(schema)
# {'properties': {'name': {'title': 'Name', 'type': 'string'}, ...}}
```
Use model_dump(mode="json") to get a JSON-compatible dictionary where datetimes become strings and enums become values.
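For instance, a minimal sketch (the `Meeting` model here is illustrative):

```python
from datetime import datetime
from pydantic import BaseModel

class Meeting(BaseModel):
    name: str
    start: datetime

m = Meeting(name="Standup", start="2026-03-01T10:00:00")

# mode="python" (the default) keeps Python objects; mode="json" converts
# them to JSON-compatible primitives
print(m.model_dump())             # {'name': 'Standup', 'start': datetime(2026, 3, 1, 10, 0)}
print(m.model_dump(mode="json"))  # {'name': 'Standup', 'start': '2026-03-01T10:00:00'}
```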
7. Generic Models
Generic models let you create reusable wrappers with type-safe contents. This is ideal for API response envelopes, paginated results, and container types:
```python
from pydantic import BaseModel
from typing import Generic, TypeVar, Optional

T = TypeVar("T")

class ApiResponse(BaseModel, Generic[T]):
    success: bool
    data: Optional[T] = None
    error: Optional[str] = None

class PaginatedResponse(BaseModel, Generic[T]):
    items: list[T]
    total: int
    page: int
    page_size: int

class User(BaseModel):
    id: int
    name: str

# Type-safe instantiation
response = ApiResponse[User](success=True, data=User(id=1, name="Alice"))

page = PaginatedResponse[User](
    items=[User(id=1, name="Alice"), User(id=2, name="Bob")],
    total=50, page=1, page_size=20
)

# The generic parameter is validated
# ApiResponse[User](success=True, data={"invalid": "data"})  # ValidationError
```
8. Computed Fields
Computed fields are derived from other fields. They appear in serialization output but are not part of the input schema. Use the @computed_field decorator:
```python
from pydantic import BaseModel, computed_field

class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height

    @computed_field
    @property
    def perimeter(self) -> float:
        return 2 * (self.width + self.height)

rect = Rectangle(width=5, height=3)
print(rect.area)       # 15.0
print(rect.perimeter)  # 16.0

# Computed fields appear in serialization
print(rect.model_dump())
# {'width': 5.0, 'height': 3.0, 'area': 15.0, 'perimeter': 16.0}

class User(BaseModel):
    first_name: str
    last_name: str

    @computed_field
    @property
    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"
```
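Because computed fields are output-only, they appear in the serialization JSON Schema but not in the validation (input) schema. A quick check with an illustrative `Square` model:

```python
from pydantic import BaseModel, computed_field

class Square(BaseModel):
    side: float

    @computed_field
    @property
    def area(self) -> float:
        return self.side ** 2

# Validation schema describes the input: no computed fields
val = Square.model_json_schema(mode="validation")
# Serialization schema describes the output: computed fields included
ser = Square.model_json_schema(mode="serialization")

print("area" in val["properties"])  # False
print("area" in ser["properties"])  # True
```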
9. Settings Management
The pydantic-settings package lets you define application configuration as a Pydantic model. Each field maps to an environment variable, with full validation on startup:
```bash
pip install pydantic-settings
```

```python
from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import SecretStr, Field

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",        # Load from .env file
        env_file_encoding="utf-8",
        env_prefix="APP_",      # APP_DATABASE_URL, APP_DEBUG, etc.
        case_sensitive=False,
    )

    database_url: str
    redis_url: str = "redis://localhost:6379"
    secret_key: SecretStr  # Hidden in repr/logs
    debug: bool = False
    allowed_hosts: list[str] = ["localhost"]
    max_connections: int = Field(default=10, ge=1, le=100)

# Environment variables or .env file:
# APP_DATABASE_URL=postgresql://user:pass@localhost/mydb
# APP_SECRET_KEY=super-secret-key-here
# APP_DEBUG=true
# APP_ALLOWED_HOSTS=["example.com","api.example.com"]

settings = Settings()
print(settings.database_url)                   # "postgresql://..."
print(settings.secret_key.get_secret_value())  # "super-secret-key-here"
print(settings.secret_key)                     # SecretStr('**********')
print(settings.debug)                          # True (coerced from string)
```
For nested settings, use env_nested_delimiter:
```python
from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict

class DatabaseSettings(BaseModel):
    host: str = "localhost"
    port: int = 5432
    name: str = "mydb"

class AppSettings(BaseSettings):
    model_config = SettingsConfigDict(env_nested_delimiter="__")
    db: DatabaseSettings = DatabaseSettings()

# Set via: DB__HOST=prod-db.example.com DB__PORT=5433
```
10. Pydantic with FastAPI
FastAPI is built on Pydantic. Request bodies, query parameters, response models, and dependency injection all use Pydantic models. FastAPI automatically validates input, serializes output, and generates OpenAPI documentation from your models:
```python
from fastapi import FastAPI, Query
from pydantic import BaseModel, Field, EmailStr
from typing import Optional

app = FastAPI()

class UserCreate(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: EmailStr
    age: int = Field(ge=13, le=150)

class UserResponse(BaseModel):
    id: int
    name: str
    email: str

@app.post("/users/", response_model=UserResponse, status_code=201)
async def create_user(user: UserCreate):
    # user is already validated by Pydantic
    db_user = save_to_db(user.model_dump())  # save_to_db: your persistence layer
    return db_user

@app.get("/users/", response_model=list[UserResponse])
async def list_users(
    skip: int = Query(default=0, ge=0),
    limit: int = Query(default=20, ge=1, le=100),
    search: Optional[str] = None,
):
    return get_users(skip=skip, limit=limit, search=search)  # get_users: your query helper
```
Separate your create, update, and response models. Use a base model for shared fields:
```python
from datetime import datetime
from typing import Optional
from pydantic import BaseModel, EmailStr, Field

class UserBase(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: EmailStr

class UserCreate(UserBase):
    password: str = Field(min_length=8)

class UserUpdate(BaseModel):
    name: Optional[str] = Field(default=None, min_length=1, max_length=100)
    email: Optional[EmailStr] = None

class UserResponse(UserBase):
    id: int
    created_at: datetime
```
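The update-model pattern pays off in PATCH handlers: dump with exclude_unset=True so only the fields the client actually sent overwrite stored data. A minimal sketch outside FastAPI, where the `stored` dict stands in for a database row:

```python
from typing import Optional
from pydantic import BaseModel

class UserUpdate(BaseModel):
    name: Optional[str] = None
    email: Optional[str] = None

stored = {"id": 1, "name": "Alice", "email": "alice@example.com"}

update = UserUpdate(name="Alicia")  # client sent only "name"
patch = update.model_dump(exclude_unset=True)
print(patch)  # {'name': 'Alicia'} (email was never set, so it is excluded)

stored.update(patch)  # email survives untouched
```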
11. Pydantic vs Dataclasses vs Attrs
| Feature | Pydantic | dataclasses | attrs |
|---|---|---|---|
| Runtime validation | Yes (automatic) | No | Optional (validators) |
| Type coercion | Yes ("30" → 30) | No | No |
| JSON serialization | Built-in (fast Rust) | Manual | Via cattrs |
| JSON Schema | Built-in | No | No |
| Settings / env vars | pydantic-settings | No | No |
| Stdlib | No (pip install) | Yes (3.7+) | No (pip install) |
| Performance | Fast (Rust core) | Fastest (no validation) | Fast (C slots) |
When to use each: Use Pydantic for external data (APIs, configs, user input) where validation matters. Use dataclasses for internal data structures where you trust the types. Use attrs when you need lightweight classes with optional validation and do not need JSON Schema or serialization.
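The first two rows of the table can be demonstrated directly (`DCUser` and `PUser` are illustrative names):

```python
from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class DCUser:
    age: int

class PUser(BaseModel):
    age: int

dc = DCUser(age="30")  # dataclass silently accepts the wrong type
pu = PUser(age="30")   # Pydantic coerces the string to int

print(dc.age, type(dc.age))  # 30 <class 'str'>
print(pu.age, type(pu.age))  # 30 <class 'int'>
```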
12. Performance in V2
Pydantic v2 replaced its pure-Python validation core with pydantic-core, a Rust library compiled to a Python C extension. The result is dramatic: model creation is 5–50x faster depending on the model complexity.
- Simple models: ~5x faster than v1
- Nested models: ~17x faster than v1
- JSON parsing: ~20x faster (Rust JSON parser bypasses Python dict creation)
- JSON serialization: ~10x faster with `model_dump_json()`
Key performance tips:
```python
from pydantic import BaseModel, TypeAdapter

class FastModel(BaseModel):
    name: str
    value: int

# Prefer model_validate_json() over json.loads() + model_validate():
# the Rust core parses JSON directly into the model, skipping the
# intermediate Python dict
data = '{"name": "test", "value": 42}'
m = FastModel.model_validate_json(data)  # Fastest path

# For trusted, already-validated data, model_construct() skips validation entirely
m2 = FastModel.model_construct(name="test", value=42)

# For bulk operations, use TypeAdapter for validation without a wrapper class
adapter = TypeAdapter(list[FastModel])
items = adapter.validate_json('[{"name": "a", "value": 1}]')  # Entire list validated in Rust
```
Strict mode disables type coercion, which is slightly faster and catches type mismatches that coercion would silently fix:
```python
from pydantic import BaseModel, ConfigDict

class StrictUser(BaseModel):
    model_config = ConfigDict(strict=True)
    name: str
    age: int

StrictUser(name="Alice", age=30)      # OK
# StrictUser(name="Alice", age="30")  # ValidationError: age must be int, not str
```
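Strict mode can also be applied per field with Field(strict=True), leaving lax coercion on everywhere else. A sketch with an illustrative `Mixed` model:

```python
from pydantic import BaseModel, Field, ValidationError

class Mixed(BaseModel):
    user_id: int = Field(strict=True)  # no coercion for this field only
    age: int                           # normal lax coercion

m = Mixed(user_id=1, age="30")  # OK: age coerced, user_id already an int

try:
    Mixed(user_id="1", age=30)  # strict field rejects the string
except ValidationError as exc:
    err = exc.errors()[0]
    print(err["loc"])  # ('user_id',)
```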
13. Common Patterns and Best Practices
Separate input and output models. Do not use the same model for creating and reading data: create models carry input-only fields like passwords, while response models omit secrets and add server-generated fields like id and created_at:
```python
from pydantic import BaseModel

class UserCreate(BaseModel):
    email: str
    password: str  # Input only

class UserDB(BaseModel):
    id: int
    email: str
    hashed_password: str  # Never expose

class UserResponse(BaseModel):
    id: int
    email: str  # No password field
```
Use model_config instead of inner Config class. The v1 Config class still works but is deprecated:
```python
from pydantic import BaseModel, ConfigDict

# V2 style (preferred)
class MyModel(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        str_min_length=1,
        populate_by_name=True,
        use_enum_values=True,
    )
```
Use TypeAdapter for standalone validation when you do not need a full model class:
```python
from pydantic import TypeAdapter
from typing import Union

# Validate a plain list of integers
int_list = TypeAdapter(list[int])
result = int_list.validate_python(["1", "2", "3"])  # [1, 2, 3]

# Validate a union type
adapter = TypeAdapter(Union[int, str])
adapter.validate_python(42)    # 42
adapter.validate_python("hi")  # "hi"
```
Additional best practices:
- Use `EmailStr` from `pydantic[email]` for email validation instead of regex patterns
- Use `SecretStr` for passwords and API keys; they are masked in `repr()` and logs
- Use `model_validate_json()` instead of `json.loads()` followed by `model_validate()` for best performance
- Use `Annotated` types for reusable field constraints: `PositiveInt = Annotated[int, Field(gt=0)]`
- Pin your Pydantic version in production; minor versions can change validation behavior
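The `Annotated` pattern in the list above deserves a fuller sketch; `PositiveInt`, `NonEmptyStr`, and `Order` are illustrative names:

```python
from typing import Annotated
from pydantic import BaseModel, Field, ValidationError

# Define a constrained alias once, reuse it across models
PositiveInt = Annotated[int, Field(gt=0)]
NonEmptyStr = Annotated[str, Field(min_length=1)]

class Order(BaseModel):
    sku: NonEmptyStr
    quantity: PositiveInt

order = Order(sku="ABC-1", quantity=3)

try:
    Order(sku="", quantity=0)  # both constraints fail
except ValidationError as exc:
    count = exc.error_count()
    print(count)  # 2
```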
Frequently Asked Questions
What is Pydantic and why should I use it?
Pydantic is a Python data validation library that uses type hints to validate, parse, and serialize data. It enforces type safety at runtime, catching invalid data before it causes bugs deep in your application. Pydantic is the most widely used validation library in Python, powering frameworks like FastAPI, LangChain, and SQLModel. Use it whenever you handle external data: API requests, config files, database records, or user input.
What changed between Pydantic v1 and v2?
Pydantic v2 is a complete rewrite with a Rust-based core (pydantic-core) that makes validation 5–50x faster. Key API changes: @validator becomes @field_validator, @root_validator becomes @model_validator, .dict() becomes .model_dump(), .json() becomes .model_dump_json(), .parse_obj() becomes .model_validate(), and the Config class becomes model_config dict. V2 also adds strict mode, computed fields, better JSON Schema generation, and more flexible serialization.
What is the difference between Pydantic and Python dataclasses?
Python dataclasses generate __init__, __repr__, and __eq__ methods but perform zero runtime validation. If you pass a string where an int is expected, dataclasses silently accept it. Pydantic validates and coerces every field at runtime, raises clear errors for invalid data, and provides serialization methods like model_dump() and model_dump_json(). Pydantic also supports nested validation, custom validators, JSON Schema export, and settings management. Use dataclasses for simple internal data containers. Use Pydantic when data crosses a trust boundary.
How do Pydantic validators work?
Pydantic v2 provides two decorator types: @field_validator for single fields and @model_validator for cross-field logic. Field validators receive the field value and can run in "before" mode (before type coercion), "after" mode (after coercion, the default), or "wrap" mode (control whether inner validation runs). Model validators receive the entire model and run before or after all field validation. Validators raise ValueError or AssertionError to reject data, and return the validated value to accept or transform it.
How do I use pydantic-settings for configuration?
Install pydantic-settings and create a class that inherits from BaseSettings. Each field maps to an environment variable (case-insensitive by default). Set model_config with env_file=".env" to load from dotenv files. Pydantic validates all settings on instantiation, catching missing or invalid config immediately. You can set env_prefix to namespace variables, use env_nested_delimiter for nested settings, and define SecretStr fields to prevent secrets from appearing in logs or repr output.
Conclusion
Pydantic solves data validation in Python. Define your models with type hints, and Pydantic handles validation, coercion, serialization, and JSON Schema generation. The v2 rewrite makes it fast enough for the most demanding applications, and its integration with FastAPI, SQLModel, and the broader Python ecosystem means you can use one validation approach across your entire stack.
Start with BaseModel and Field() for your next project. Add custom validators as your business rules grow. Use pydantic-settings for configuration. The patterns in this guide will keep your data clean from API boundary to database.