Python Data Structures: The Complete Guide for 2026

Q: What is the collections module in Python?

The collections module provides specialized container datatypes that extend Python's built-in data structures. Key classes include Counter for counting hashable objects, deque for double-ended queues with O(1) append and pop from both ends, defaultdict for dictionaries with automatic default values for missing keys, OrderedDict for dictionaries that remember insertion order (less necessary since Python 3.7+ dicts maintain order), namedtuple for lightweight immutable objects with named fields, and ChainMap for combining multiple dictionaries into a single view. These are standard library tools that do not require any external installation.

February 12, 2026 · 30 min read

Choosing the right data structure is one of the most impactful decisions you make as a Python developer. The difference between a list and a dictionary can mean the difference between an O(n) search and an O(1) lookup. The difference between a list and a deque can mean the difference between an O(n) pop from the front and an O(1) popleft. These choices compound across your codebase and determine whether your program handles ten items or ten million.

This guide covers every data structure available in Python — from the four built-in types (list, tuple, dict, set) to the specialized containers in the collections module, the array module, heapq for priority queues, and practical implementations of stacks and queues. Every section includes working code examples and performance characteristics so you can make informed decisions.

⚙ Quick reference: Keep our Python Cheat Sheet open while reading — it covers syntax fundamentals that complement this deep dive into data structures.

Overview of Python Data Structures
Lists
Tuples
Dictionaries
Sets
Strings as Sequences
The Collections Module
Arrays and the array Module
Heapq and Priority Queues
Stacks and Queues
When to Use Which Data Structure
Performance Characteristics (Big-O)
Common Patterns and Best Practices
Frequently Asked Questions

1. Overview of Python Data Structures

Python provides four built-in data structure types that cover the vast majority of use cases:

list — ordered, mutable sequence
tuple — ordered, immutable sequence
dict — key-value mapping with O(1) lookups
set — unordered collection of unique elements

Beyond these, the standard library offers specialized structures in the collections module (Counter, deque, defaultdict, OrderedDict, namedtuple, ChainMap), the array module for memory-efficient numeric arrays, and heapq for priority queue operations. Understanding when to reach for each one separates efficient Python code from slow Python code.

# Quick overview of all built-in types
my_list = [1, 2, 3, 4, 5]              # ordered, mutable, allows duplicates
my_tuple = (1, 2, 3, 4, 5)             # ordered, immutable, allows duplicates
my_dict = {"name": "Alice", "age": 30}  # key-value pairs, O(1) lookup
my_set = {1, 2, 3, 4, 5}               # unordered, unique elements only

print(type(my_list))   # <class 'list'>
print(type(my_tuple))  # <class 'tuple'>
print(type(my_dict))   # <class 'dict'>
print(type(my_set))    # <class 'set'>

2. Lists

Lists are Python's workhorse data structure. They are ordered, mutable, and can hold items of any type. Internally, Python lists are implemented as dynamic arrays, which means indexing is O(1) but inserting at the beginning is O(n).

Creation and Basic Operations

# Creating lists
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True, None]
nested = [[1, 2], [3, 4], [5, 6]]
empty = []
from_range = list(range(10))        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
from_string = list("hello")         # ['h', 'e', 'l', 'l', 'o']

# Length, membership, concatenation
print(len(numbers))         # 5
print(3 in numbers)         # True
print(numbers + [6, 7])     # [1, 2, 3, 4, 5, 6, 7]
print(numbers * 2)          # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

List Methods

fruits = ["apple", "banana", "cherry"]

# Adding elements
fruits.append("date")                  # adds to end: O(1) amortized
fruits.insert(1, "blueberry")          # inserts at index: O(n)
fruits.extend(["elderberry", "fig"])   # adds multiple items: O(k)

# Removing elements
fruits.remove("banana")    # removes first occurrence: O(n)
last = fruits.pop()        # removes and returns last item: O(1)
first = fruits.pop(0)      # removes and returns first item: O(n)
fruits.clear()             # removes all items: O(n)

# Searching and counting
colors = ["red", "blue", "red", "green", "blue", "red"]
print(colors.index("blue"))   # 1 (first occurrence)
print(colors.count("red"))    # 3

# Sorting
nums = [3, 1, 4, 1, 5, 9, 2, 6]
nums.sort()                    # in-place sort: [1, 1, 2, 3, 4, 5, 6, 9]
nums.sort(reverse=True)        # descending: [9, 6, 5, 4, 3, 2, 1, 1]
nums.sort(key=abs)             # sort by absolute value

# sorted() returns a new list, leaving the original unchanged
original = [3, 1, 4, 1, 5]
new_sorted = sorted(original)  # [1, 1, 3, 4, 5]
print(original)                # [3, 1, 4, 1, 5] (unchanged)

List Comprehensions

# Basic comprehension
squares = [x ** 2 for x in range(10)]   # [0, 1, 4, 9, 16, ...]

# With filter condition
evens = [x for x in range(20) if x % 2 == 0]

# With transformation and filter
words = ["hello", "world", "python", "is", "great"]
long_upper = [w.upper() for w in words if len(w) > 3]

# Nested comprehension (flattening)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]  # [1, 2, ..., 9]

# Conditional expression (if/else in output)
labels = ["even" if x % 2 == 0 else "odd" for x in range(5)]

Slicing

nums = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(nums[2:7])     # [2, 3, 4, 5, 6]
print(nums[:4])      # [0, 1, 2, 3]
print(nums[6:])      # [6, 7, 8, 9]
print(nums[::2])     # [0, 2, 4, 6, 8] (every 2nd)
print(nums[::-1])    # [9, 8, ..., 0] (reversed)

# Slice assignment and deletion
nums[2:5] = [20, 30, 40]  # replace indices 2-4
del nums[2:5]               # remove indices 2-4

3. Tuples

Tuples are immutable ordered sequences. Once created, you cannot add, remove, or change elements. This immutability gives tuples two practical advantages: they are slightly faster and more memory-efficient than lists, and they can be used as dictionary keys and set elements.

Creation and Use Cases

# Creating tuples
point = (10, 20)
rgb = (255, 128, 0)
single = (42,)              # trailing comma required for single-element tuple
empty = ()
from_list = tuple([1, 2, 3])

# Accessing elements (same as lists)
print(point[0])    # 10
print(rgb[-1])     # 0

# Tuple unpacking
x, y = point
print(x, y)        # 10 20

# Extended unpacking with *
first, *middle, last = (1, 2, 3, 4, 5)
print(first)    # 1
print(middle)   # [2, 3, 4]
print(last)     # 5

# Tuples as dictionary keys (lists cannot do this)
grid = {}
grid[(0, 0)] = "origin"
grid[(1, 2)] = "point A"
print(grid[(0, 0)])   # "origin"

Named Tuples

from collections import namedtuple

# Create a named tuple class
Point = namedtuple("Point", ["x", "y"])
Color = namedtuple("Color", "red green blue")

p = Point(10, 20)
print(p.x, p.y)       # 10 20
print(p[0], p[1])     # 10 20 (index access still works)

c = Color(255, 128, 0)
print(c.red)           # 255
print(c._asdict())     # {'red': 255, 'green': 128, 'blue': 0}

p2 = p._replace(x=30)  # creates new instance (immutable)
print(p2)               # Point(x=30, y=20)

# typing.NamedTuple: modern alternative with type hints
from typing import NamedTuple

class Coordinate(NamedTuple):
    latitude: float
    longitude: float
    label: str = ""

loc = Coordinate(40.7128, -74.0060, "New York")
print(loc.label)  # "New York"

4. Dictionaries

Dictionaries are Python's most important data structure for real-world programming. They provide O(1) average-case lookups, insertions, and deletions by mapping keys to values. Since Python 3.7, dictionaries maintain insertion order as a language guarantee.

Creation and Access

# Creating dictionaries
person = {"name": "Alice", "age": 30, "city": "NYC"}
from_pairs = dict([("a", 1), ("b", 2)])
from_kwargs = dict(host="localhost", port=8080)

# Safe access with .get()
print(person["name"])                  # "Alice"
print(person.get("email", "N/A"))      # "N/A" (default if missing)

# Add, update, delete
person["email"] = "alice@example.com"  # add new key
person.update({"age": 32, "job": "dev"})  # update multiple
del person["city"]                      # delete key
email = person.pop("email", None)       # remove and return (safe)

Dictionary Comprehensions

# Basic comprehension
squares = {x: x ** 2 for x in range(6)}
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Filtering
scores = {"Alice": 95, "Bob": 67, "Charlie": 88, "Diana": 72}
passing = {name: s for name, s in scores.items() if s >= 70}

# Invert a dictionary (swap keys and values)
inverted = {v: k for k, v in {"a": 1, "b": 2}.items()}  # {1: 'a', 2: 'b'}

# Word frequency counter
text = "the cat sat on the mat the cat"
word_count = {}
for word in text.split():
    word_count[word] = word_count.get(word, 0) + 1
# {'the': 3, 'cat': 2, 'sat': 1, 'on': 1, 'mat': 1}

defaultdict and OrderedDict

from collections import defaultdict, OrderedDict

# defaultdict: automatically provides default values for missing keys
word_groups = defaultdict(list)
for word in ["apple", "avocado", "banana", "blueberry", "cherry"]:
    word_groups[word[0]].append(word)
print(dict(word_groups))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

counter = defaultdict(int)
for char in "mississippi":
    counter[char] += 1
print(dict(counter))
# {'m': 1, 'i': 4, 's': 4, 'p': 2}

nested = defaultdict(lambda: defaultdict(int))
nested["users"]["count"] += 1
print(nested["users"]["count"])  # 1

# OrderedDict: preserves insertion order (less needed since Python 3.7)
# Still useful for move_to_end() and equality comparisons that consider order
od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3
od.move_to_end("first")          # moves to end
od.move_to_end("third", last=False)  # moves to beginning

Iterating and Merging

config = {"host": "localhost", "port": 8080, "debug": True}

for key, value in config.items():
    print(f"{key} = {value}")

# Merge dictionaries (Python 3.9+)
defaults = {"host": "localhost", "port": 80, "debug": False}
overrides = {"port": 8080, "debug": True}
merged = defaults | overrides    # {'host': 'localhost', 'port': 8080, 'debug': True}
defaults |= overrides            # in-place merge
merged = {**defaults, **overrides}  # pre-3.9 alternative

5. Sets

Sets are unordered collections of unique, hashable elements. They provide O(1) membership testing and support mathematical set operations like union, intersection, and difference. Use sets whenever you need to eliminate duplicates or perform set algebra.

Creation and Operations

# Creating sets
colors = {"red", "green", "blue"}
from_list = set([1, 2, 2, 3, 3, 3])   # {1, 2, 3}
empty_set = set()                       # NOT {} (that creates a dict)

# Add and remove
colors.add("yellow")
colors.discard("red")      # no error if missing
colors.remove("green")     # raises KeyError if missing

# Set operations
a = {1, 2, 3, 4, 5}
b = {4, 5, 6, 7, 8}
print(a | b)    # {1, 2, 3, 4, 5, 6, 7, 8}  union
print(a & b)    # {4, 5}                       intersection
print(a - b)    # {1, 2, 3}                    difference
print(a ^ b)    # {1, 2, 3, 6, 7, 8}           symmetric difference
print({1, 2} <= {1, 2, 3})  # True (subset)

Practical Set Patterns

# Remove duplicates while preserving order
items = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
seen = set()
unique = [x for x in items if x not in seen and not seen.add(x)]
print(unique)  # [3, 1, 4, 5, 9, 2, 6]

# Find common elements between lists
common = set(["python", "java", "go"]) & set(["go", "rust", "python"])
print(common)  # {'python', 'go'}

# frozenset: immutable set (usable as dict key)
fs = frozenset([1, 2, 3])
d = {fs: "frozen set as key"}

# Set comprehension
evens = {x for x in range(20) if x % 2 == 0}

6. Strings as Sequences

Strings in Python are immutable sequences of Unicode characters. They support indexing, slicing, iteration, and membership testing just like lists and tuples, but they cannot be modified in place.

text = "Python Data Structures"

# Sequence operations work on strings
print(len(text))        # 22
print(text[0])          # "P"
print(text[7:11])       # "Data"
print(text[::-1])       # "serutcurtS ataD nohtyP"
print("Data" in text)   # True

# Strings are immutable -- create new strings instead
lower = text.lower()          # "python data structures"
replaced = text.replace("Python", "Java")  # "Java Data Structures"
parts = text.split(" ")       # ["Python", "Data", "Structures"]
joined = "-".join(parts)      # "Python-Data-Structures"

# String methods for data processing
data = "  Alice, 30, engineer  "
fields = [f.strip() for f in data.strip().split(",")]  # ["Alice", "30", "engineer"]

7. The Collections Module

The collections module provides specialized data structures that solve common programming problems more efficiently or elegantly than the built-in types.

Counter

from collections import Counter

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
word_count = Counter(words)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1})

print(word_count.most_common(2))  # [('apple', 3), ('banana', 2)]
print(word_count["apple"])         # 3
print(word_count["grape"])         # 0 (missing keys return 0)

char_count = Counter("mississippi")  # {'s': 4, 'i': 4, 'p': 2, 'm': 1}

# Arithmetic with Counters
a = Counter(["a", "b", "c", "a"])
b = Counter(["a", "b", "d"])
print(a + b)    # Counter({'a': 3, 'b': 2, 'c': 1, 'd': 1})
print(a - b)    # Counter({'a': 1, 'c': 1})

deque (Double-Ended Queue)

from collections import deque

# O(1) append and pop from both ends (list.insert(0,x) is O(n))
d = deque([1, 2, 3, 4, 5])
d.append(6)        # add to right
d.appendleft(0)    # add to left
d.pop()            # remove from right
d.popleft()        # remove from left
d.rotate(2)        # rotate right: deque([4, 5, 1, 2, 3])

# Fixed-size deque (automatically discards oldest items)
recent = deque(maxlen=3)
recent.append("a")   # deque(['a'])
recent.append("b")   # deque(['a', 'b'])
recent.append("c")   # deque(['a', 'b', 'c'])
recent.append("d")   # deque(['b', 'c', 'd']) -- 'a' is dropped

# Practical use: sliding window moving average
def moving_average(data, window_size):
    window = deque(maxlen=window_size)
    results = []
    for value in data:
        window.append(value)
        results.append(sum(window) / len(window))
    return results

ChainMap

from collections import ChainMap

# Combine multiple dicts into a single view (no copying)
defaults = {"color": "blue", "size": "medium", "font": "Arial"}
user_prefs = {"color": "red", "font": "Helvetica"}
cli_args = {"color": "green"}

config = ChainMap(cli_args, user_prefs, defaults)
print(config["color"])  # "green" (first dict wins)
print(config["size"])   # "medium" (falls through to defaults)

8. Arrays and the array Module

Python's array module provides a compact, type-constrained array for storing homogeneous numeric data. Unlike lists, which store Python objects, arrays store raw C-style values and use significantly less memory.

from array import array
import sys

# Create typed arrays (type code specifies the C type)
int_array = array("i", [1, 2, 3, 4, 5])    # signed int
float_array = array("d", [1.0, 2.5, 3.14]) # double float
# Type codes: 'b' char, 'i' int, 'I' unsigned int, 'f' float, 'd' double

# Operations work like lists
int_array.append(6)
print(int_array[0])   # 1

# Memory comparison: arrays use roughly half the memory of lists
py_list = list(range(1000))
py_array = array("i", range(1000))
print(sys.getsizeof(py_list))     # ~8856 bytes
print(sys.getsizeof(py_array))    # ~4064 bytes

# For serious numeric work, use NumPy instead (faster, more features)

9. Heapq and Priority Queues

The heapq module implements a min-heap on top of a regular list. It provides O(log n) push and pop operations and O(1) access to the smallest element. This is the standard way to implement priority queues in Python.

import heapq

# heapq operates on a regular list
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 1)
heapq.heappush(heap, 3)
print(heap[0])              # 1 (smallest element, O(1))
print(heapq.heappop(heap))  # 1 (removes and returns smallest, O(log n))

# Convert existing list to heap (in-place, O(n))
data = [5, 3, 8, 1, 9, 2]
heapq.heapify(data)

# Find n largest/smallest without full sort
numbers = [45, 12, 67, 3, 89, 23, 56, 78]
print(heapq.nlargest(3, numbers))    # [89, 78, 67]
print(heapq.nsmallest(3, numbers))   # [3, 12, 23]

# Priority queue with tuples (priority, data)
tasks = []
heapq.heappush(tasks, (3, "low priority"))
heapq.heappush(tasks, (1, "urgent"))
heapq.heappush(tasks, (2, "normal"))

while tasks:
    priority, task = heapq.heappop(tasks)
    print(f"[{priority}] {task}")
# [1] urgent -> [2] normal -> [3] low priority

# Max-heap trick: negate the values
max_heap = []
for val in [5, 1, 3]:
    heapq.heappush(max_heap, -val)
print(-heapq.heappop(max_heap))  # 5 (the largest)

10. Stacks and Queues

Python does not have dedicated stack or queue classes, but lists and deques implement these patterns efficiently.

Stack (LIFO: Last In, First Out)

# Stack using a list (append + pop are both O(1))
stack = []
stack.append("first")
stack.append("second")
stack.append("third")
print(stack.pop())    # "third"
print(stack.pop())    # "second"

# Practical example: balanced parentheses checker
def is_balanced(expression):
    pairs = {"(": ")", "[": "]", "{": "}"}
    stack = []
    for char in expression:
        if char in pairs:
            stack.append(char)
        elif char in pairs.values():
            if not stack or pairs[stack.pop()] != char:
                return False
    return len(stack) == 0

print(is_balanced("({[()]})"))  # True
print(is_balanced("({[})"))     # False

Queue (FIFO: First In, First Out)

from collections import deque

# Queue using deque (append + popleft are both O(1))
# WARNING: Do not use list.pop(0) for queues - it is O(n)
queue = deque()
queue.append("first")
queue.append("second")
queue.append("third")
print(queue.popleft())   # "first"
print(queue.popleft())   # "second"

# Thread-safe queue for multi-threaded programs
from queue import Queue, PriorityQueue

q = Queue(maxsize=10)
q.put("task1")
q.put("task2")
print(q.get())     # "task1"

pq = PriorityQueue()
pq.put((2, "normal"))
pq.put((1, "urgent"))
print(pq.get())    # (1, 'urgent')

11. When to Use Which Data Structure

Need	Use	Why
Ordered collection, frequent modifications	`list`	Mutable, O(1) append, indexed access
Fixed data that should not change	`tuple`	Immutable, hashable, less memory
Fast lookup by key	`dict`	O(1) average lookup, insertion, deletion
Unique elements, set operations	`set`	O(1) membership test, union/intersection
Queue with fast ends	`deque`	O(1) append/pop from both ends
Count occurrences	`Counter`	Built-in counting, most_common()
Default values for missing keys	`defaultdict`	Avoids KeyError, cleaner grouping code
Priority queue	`heapq`	O(log n) push/pop, O(1) min access
Struct-like records with named fields	`namedtuple`	Readable, immutable, memory-efficient
Compact numeric arrays	`array`	Half the memory of lists for numbers
Immutable set (for dict keys)	`frozenset`	Hashable set, usable as dict key
Layered config lookup	`ChainMap`	Search multiple dicts without merging

12. Performance Characteristics (Big-O)

Operation	list	tuple	dict	set	deque
Index access `[i]`	O(1)	O(1)	—	—	O(n)
Search (`in`)	O(n)	O(n)	O(1)	O(1)	O(n)
Append / add	O(1)*	—	O(1)	O(1)	O(1)
Insert at front	O(n)	—	—	—	O(1)
Pop from end	O(1)	—	—	—	O(1)
Pop from front	O(n)	—	—	—	O(1)
Delete by value	O(n)	—	O(1)	O(1)	O(n)
Sort	O(n log n)	—	—	—	—
Get min / max	O(n)	O(n)	O(n)	O(n)	O(n)
Length	O(1)	O(1)	O(1)	O(1)	O(1)

* Amortized O(1). Occasionally O(n) when the internal array needs to be resized.

# Performance difference: list search O(n) vs set search O(1)
import time

data_list = list(range(1_000_000))
data_set = set(range(1_000_000))

start = time.perf_counter()
999_999 in data_list                          # O(n): ~0.015s
print(f"List: {time.perf_counter() - start:.6f}s")

start = time.perf_counter()
999_999 in data_set                           # O(1): ~0.000001s
print(f"Set:  {time.perf_counter() - start:.6f}s")
# Set lookup is roughly 10,000x faster for 1M elements

13. Common Patterns and Best Practices

Grouping Data

from collections import defaultdict

# Group items by a key
students = [
    ("Alice", "Math"), ("Bob", "Science"), ("Charlie", "Math"),
    ("Diana", "Science"), ("Eve", "Math"), ("Frank", "Art")
]

by_subject = defaultdict(list)
for name, subject in students:
    by_subject[subject].append(name)

print(dict(by_subject))
# {'Math': ['Alice', 'Charlie', 'Eve'],
#  'Science': ['Bob', 'Diana'],
#  'Art': ['Frank']}

Caching with Dictionaries

# Manual memoization
cache = {}

def fibonacci(n):
    if n in cache:
        return cache[n]
    if n < 2:
        return n
    result = fibonacci(n - 1) + fibonacci(n - 2)
    cache[n] = result
    return result

print(fibonacci(50))  # 12586269025 (instant with cache)

# Better: use functools.lru_cache
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))  # 354224848179261915075

Flattening and Transforming

# Flatten nested structures
from itertools import chain
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))   # [1, 2, 3, 4, 5, 6]

# zip for parallel iteration
names = ["Alice", "Bob", "Charlie"]
ages = [30, 25, 35]
for name, age in zip(names, ages):
    print(f"{name}, {age}")

# Transpose rows to columns
rows = [(1, "a"), (2, "b"), (3, "c")]
numbers, letters = zip(*rows)
print(numbers)   # (1, 2, 3)
print(letters)   # ('a', 'b', 'c')

Avoiding Common Mistakes

# Mistake 1: Mutable default arguments
def add_item(item, items=None):   # use None, not []
    if items is None:
        items = []
    items.append(item)
    return items

# Mistake 2: Modifying a list while iterating
nums = [1, 2, 3, 4, 5]
# BAD: for n in nums: nums.remove(n)  # skips elements!
nums = [n for n in nums if n % 2 != 0]  # GOOD: create new list

# Mistake 3: Using list for membership testing
allowed = {"admin", "editor", "viewer"}   # use set, not list
# set lookup is O(1) vs list's O(n)

# Mistake 4: Copying references instead of values
a = [1, 2, 3]
b = a.copy()   # shallow copy (or list(a) or a[:])
b.append(4)
print(a)       # [1, 2, 3] (unchanged)

# For nested structures, use deepcopy
import copy
nested = [[1, 2], [3, 4]]
deep = copy.deepcopy(nested)

Frequently Asked Questions

What is the difference between a list and a tuple in Python?

Lists and tuples are both ordered sequences, but they differ in mutability. Lists are mutable: you can add, remove, and change elements after creation using methods like append(), remove(), and index assignment. Tuples are immutable: once created, their contents cannot be changed. Use lists when your data needs to change (a shopping cart, a queue of tasks) and tuples when it should remain constant (coordinates, RGB colors, database records). Tuples are slightly faster than lists, use less memory, and can be used as dictionary keys, while lists cannot.

When should I use a dictionary vs a list in Python?

Use a list when you have an ordered collection of items that you access by position (index) or iterate through sequentially. Use a dictionary when you need to look up values by a meaningful key, such as looking up a user by their username or a configuration value by its name. Dictionary lookups are O(1) on average regardless of size, while searching a list for a value is O(n). If you find yourself searching through a list repeatedly to find items, a dictionary is almost always the better choice.

What is the collections module in Python?

The collections module provides specialized container datatypes that extend Python's built-in data structures. Key classes include Counter for counting hashable objects, deque for double-ended queues with O(1) append and pop from both ends, defaultdict for dictionaries with automatic default values for missing keys, OrderedDict for dictionaries that remember insertion order, namedtuple for lightweight immutable objects with named fields, and ChainMap for combining multiple dictionaries into a single view. These are standard library tools that do not require any external installation.

How do I choose the right data structure in Python?

Consider four factors: the operations you need, ordering requirements, mutability, and uniqueness constraints. If you need fast lookups by key, use a dictionary. If you need an ordered sequence you will modify, use a list. If you need an ordered sequence that should not change, use a tuple. If you need to eliminate duplicates or perform set operations like union and intersection, use a set. If you need a queue with fast operations on both ends, use a deque. If you need a priority queue, use heapq. The right choice often comes down to which operations dominate your code and their Big-O performance characteristics.

What is the time complexity of common Python data structure operations?

For lists: indexing is O(1), append is amortized O(1), insert at position is O(n), search (in operator) is O(n), and sort is O(n log n). For dictionaries: get, set, and delete are all O(1) average case. For sets: add, remove, and membership testing (in operator) are all O(1) average case. For deques: append and pop from both ends are O(1), but indexing by position is O(n). For heaps (via heapq): push and pop are O(log n), and finding the minimum is O(1). Understanding these complexities helps you choose the right structure and avoid performance pitfalls in your code.

Conclusion

Python's data structures are the foundation of everything you build. Lists handle sequential data, dictionaries power lookups, sets eliminate duplicates, tuples provide immutable records, and the collections module fills the gaps with Counter, deque, defaultdict, and namedtuple. Choose your data structure based on the operations you perform most frequently. The Big-O table in section 12 should be your reference whenever you are uncertain.

⚙ Keep learning: Master the fundamentals in our Python Beginners Guide, dive into data analysis with the Pandas Complete Guide, and reference syntax quickly with our Python Cheat Sheet.

Related Resources

Python for Beginners: The Complete Guide — start here if you are new to Python
Python Pandas: The Complete Data Analysis Guide — use DataFrames and Series for tabular data
Python Cheat Sheet — quick reference for syntax and built-in functions
JSON: The Complete Guide — work with JSON data in Python dictionaries

Python Data Structures: The Complete Guide for 2026

Table of Contents

1. Overview of Python Data Structures

2. Lists

Creation and Basic Operations

List Methods

List Comprehensions

Slicing

3. Tuples

Creation and Use Cases

Named Tuples

4. Dictionaries

Creation and Access

Dictionary Comprehensions

defaultdict and OrderedDict

Iterating and Merging

5. Sets

Creation and Operations

Practical Set Patterns

6. Strings as Sequences

7. The Collections Module

Counter

deque (Double-Ended Queue)

ChainMap

8. Arrays and the array Module

9. Heapq and Priority Queues

10. Stacks and Queues

Stack (LIFO: Last In, First Out)

Queue (FIFO: First In, First Out)

11. When to Use Which Data Structure

12. Performance Characteristics (Big-O)

13. Common Patterns and Best Practices

Grouping Data

Caching with Dictionaries

Flattening and Transforming

Avoiding Common Mistakes

Frequently Asked Questions

Conclusion

Related Resources

Related Resources