XML: The Complete Developer's Guide for 2026
Table of Contents
- 1. What Is XML and Why It Still Matters
- 2. XML Syntax and Structure
- 3. Elements, Attributes, and Namespaces
- 4. DTD and XML Schema (XSD) Validation
- 5. XPath and XQuery
- 6. XSLT Transformations
- 7. XML vs JSON Comparison
- 8. XML in Modern Development
- 9. Parsing XML in JavaScript, Python, Go, and Java
- 10. XML Security
- 11. XML Tools and Libraries
- 12. Common XML Patterns and Best Practices
- 13. XML in APIs: SOAP vs REST
- 14. Frequently Asked Questions
1. What Is XML and Why It Still Matters
XML (eXtensible Markup Language) is a markup language designed to store and transport data in a format that is both human-readable and machine-parseable. Created by the World Wide Web Consortium (W3C) in 1998, XML was designed to be a simplified subset of SGML (Standard Generalized Markup Language) that could be used on the web.
Despite the rise of JSON as the dominant data interchange format for web APIs, XML remains deeply embedded in the fabric of modern software development. If you have worked with any of the following, you have used XML:
- HTML and XHTML: XHTML is HTML written as XML; HTML itself is a close cousin that shares XML's SGML ancestry
- SVG graphics: Scalable Vector Graphics is an XML-based image format
- RSS and Atom feeds: Blog and news syndication relies on XML
- Android layouts: Android UI definitions use XML
- Maven: The Java build tool is configured through pom.xml (Gradle, by contrast, uses a Groovy or Kotlin DSL)
- .NET configuration: Web.config and App.config are XML
- Microsoft Office formats: DOCX, XLSX, and PPTX are ZIP archives containing XML
- SOAP web services: Enterprise APIs built on XML messaging
- Sitemaps: Search engines consume XML sitemaps for crawling
XML matters in 2026 because it solves problems that simpler formats cannot. Its support for schemas, namespaces, and mixed content (combining structured data with free-form text) makes it the right tool for document-oriented data, enterprise integration, and any context where strict validation is required. While you might choose JSON for a REST API, you would reach for XML when you need to define a document format, validate data against a complex schema, or interoperate with legacy enterprise systems.
The key advantages of XML include:
- Self-describing: Tags provide meaning alongside the data
- Extensible: You can define your own tags and structure
- Validatable: DTD and XSD enforce data structure and types
- Transformable: XSLT can convert XML into HTML, plain text, or other XML formats (and into PDF via XSL-FO)
- Queryable: XPath and XQuery provide powerful data extraction
- Namespace-aware: Multiple vocabularies can coexist in one document
- Platform-independent: Supported by every major programming language
2. XML Syntax and Structure
Every XML document follows a well-defined structure. Understanding the syntax rules is essential because XML parsers are strict: a single syntax error means the document is not well-formed, and a conforming parser will refuse to process it.
The XML Declaration
An XML document typically begins with a declaration that specifies the version and encoding:
<?xml version="1.0" encoding="UTF-8"?>
This declaration is optional but recommended. The version attribute is required within the declaration (always "1.0" for practical purposes), and encoding specifies the character set. When no encoding is declared and there is no byte-order mark, parsers assume UTF-8, which is also the most common encoding.
Document Structure
Every XML document must have exactly one root element that contains all other elements:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="fiction">
<title lang="en">The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<year>1925</year>
<price>10.99</price>
</book>
<book category="non-fiction">
<title lang="en">Thinking, Fast and Slow</title>
<author>Daniel Kahneman</author>
<year>2011</year>
<price>14.99</price>
</book>
</bookstore>
Fundamental Syntax Rules
XML has strict syntax rules that must be followed for a document to be considered well-formed:
- All elements must be closed: either with a closing tag, as in <title>...</title>, or with self-closing syntax such as <br /> (unlike HTML, where void elements like <br> have no closing tag)
- Tags are case-sensitive: <Title> and <title> are different elements
- Elements must be properly nested: <b><i>text</i></b> is correct; <b><i>text</b></i> is not
- Attribute values must be quoted: <book category="fiction"> (single or double quotes)
- Special characters must be escaped: Use entity references for reserved characters
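As a rough illustration of how strictly these rules are enforced, the sketch below feeds a few snippets to Python's standard-library xml.etree.ElementTree parser. The helper name is_well_formed and the sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "properly nested": "<b><i>text</i></b>",
    "badly nested": "<b><i>text</b></i>",
    "case mismatch": "<Title>XML</title>",
    "unquoted attribute": "<book category=fiction></book>",
    "unescaped ampersand": "<name>Tom & Jerry</name>",
    "two root elements": "<a></a><b></b>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted; each of the others breaks one of the rules listed above.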
Entity References
XML defines five predefined entity references for characters that have special meaning:
| Character | Entity Reference | Description |
|---|---|---|
| < | &lt; | Less than |
| > | &gt; | Greater than |
| & | &amp; | Ampersand |
| ' | &apos; | Apostrophe |
| " | &quot; | Quotation mark |
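In practice you rarely write these references by hand. As a small sketch, Python's standard xml.sax.saxutils module provides helpers for escaping and unescaping; the sample strings below are invented for illustration.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

raw = 'if (a < b && b > 0) print("done")'

# escape() replaces &, < and > by default.
escaped = escape(raw)
print(escaped)

# Extra replacements can be supplied when quotes must be escaped too.
fully_escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(fully_escaped)

# quoteattr() escapes a value and wraps it in suitable quotes for an attribute.
attr = quoteattr('Tom & "Jerry"')
print(attr)

# unescape() reverses the &amp;, &lt; and &gt; replacements.
restored = unescape("a &lt; b &amp;&amp; b &gt; 0")
print(restored)
```

XML libraries that build documents for you (such as ElementTree) apply this escaping automatically when serializing.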
CDATA Sections
When you need to include large blocks of text with special characters (such as code snippets), use a CDATA section to avoid escaping everything:
<script>
<![CDATA[
function compare(a, b) {
if (a < b && b > 0) {
return a & b;
}
}
]]>
</script>
Inside a CDATA section, everything is treated as plain text. The only sequence you cannot include is ]]>, which marks the end of the section.
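To a parser, a CDATA section is only a different spelling of the same text. The following sketch, using Python's standard ElementTree with made-up sample strings, shows that the CDATA form and the escaped form yield identical content.

```python
import xml.etree.ElementTree as ET

cdata_doc = "<script><![CDATA[if (a < b && b > 0) { return a & b; }]]></script>"
escaped_doc = (
    "<script>if (a &lt; b &amp;&amp; b &gt; 0) { return a &amp; b; }</script>"
)

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

print(cdata_text)
print(cdata_text == escaped_text)
```

Because the two forms are equivalent after parsing, applications should not rely on whether a value arrived inside CDATA.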
Comments and Processing Instructions
XML supports comments (ignored by parsers) and processing instructions (directives to applications):
<!-- This is a comment. It cannot contain double hyphens (--). -->
<?xml-stylesheet type="text/xsl" href="transform.xsl"?>
<?app-config debug="true"?>
3. Elements, Attributes, and Namespaces
Understanding the distinction between elements and attributes, and when to use each, is one of the most important design decisions when creating XML structures.
Elements
Elements are the primary building blocks of XML. They can contain text, other elements, or a mix of both:
<!-- Element with text content -->
<name>Alice Johnson</name>
<!-- Element with child elements -->
<address>
<street>123 Main St</street>
<city>Austin</city>
<state>TX</state>
<zip>78701</zip>
</address>
<!-- Empty element (self-closing) -->
<linebreak />
<!-- Mixed content: text and elements interleaved -->
<paragraph>This is <bold>important</bold> text with <link href="/more">a link</link>.</paragraph>
Attributes
Attributes provide additional information about an element. They appear in the opening tag as name-value pairs:
<book isbn="978-0-13-468599-1" format="hardcover" edition="3">
<title>The Pragmatic Programmer</title>
</book>
Elements vs Attributes: When to Use Each
This is a perennial XML design question. Here are guidelines:
- Use elements for: Data that can have sub-structure, data that may repeat, data that is the primary content, and data that other applications need to process
- Use attributes for: Metadata about the element, identifiers (IDs), data types, rendering hints, and values that are always simple strings
<!-- Attribute approach -->
<person name="Alice" age="30" city="Austin" />
<!-- Element approach (usually preferred for data) -->
<person>
<name>Alice</name>
<age>30</age>
<city>Austin</city>
</person>
The element approach is generally more flexible because elements can later be extended with sub-elements, while attributes cannot contain structure.
Namespaces
XML namespaces solve the problem of name collisions when combining XML vocabularies from different sources. Without namespaces, a <title> element in a book catalog would conflict with a <title> element in an HTML document.
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns:book="http://example.com/book"
xmlns:cd="http://example.com/cd">
<book:item>
<book:title>XML Developer's Guide</book:title>
<book:author>John Doe</book:author>
<book:price>44.95</book:price>
</book:item>
<cd:item>
<cd:title>Greatest Hits</cd:title>
<cd:artist>The Beatles</cd:artist>
<cd:price>19.99</cd:price>
</cd:item>
</catalog>
The namespace URI (like http://example.com/book) does not need to point to an actual web resource. It is simply a unique identifier. The prefix (book:, cd:) is a local alias for the namespace URI.
Default Namespaces
You can declare a default namespace to avoid prefixing every element:
<bookstore xmlns="http://example.com/bookstore">
<book>
<title>Clean Code</title>
<author>Robert C. Martin</author>
</book>
</bookstore>
All elements without a prefix inherit the default namespace. Attributes, however, never inherit the default namespace and must be explicitly prefixed if they belong to a namespace.
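The sketch below shows how these rules look from code, using Python's standard ElementTree. The prefixes in the ns mapping and the id attribute on the second sample are assumptions made for the example; only the namespace URIs have to match the document.

```python
import xml.etree.ElementTree as ET

catalog_xml = """
<catalog xmlns:book="http://example.com/book"
         xmlns:cd="http://example.com/cd">
  <book:item><book:title>XML Developer's Guide</book:title></book:item>
  <cd:item><cd:title>Greatest Hits</cd:title></cd:item>
</catalog>
"""

# The prefixes used in queries are local to this mapping; only the URIs matter.
ns = {"b": "http://example.com/book", "c": "http://example.com/cd"}

root = ET.fromstring(catalog_xml)
book_titles = [t.text for t in root.findall("b:item/b:title", ns)]
cd_titles = [t.text for t in root.findall("c:item/c:title", ns)]
print(book_titles, cd_titles)

# Internally each tag is stored as {namespace-uri}local-name.
print(root[0].tag)

# A default namespace applies to unprefixed elements but not to attributes.
store = ET.fromstring(
    '<bookstore xmlns="http://example.com/bookstore"><book id="b1"/></bookstore>'
)
book = store[0]
print(book.tag)     # qualified with the default namespace
print(book.attrib)  # the attribute key stays unqualified
```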
4. DTD and XML Schema (XSD) Validation
One of XML's greatest strengths is the ability to validate documents against formal definitions. This ensures that data conforms to an expected structure before processing it.
Document Type Definition (DTD)
DTD is the original XML validation mechanism. It uses a non-XML syntax to define allowed elements, attributes, and their relationships:
<!-- External DTD reference -->
<!DOCTYPE bookstore SYSTEM "bookstore.dtd">
<!-- bookstore.dtd -->
<!ELEMENT bookstore (book+)>
<!ELEMENT book (title, author, year, price)>
<!ATTLIST book category CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title lang CDATA #IMPLIED>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>
DTD element content models use these symbols:
- #PCDATA - Parsed character data (text content)
- + - One or more occurrences
- * - Zero or more occurrences
- ? - Zero or one occurrence
- , - Sequence (elements in this order)
- | - Choice (one element or the other)
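DTD attribute declarations combine a type with a default declaration. Attribute types include:
- CDATA - Character data (any string)
- ID - Unique identifier
- IDREF - Reference to an ID
- (val1|val2|val3) - Enumeration
Default declarations include:
- #REQUIRED - Attribute is mandatory
- #IMPLIED - Attribute is optional
- #FIXED "value" - Attribute has a fixed value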
XML Schema (XSD)
XML Schema Definition (XSD) is the modern, more powerful alternative to DTD. It is itself written in XML and supports data types, namespaces, and complex constraints:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="bookstore">
<xs:complexType>
<xs:sequence>
<xs:element name="book" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string" />
<xs:element name="author" type="xs:string" />
<xs:element name="year" type="xs:gYear" />
<xs:element name="price">
<xs:simpleType>
<xs:restriction base="xs:decimal">
<xs:minInclusive value="0" />
<xs:maxInclusive value="9999.99" />
<xs:fractionDigits value="2" />
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
<xs:attribute name="category" use="required">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="fiction" />
<xs:enumeration value="non-fiction" />
<xs:enumeration value="reference" />
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XSD advantages over DTD include:
- Data types: xs:string, xs:integer, xs:decimal, xs:date, xs:boolean, and 40+ built-in types
- Custom types: Define reusable simple and complex types
- Facets: Constrain values with patterns, ranges, lengths, and enumerations
- Namespace support: Full namespace awareness
- XML syntax: Schemas are themselves valid XML documents
- Inheritance: Types can extend or restrict other types
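To make the facets on the price element concrete, here is a hand-written Python check that mirrors the three restrictions from the schema above (minInclusive, maxInclusive, fractionDigits). It is not an XSD validator, since the Python standard library does not include one, and the function name price_is_valid is invented for this sketch.

```python
import re
from decimal import Decimal

# Lexical form of xs:decimal: optional sign, digits, optional fraction.
# Exponent notation, NaN and Infinity are not allowed.
_DECIMAL_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def price_is_valid(text: str) -> bool:
    """Mirror the minInclusive, maxInclusive and fractionDigits facets by hand."""
    text = text.strip()
    if not _DECIMAL_RE.fullmatch(text):
        return False
    value = Decimal(text)
    if not (Decimal("0") <= value <= Decimal("9999.99")):
        return False
    # fractionDigits constrains the value rather than its spelling,
    # so trailing zeros are dropped before counting digits.
    exponent = value.normalize().as_tuple().exponent
    return -exponent <= 2


for candidate in ["10.99", "9999.99", "10000", "-0.01", "10.999", "1e3"]:
    print(candidate, price_is_valid(candidate))
```

A real validator applies the same kind of checks to every typed element and attribute, driven by the schema instead of by hard-coded rules.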
5. XPath and XQuery
XPath and XQuery are query languages that let you extract and manipulate data from XML documents, similar to how SQL works for relational databases.
XPath Basics
XPath uses path expressions to navigate XML document trees. Think of it as file-system-style paths for XML nodes:
<!-- Sample XML -->
<bookstore>
<book category="fiction">
<title lang="en">The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<price>10.99</price>
</book>
<book category="non-fiction">
<title lang="en">Thinking, Fast and Slow</title>
<author>Daniel Kahneman</author>
<price>14.99</price>
</book>
<book category="fiction">
<title lang="fr">L'Etranger</title>
<author>Albert Camus</author>
<price>8.99</price>
</book>
</bookstore>
Common XPath expressions:
/bookstore/book <!-- All book elements under bookstore -->
/bookstore/book/title <!-- All title elements -->
/bookstore/book[1] <!-- First book element -->
/bookstore/book[last()] <!-- Last book element -->
/bookstore/book[@category] <!-- Books with a category attribute -->
/bookstore/book[@category='fiction'] <!-- Fiction books only -->
//title <!-- All title elements anywhere -->
//book[price>10] <!-- Books with price greater than 10 -->
//title[@lang='en'] <!-- English titles -->
//book/title/text() <!-- Text content of all titles -->
/bookstore/book[price>10]/title <!-- Titles of books costing more than 10 -->
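Several of these expressions can be tried directly with Python's standard ElementTree, which implements a limited subset of XPath. The sketch below uses a trimmed copy of the sample document; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring("""
<bookstore>
  <book category="fiction">
    <title lang="en">The Great Gatsby</title><price>10.99</price>
  </book>
  <book category="non-fiction">
    <title lang="en">Thinking, Fast and Slow</title><price>14.99</price>
  </book>
  <book category="fiction">
    <title lang="fr">L'Etranger</title><price>8.99</price>
  </book>
</bookstore>
""")

# Paths are relative to the element searched from (here the <bookstore> root).
first = bookstore.find("book[1]/title").text
last = bookstore.find("book[last()]/title").text
fiction = [
    b.findtext("title") for b in bookstore.findall("book[@category='fiction']")
]
english = [t.text for t in bookstore.findall(".//title[@lang='en']")]

# Numeric comparisons such as //book[price>10] are outside ElementTree's
# XPath subset, so that filter is written in Python instead.
over_ten = [
    b.findtext("title")
    for b in bookstore.iter("book")
    if float(b.findtext("price")) > 10
]

print(first, last, fiction, english, over_ten, sep="\n")
```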
XPath Axes
Axes define relationships between nodes, allowing you to navigate in any direction through the XML tree:
child::book <!-- Child elements named "book" (default axis) -->
parent::* <!-- Parent element -->
ancestor::bookstore <!-- Ancestor named "bookstore" -->
descendant::title <!-- All descendant title elements -->
following-sibling::book <!-- Following sibling book elements -->
preceding-sibling::book <!-- Preceding sibling book elements -->
attribute::category <!-- Category attribute (same as @category) -->
self::book <!-- Current node if it's a book element -->
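Not every library implements axes. Python's standard ElementTree, for example, has no parent pointers, so a common workaround is to build a child-to-parent lookup once and derive the other relationships from it. The sketch below emulates a few axes that way; the id attributes are invented so the results are easy to read.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    "<bookstore>"
    "<book id='b1'><title>A</title></book>"
    "<book id='b2'><title>B</title></book>"
    "<book id='b3'><title>C</title></book>"
    "</bookstore>"
)

# ElementTree elements do not know their parent, so build a lookup once.
parent_of = {child: parent for parent in bookstore.iter() for child in parent}

title_b = bookstore.find("book[@id='b2']/title")
book_b = parent_of[title_b]  # parent::*

siblings = list(parent_of[book_b])  # every child of <bookstore>
position = siblings.index(book_b)
following = [b.get("id") for b in siblings[position + 1:]]  # following-sibling::book
preceding = [b.get("id") for b in siblings[:position]]      # preceding-sibling::book

# ancestor::* : walk upward until the root is reached.
ancestors = []
node = title_b
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)

print(book_b.get("id"), following, preceding, ancestors)
```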
XPath Functions
XPath includes functions for strings, numbers, booleans, and node sets:
<!-- String functions -->
//book[contains(title, 'Great')] <!-- Titles containing "Great" -->
//book[starts-with(title, 'The')] <!-- Titles starting with "The" -->
string-length(//book[1]/title) <!-- Length of first title -->
<!-- Numeric functions -->
sum(//book/price) <!-- Sum of all prices -->
count(//book) <!-- Number of book elements -->
<!-- Boolean functions -->
//book[not(@category='fiction')] <!-- Non-fiction books -->
//book[price>10 and @category='fiction'] <!-- Expensive fiction -->
XQuery
XQuery extends XPath with FLWOR expressions (For, Let, Where, Order by, Return), enabling SQL-like queries on XML:
xquery version "3.1";
(: List all fiction books sorted by price :)
for $book in /bookstore/book
where $book/@category = "fiction"
order by number($book/price) descending
return
<result>
<title>{$book/title/text()}</title>
<price>{$book/price/text()}</price>
</result>
(: Calculate average price by category :)
for $cat in distinct-values(/bookstore/book/@category)
let $books := /bookstore/book[@category = $cat]
return
<category name="{$cat}">
<count>{count($books)}</count>
<avgPrice>{avg($books/price)}</avgPrice>
</category>
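For readers more familiar with general-purpose languages, the two FLWOR queries above correspond roughly to the following Python, written against the three-book sample from the XPath section. It is only an analogy to show what each clause does; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from statistics import mean

bookstore = ET.fromstring("""
<bookstore>
  <book category="fiction"><title>The Great Gatsby</title><price>10.99</price></book>
  <book category="non-fiction"><title>Thinking, Fast and Slow</title><price>14.99</price></book>
  <book category="fiction"><title>L'Etranger</title><price>8.99</price></book>
</bookstore>
""")

# for / where / order by / return: fiction books, most expensive first.
fiction_by_price = sorted(
    (
        (b.findtext("title"), float(b.findtext("price")))
        for b in bookstore.findall("book")
        if b.get("category") == "fiction"
    ),
    key=lambda pair: pair[1],
    reverse=True,
)

# for / let / return: book count and average price per category.
prices_by_category = defaultdict(list)
for b in bookstore.findall("book"):
    prices_by_category[b.get("category")].append(float(b.findtext("price")))

summary = {
    cat: {"count": len(prices), "avgPrice": round(mean(prices), 2)}
    for cat, prices in prices_by_category.items()
}

print(fiction_by_price)
print(summary)
```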
6. XSLT Transformations
XSLT (eXtensible Stylesheet Language Transformations) is a language for transforming XML documents into other formats: HTML, plain text, other XML structures, or even PDF (by generating XSL-FO, which a separate formatter renders). It is one of XML's most powerful features.
Basic XSLT Stylesheet
An XSLT stylesheet is itself an XML document that defines template rules for matching and transforming XML nodes:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="yes" />
<!-- Match the root element -->
<xsl:template match="/bookstore">
<html>
<body>
<h1>Book Catalog</h1>
<table border="1">
<tr>
<th>Title</th>
<th>Author</th>
<th>Price</th>
</tr>
<xsl:apply-templates select="book" />
</table>
</body>
</html>
</xsl:template>
<!-- Template for each book element -->
<xsl:template match="book">
<tr>
<td><xsl:value-of select="title" /></td>
<td><xsl:value-of select="author" /></td>
<td>$<xsl:value-of select="price" /></td>
</tr>
</xsl:template>
</xsl:stylesheet>
Conditional Logic and Loops
XSLT supports conditional processing and iteration:
<!-- Conditional: xsl:if -->
<xsl:template match="book">
<xsl:if test="price > 10">
<p class="expensive">
<xsl:value-of select="title" /> - $<xsl:value-of select="price" />
</p>
</xsl:if>
</xsl:template>
<!-- Multiple conditions: xsl:choose -->
<xsl:template match="book">
<xsl:choose>
<xsl:when test="price &lt; 10">
<span class="budget">Budget</span>
</xsl:when>
<xsl:when test="price &lt; 20">
<span class="mid">Mid-range</span>
</xsl:when>
<xsl:otherwise>
<span class="premium">Premium</span>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- Sorting -->
<xsl:for-each select="book">
<xsl:sort select="price" data-type="number" order="ascending" />
<p><xsl:value-of select="title" />: $<xsl:value-of select="price" /></p>
</xsl:for-each>
Real-World XSLT Use Cases
- XML to HTML: Rendering XML data as web pages
- XML to XML: Converting between different XML schemas (e.g., data migration)
- XML to CSV/Text: Exporting data in flat formats
- Report generation: Converting XML data to formatted reports via XSL-FO (PDF)
- RSS feed rendering: A feed can reference an XSLT stylesheet so that browsers display it as a readable page
7. XML vs JSON Comparison
The XML vs JSON debate is one of the most common in web development. Both are valid choices depending on your use case. Here is a detailed comparison:
| Feature | XML | JSON |
|---|---|---|
| Syntax | Verbose tags: <name>Alice</name> | Concise: {"name": "Alice"} |
| Data Types | Everything is text (schema adds types) | Native: string, number, boolean, null, array, object |
| Attributes | Supported: <book isbn="123"> | No concept of attributes |
| Namespaces | Full support for mixing vocabularies | Not supported |
| Comments | Supported: <!-- comment --> | Not supported in standard JSON |
| Schema Validation | DTD, XSD, RELAX NG, Schematron | JSON Schema |
| Querying | XPath, XQuery (mature, powerful) | JSONPath, jq (simpler) |
| Transformation | XSLT (declarative, powerful) | No standard equivalent |
| Mixed Content | Supported: text + elements interleaved | Not supported |
| File Size | Larger (repeated opening/closing tags) | Smaller (often 30-50% less than the equivalent XML, depending on the data) |
| Parsing Speed | Slower (complex parsing required) | Faster (simpler grammar) |
| Browser Support | DOMParser, XMLSerializer | Native JSON.parse/JSON.stringify |
| Dominant Use | Documents, enterprise, config, SOAP | Web APIs, config, data interchange |
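The Data Types and Attributes rows become tangible when XML is mapped onto JSON. The sketch below is one possible convention, not a standard: the element_to_dict helper and its "@" and "#text" keys are assumptions made for this example, and it deliberately ignores mixed content, which has no natural JSON counterpart.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element to plain Python data (attributes get an '@' prefix)."""
    node = {f"@{k}": v for k, v in elem.attrib.items()}
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        # A leaf without attributes collapses to its text.
        return text if not node else {**node, "#text": text}
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tags become a list, JSON's closest equivalent.
            existing = node[child.tag]
            if not isinstance(existing, list):
                node[child.tag] = [existing]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


book = ET.fromstring(
    '<book isbn="123"><title>Clean Code</title><price>10.99</price></book>'
)
as_dict = element_to_dict(book)
print(json.dumps(as_dict, indent=2))
```

Note that the price comes out as the string "10.99": without a schema, XML carries no type information, so any conversion to numbers has to be done explicitly.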
When to Choose XML
- Document-oriented data with mixed content (text + markup)
- Enterprise integration requiring strict schema validation
- Systems using SOAP web services
- Data that needs XSLT transformations
- Formats requiring namespace disambiguation
- Industry standards that mandate XML (HL7 in healthcare, XBRL in finance)
When to Choose JSON
- REST APIs and web application data exchange
- Configuration files for modern tools
- Data-oriented content without mixed text
- JavaScript-heavy applications
- Mobile app backends
- Situations where bandwidth and performance matter
8. XML in Modern Development
XML is not a relic of the past. It powers critical systems across virtually every domain of software development in 2026.
Configuration Files
Many frameworks and tools rely on XML configuration:
<!-- Maven pom.xml -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
<modelVersion>4.0.0</modelVersion>
<groupId>com.example</groupId>
<artifactId>my-app</artifactId>
<version>1.0.0</version>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>3.2.0</version>
</dependency>
</dependencies>
</project>
<!-- .NET App.config -->
<configuration>
<appSettings>
<add key="DatabaseConnection" value="Server=localhost;Database=mydb" />
<add key="MaxRetries" value="3" />
</appSettings>
</configuration>
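Reading such a file from code usually takes only a few lines. As a sketch, the Python below loads the appSettings entries of the App.config example into a dictionary using the standard ElementTree; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

app_config = """
<configuration>
  <appSettings>
    <add key="DatabaseConnection" value="Server=localhost;Database=mydb" />
    <add key="MaxRetries" value="3" />
  </appSettings>
</configuration>
"""

root = ET.fromstring(app_config)
settings = {
    entry.get("key"): entry.get("value")
    for entry in root.findall("appSettings/add")
}

# Attribute values are always strings; convert explicitly where needed.
max_retries = int(settings["MaxRetries"])

print(settings)
print(max_retries)
```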
SOAP Web Services
SOAP (Simple Object Access Protocol) uses XML for its message format and WSDL (Web Services Description Language) for service definitions:
<!-- SOAP Request -->
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:ws="http://example.com/webservice">
<soap:Header>
<ws:Authentication>
<ws:Token>abc123</ws:Token>
</ws:Authentication>
</soap:Header>
<soap:Body>
<ws:GetUserRequest>
<ws:UserId>42</ws:UserId>
</ws:GetUserRequest>
</soap:Body>
</soap:Envelope>
<!-- SOAP Response -->
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
xmlns:ws="http://example.com/webservice">
<soap:Body>
<ws:GetUserResponse>
<ws:User>
<ws:Id>42</ws:Id>
<ws:Name>Alice Johnson</ws:Name>
<ws:Email>alice@example.com</ws:Email>
</ws:User>
</ws:GetUserResponse>
</soap:Body>
</soap:Envelope>
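On the client side, handling a response like this mostly comes down to namespace-aware lookups. The following sketch extracts the user from the response above with Python's standard ElementTree. A real client would normally use a SOAP library driven by the WSDL; the user_record layout here is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

soap_response = """
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:ws="http://example.com/webservice">
  <soap:Body>
    <ws:GetUserResponse>
      <ws:User>
        <ws:Id>42</ws:Id>
        <ws:Name>Alice Johnson</ws:Name>
        <ws:Email>alice@example.com</ws:Email>
      </ws:User>
    </ws:GetUserResponse>
  </soap:Body>
</soap:Envelope>
"""

NS = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "ws": "http://example.com/webservice",
}

envelope = ET.fromstring(soap_response)
user = envelope.find("soap:Body/ws:GetUserResponse/ws:User", NS)
user_record = {
    "id": int(user.findtext("ws:Id", namespaces=NS)),
    "name": user.findtext("ws:Name", namespaces=NS),
    "email": user.findtext("ws:Email", namespaces=NS),
}

print(user_record)
```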
RSS and Atom Feeds
Blog and news syndication uses XML-based RSS and Atom formats:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>DevToolbox Blog</title>
<link>https://devtoolbox.dedyn.io/blog</link>
<description>Developer tools and tutorials</description>
<item>
<title>XML: The Complete Developer's Guide</title>
<link>https://devtoolbox.dedyn.io/blog/xml-complete-guide</link>
<description>Master XML with this comprehensive guide.</description>
<pubDate>Wed, 11 Feb 2026 00:00:00 GMT</pubDate>
</item>
</channel>
</rss>
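Consuming a feed is a matter of walking the channel's item elements. The sketch below parses a shortened version of the feed above with Python's standard library; the pubDate value follows the RFC 822 date format, which email.utils can convert to a datetime. The dictionary layout is an assumption for this example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

feed = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>DevToolbox Blog</title>
    <item>
      <title>XML: The Complete Developer's Guide</title>
      <link>https://devtoolbox.dedyn.io/blog/xml-complete-guide</link>
      <pubDate>Wed, 11 Feb 2026 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

# Parse from bytes so the declared encoding is honoured by the parser.
channel = ET.fromstring(feed.encode("utf-8")).find("channel")
items = [
    {
        "title": item.findtext("title"),
        "link": item.findtext("link"),
        # pubDate uses the RFC 822 date format, which email.utils understands.
        "published": parsedate_to_datetime(item.findtext("pubDate")),
    }
    for item in channel.findall("item")
]

for entry in items:
    print(entry["published"].isoformat(), entry["title"])
```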
SVG Graphics
SVG (Scalable Vector Graphics) is an XML-based format for vector images, widely used for icons, logos, and interactive graphics on the web:
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">
<circle cx="50" cy="50" r="40" fill="#e94560" />
<text x="50" y="55" text-anchor="middle" fill="white"
font-size="14">XML</text>
</svg>
Android Development
Android uses XML extensively for layouts, resources, and the manifest:
<!-- activity_main.xml -->
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="vertical"
android:padding="16dp">
<TextView
android:id="@+id/titleText"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:text="@string/app_name"
android:textSize="24sp" />
<Button
android:id="@+id/submitButton"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:text="@string/submit" />
</LinearLayout>
Sitemaps
Search engines use XML sitemaps to discover and crawl web pages:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<lastmod>2026-02-11</lastmod>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
</urlset>
Microsoft Office Formats
DOCX, XLSX, and PPTX files are ZIP archives containing XML files. Unzip a .docx file and you will find word/document.xml, word/styles.xml, and other XML components that define the document structure and content.
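The same inspection can be done from code. The sketch below builds a tiny stand-in archive in memory (it is not a complete, valid DOCX, which needs further parts such as content types and relationships) and then reads word/document.xml from it the way one would from a real file. The file name report.docx in the comment is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Minimal stand-in for the main document part of a .docx file.
document_xml = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p>"
    "<w:r><w:t>Hello</w:t></w:r>"
    '<w:r><w:t xml:space="preserve"> XML</w:t></w:r>'
    "</w:p>"
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", document_xml)

# Reading works the same way for a file on disk,
# e.g. zipfile.ZipFile("report.docx").
with zipfile.ZipFile(buffer) as archive:
    part_names = archive.namelist()
    root = ET.fromstring(archive.read("word/document.xml"))

# Paragraphs are w:p elements; their visible text lives in w:t runs.
paragraphs = [
    "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
    for p in root.iter(f"{{{W_NS}}}p")
]

print(part_names)
print(paragraphs)
```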
9. Parsing XML in JavaScript, Python, Go, and Java
Every major programming language provides libraries for parsing and generating XML. Here is how to work with XML in four widely used languages for backend and full-stack development.
JavaScript (Browser and Node.js)
// Browser: DOMParser
const xmlString = `<bookstore>
<book category="fiction">
<title>The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<price>10.99</price>
</book>
<book category="non-fiction">
<title>Thinking, Fast and Slow</title>
<author>Daniel Kahneman</author>
<price>14.99</price>
</book>
</bookstore>`;
const parser = new DOMParser();
const xmlDoc = parser.parseFromString(xmlString, "text/xml");
// Check for parse errors
const parseError = xmlDoc.querySelector("parsererror");
if (parseError) {
console.error("Parse error:", parseError.textContent);
}
// Query elements
const titles = xmlDoc.querySelectorAll("title");
titles.forEach(title => {
console.log(title.textContent);
});
// XPath query
const result = xmlDoc.evaluate(
"//book[@category='fiction']/title",
xmlDoc, null,
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null
);
for (let i = 0; i < result.snapshotLength; i++) {
console.log(result.snapshotItem(i).textContent);
}
// Generate XML
const serializer = new XMLSerializer();
const xmlOutput = serializer.serializeToString(xmlDoc);
console.log(xmlOutput);
// Node.js: Using fast-xml-parser
// npm install fast-xml-parser
const { XMLParser, XMLBuilder } = require("fast-xml-parser");
const nodeParser = new XMLParser({
ignoreAttributes: false,
attributeNamePrefix: "@_"
});
const jsonObj = nodeParser.parse(xmlString);
console.log(JSON.stringify(jsonObj, null, 2));
// Convert back to XML
const builder = new XMLBuilder({
ignoreAttributes: false,
attributeNamePrefix: "@_",
format: true
});
const newXml = builder.build(jsonObj);
console.log(newXml);
Python
import xml.etree.ElementTree as ET
# Parse XML string
xml_string = """<bookstore>
<book category="fiction">
<title lang="en">The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<price>10.99</price>
</book>
<book category="non-fiction">
<title lang="en">Thinking, Fast and Slow</title>
<author>Daniel Kahneman</author>
<price>14.99</price>
</book>
</bookstore>"""
root = ET.fromstring(xml_string)
# Iterate over elements
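for book in root.findall("book"):
    title = book.find("title").text
    author = book.find("author").text
    price = float(book.find("price").text)
    category = book.get("category")
    print(f"{title} by {author} ({category}) - ${price}")
# XPath queries (ElementTree supports only a limited XPath subset)
fiction_books = root.findall(".//book[@category='fiction']")
# [price] matches books that HAVE a <price> child; numeric comparisons
# such as [price>10] are not supported, so filter in Python instead
books_with_price = root.findall(".//book[price]")
expensive = [b for b in books_with_price if float(b.findtext("price")) > 10]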
# Parse from file
tree = ET.parse("books.xml")
root = tree.getroot()
# Create XML programmatically
new_root = ET.Element("library")
book = ET.SubElement(new_root, "book", attrib={"isbn": "123"})
title = ET.SubElement(book, "title")
title.text = "New Book"
author = ET.SubElement(book, "author")
author.text = "Jane Doe"
tree = ET.ElementTree(new_root)
ET.indent(tree, space=" ")
tree.write("output.xml", encoding="utf-8", xml_declaration=True)
# Using lxml for advanced features (XPath 1.0, XSLT, validation)
from lxml import etree
doc = etree.fromstring(xml_string.encode())
# Full XPath support
titles = doc.xpath("//book[@category='fiction']/title/text()")
print(titles) # ['The Great Gatsby']
total = doc.xpath("sum(//price)")
print(f"Total: ${total:.2f}") # Total: $25.98
# Validate against XSD
schema_doc = etree.parse("bookstore.xsd")
schema = etree.XMLSchema(schema_doc)
is_valid = schema.validate(doc)
print(f"Valid: {is_valid}")
# Using xmltodict for JSON-like access
import xmltodict
data = xmltodict.parse(xml_string)
# title carries a lang attribute, so xmltodict stores its text under "#text"
print(data["bookstore"]["book"][0]["title"]["#text"]) # The Great Gatsby
# Convert back to XML
xml_output = xmltodict.unparse(data, pretty=True)
print(xml_output)
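For files too large to load at once, the standard library also offers incremental parsing through ET.iterparse. The sketch below uses an in-memory stand-in for a large file; with real data you would pass a file name or an open binary file instead.

```python
import io
import xml.etree.ElementTree as ET

# A small in-memory stand-in for a large file on disk.
source = io.BytesIO(b"""<bookstore>
  <book category="fiction"><title>The Great Gatsby</title></book>
  <book category="non-fiction"><title>Thinking, Fast and Slow</title></book>
</bookstore>""")

titles = []
# "end" events fire once an element and all of its children have been read.
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "book":
        titles.append(elem.findtext("title"))
        # Drop the element's children and text so memory use stays low.
        elem.clear()

print(titles)
```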
Go
package main
import (
"encoding/xml"
"fmt"
"os"
)
// Define structs with XML tags
type Bookstore struct {
XMLName xml.Name `xml:"bookstore"`
Books []Book `xml:"book"`
}
type Book struct {
XMLName xml.Name `xml:"book"`
Category string `xml:"category,attr"`
Title string `xml:"title"`
Author string `xml:"author"`
Price float64 `xml:"price"`
}
func main() {
xmlData := []byte(`<bookstore>
<book category="fiction">
<title>The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<price>10.99</price>
</book>
<book category="non-fiction">
<title>Thinking, Fast and Slow</title>
<author>Daniel Kahneman</author>
<price>14.99</price>
</book>
</bookstore>`)
// Unmarshal (parse) XML
var bookstore Bookstore
err := xml.Unmarshal(xmlData, &bookstore)
if err != nil {
fmt.Println("Error:", err)
return
}
for _, book := range bookstore.Books {
fmt.Printf("%s by %s (%s) - $%.2f\n",
book.Title, book.Author, book.Category, book.Price)
}
// Marshal (generate) XML
newStore := Bookstore{
Books: []Book{
{Category: "tech", Title: "Go Programming", Author: "John", Price: 39.99},
},
}
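output, err := xml.MarshalIndent(newStore, "", " ")
if err != nil {
fmt.Println("Error:", err)
return
}
fmt.Println(xml.Header + string(output))
// Stream parsing with xml.Decoder (for large files)
file, err := os.Open("large.xml")
if err != nil {
fmt.Println("Error:", err)
return
}
defer file.Close()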
decoder := xml.NewDecoder(file)
for {
token, err := decoder.Token()
if err != nil {
break
}
switch t := token.(type) {
case xml.StartElement:
if t.Name.Local == "book" {
var book Book
decoder.DecodeElement(&book, &t)
fmt.Println(book.Title)
}
}
}
}
Java
import javax.xml.parsers.*;
import org.w3c.dom.*;
import javax.xml.xpath.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.*;
public class XMLExample {
public static void main(String[] args) throws Exception {
String xmlString = "<bookstore>" +
"<book category=\"fiction\">" +
"<title>The Great Gatsby</title>" +
"<author>F. Scott Fitzgerald</author>" +
"<price>10.99</price></book></bookstore>";
// DOM Parser
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
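Document doc = builder.parse(
new ByteArrayInputStream(
// Encode explicitly instead of relying on the platform default charset
xmlString.getBytes(java.nio.charset.StandardCharsets.UTF_8)));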
// Get elements by tag name
NodeList books = doc.getElementsByTagName("book");
for (int i = 0; i < books.getLength(); i++) {
Element book = (Element) books.item(i);
String title = book.getElementsByTagName("title")
.item(0).getTextContent();
String category = book.getAttribute("category");
System.out.println(title + " (" + category + ")");
}
// XPath queries
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
String title = xpath.evaluate(
"//book[@category='fiction']/title", doc);
System.out.println("Fiction: " + title);
NodeList fictionBooks = (NodeList) xpath.evaluate(
"//book[@category='fiction']", doc,
XPathConstants.NODESET);
// Create XML document
Document newDoc = builder.newDocument();
Element root = newDoc.createElement("library");
newDoc.appendChild(root);
Element bookEl = newDoc.createElement("book");
bookEl.setAttribute("isbn", "978-0-123456-78-9");
root.appendChild(bookEl);
Element titleEl = newDoc.createElement("title");
titleEl.setTextContent("New Book");
bookEl.appendChild(titleEl);
// Write to string
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
StringWriter writer = new StringWriter();
transformer.transform(
new DOMSource(newDoc), new StreamResult(writer));
System.out.println(writer.toString());
}
}
// SAX Parser for large files (event-based, low memory)
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
public class SAXExample extends DefaultHandler {
private StringBuilder content = new StringBuilder();
private String currentElement;
@Override
public void startElement(String uri, String localName,
String qName, Attributes attrs) {
currentElement = qName;
if (qName.equals("book")) {
System.out.println("Category: " + attrs.getValue("category"));
}
}
@Override
public void characters(char[] ch, int start, int length) {
content.append(ch, start, length);
}
@Override
public void endElement(String uri, String localName, String qName) {
if (qName.equals("title")) {
System.out.println("Title: " + content.toString().trim());
}
content.setLength(0);
}
}
10. XML Security
XML processing introduces security risks that every developer must understand. The most critical vulnerability is the XML External Entity (XXE) attack, which had its own entry (A4) in the 2017 OWASP Top 10 list of web application security risks and was folded into the Security Misconfiguration category in the 2021 edition.
XXE (XML External Entity) Attacks
XXE exploits XML parsers that process external entity declarations in DTDs. An attacker can use this to read local files, perform server-side request forgery (SSRF), or cause denial of service:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Malicious XML that reads /etc/passwd (the XML declaration must come first) -->
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<userInfo>
<name>&xxe;</name>
</userInfo>
<!-- SSRF: Make the server request an internal URL -->
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<data>&xxe;</data>
<!-- Billion Laughs (denial of service via entity expansion) -->
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
]>
<data>&lol4;</data>
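The danger of nested entities is purely arithmetic: each level multiplies the size of the previous one. The short Python sketch below computes the expanded size without expanding anything; the function name expanded_length is made up for this example.

```python
def expanded_length(base: str, fanout: int, levels: int) -> int:
    """Length of the text produced by `levels` layers of nested entities.

    Level 1 is the base string itself; each further level repeats the
    previous entity `fanout` times.
    """
    return len(base) * fanout ** (levels - 1)


# The example above: lol, lol2, lol3, lol4 with ten references per level.
print(expanded_length("lol", 10, 4))   # 3000 characters

# Continuing the same pattern to ten levels gives the attack its name:
# a billion copies of "lol" from a document of under a kilobyte.
print(expanded_length("lol", 10, 10))  # 3000000000 characters
```

This is why parsers need an expansion limit: the input size says nothing about the memory required to process it.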
Preventing XXE in Every Language
# Python: Use defusedxml (ALWAYS for untrusted input)
# pip install defusedxml
import defusedxml.ElementTree as ET
# Safe parsing: entity declarations and external references raise an
# exception by default. Pass forbid_dtd=True to reject any DOCTYPE as well.
root = ET.fromstring(untrusted_xml)
# Or manually disable in lxml:
from lxml import etree
parser = etree.XMLParser(
resolve_entities=False,
no_network=True,
dtd_validation=False,
load_dtd=False
)
doc = etree.fromstring(untrusted_xml.encode(), parser)
// Java: Disable external entities and DTDs
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Reject any DOCTYPE declaration; this alone blocks XXE and entity expansion
factory.setFeature(
    "http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth: also refuse external general and parameter entities
factory.setFeature(
    "http://xml.org/sax/features/external-general-entities", false);
factory.setFeature(
    "http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature(
XMLConstants.FEATURE_SECURE_PROCESSING, true);
factory.setXIncludeAware(false);
factory.setExpandEntityReferences(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(inputStream);
// JavaScript: DOMParser is safe by default in browsers.
// In Node.js with fast-xml-parser, external entities are
// not processed by default. If using libxmljs:
const libxmljs = require("libxmljs");
const doc = libxmljs.parseXml(xmlString, {
noent: false, // Do NOT expand entities
nonet: true, // Do NOT fetch from network
dtdload: false // Do NOT load external DTDs
});
// Go: encoding/xml does NOT support external entities,
// so it is safe by default. If using a third-party
// library, verify that external entity processing
// is disabled.
XML Input Validation Best Practices
- Disable DTD processing: If you do not need DTDs, disable them entirely
- Disable external entities: Always disable processing of external entities in untrusted input
- Set entity expansion limits: Prevent "Billion Laughs" denial-of-service attacks
- Validate against a schema: Use XSD validation to reject structurally invalid input early (the parser itself already rejects malformed XML)
- Sanitize before parsing: If DTD features are not needed, reject (or strip) DOCTYPE declarations in untrusted XML before it reaches the parser
- Use safe libraries: Python's defusedxml, Java parsers configured per the OWASP XXE Prevention Cheat Sheet, and Go's encoding/xml are safe choices
- Limit input size: Reject XML documents that exceed expected size limits
- Log and monitor: Track XML parsing errors and suspicious patterns
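Several of these practices can be combined in one small guard function. The following is a minimal sketch using only the Python standard library, not a replacement for defusedxml; the names `parse_untrusted`, `UnsafeXMLError`, and the 1 MB limit are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

MAX_XML_BYTES = 1_000_000  # hypothetical limit; tune it for your payloads


class UnsafeXMLError(ValueError):
    """Raised when input is rejected before the tree is built."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Entities can only be declared inside a DTD, so refusing the DOCTYPE
    # rules out both XXE and entity-expansion payloads.
    raise UnsafeXMLError(f"DOCTYPE declaration not allowed: {name!r}")


def parse_untrusted(data: bytes) -> ET.Element:
    # Limit input size before doing any parsing work.
    if len(data) > MAX_XML_BYTES:
        raise UnsafeXMLError("document exceeds size limit")

    # Pass 1: scan with expat and abort as soon as a DOCTYPE appears.
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    try:
        scanner.Parse(data, True)
    except expat.ExpatError as exc:
        raise UnsafeXMLError(f"not well-formed: {exc}") from exc

    # Pass 2: build the tree now that the input is known to be DTD-free.
    return ET.fromstring(data)
```

Parsing twice keeps the sketch simple at the cost of some speed. In production code, a maintained library such as defusedxml is the safer choice.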
XML Signature and Encryption
Secure XML communication (common in SOAP and SAML) relies on two W3C standards, XML Signature (XMLDSig) and XML Encryption (XMLEnc), and on the protocols built on top of them:
- XML Signature: Digitally sign XML documents or specific elements to ensure integrity and authenticity
- XML Encryption: Encrypt XML elements or entire documents to protect confidentiality
- SAML: Security Assertion Markup Language uses XML signatures for single sign-on (SSO) authentication
- WS-Security: SOAP message-level security using signatures and encryption
11. XML Tools and Libraries
Online Tools
- DevToolbox XML to JSON Converter - Convert between XML and JSON formats instantly
- DevToolbox XML Formatter - Pretty-print and beautify XML with syntax highlighting
- DevToolbox JSON Formatter - Format and validate JSON output from conversions
- DevToolbox JSON Validator - Validate JSON structure and syntax
Command-Line Tools
# xmllint - Part of libxml2, available on most systems
xmllint --format messy.xml # Pretty-print XML
xmllint --noout --dtdvalid book.dtd books.xml # Validate against an external DTD
xmllint --schema book.xsd books.xml # Validate against XSD
xmllint --xpath "//book/title" books.xml # Run XPath query
xmllint --nonet --noout input.xml # Check well-formedness without network access (avoid --noent: it EXPANDS entities)
# xmlstarlet - Swiss Army knife for XML
xmlstarlet sel -t -v "//book/title" books.xml # Select values
xmlstarlet ed -u "//price" -v "19.99" books.xml # Edit values
xmlstarlet val -s book.xsd books.xml # Validate
xmlstarlet fo books.xml # Format
# xsltproc - Apply XSLT transformations
xsltproc transform.xsl input.xml > output.html
# Saxon - Advanced XSLT 2.0/3.0 and XQuery processor
java -jar saxon.jar -s:input.xml -xsl:transform.xsl -o:output.html
Libraries by Language
| Language | Standard Library | Popular Third-Party |
|---|---|---|
| JavaScript | DOMParser, XMLSerializer (browser APIs) | fast-xml-parser, xml2js, cheerio, xmldom |
| Python | xml.etree.ElementTree, xml.dom.minidom | lxml, defusedxml, xmltodict, BeautifulSoup |
| Go | encoding/xml | etree, xmlquery, goxml |
| Java | javax.xml (DOM, SAX, StAX) | JAXB (a separate dependency since Java 11), Jackson XML, JDOM2, XOM, dom4j |
| C#/.NET | System.Xml (XmlDocument, XmlReader), System.Xml.Linq (XDocument, LINQ to XML) | - |
| Rust | - | quick-xml, xml-rs, roxmltree |
| PHP | SimpleXML, DOMDocument | sabre/xml, FluidXML |
IDE Support
- VS Code: XML extension by Red Hat provides validation, formatting, auto-completion, and XSD/DTD support
- IntelliJ IDEA: Built-in XML support with schema validation and XPath evaluation
- Oxygen XML Editor: Dedicated XML IDE with XSLT debugging, schema design, and diff tools
- XMLSpy: Enterprise XML editor with graphical schema design and XSLT profiling
12. Common XML Patterns and Best Practices
1. Use Meaningful Element Names
Choose descriptive, consistent names that convey the data's meaning:
<!-- Bad: Cryptic abbreviations -->
<r><fn>Alice</fn><ln>Smith</ln><a>30</a></r>
<!-- Good: Clear, readable names -->
<person>
<firstName>Alice</firstName>
<lastName>Smith</lastName>
<age>30</age>
</person>
2. Follow a Naming Convention
Pick one convention and be consistent throughout your XML vocabulary:
- camelCase: <firstName> (common in Java-based systems)
- PascalCase: <FirstName> (common in .NET)
- kebab-case: <first-name> (common in web standards like SVG, HTML)
- snake_case: <first_name> (common in some data exchange formats)
3. Design for Extensibility
Structure your XML so that new elements can be added without breaking existing consumers:
<!-- Use wrapper elements for collections -->
<order>
<items>
<item>...</item>
<item>...</item>
</items>
<!-- New sections can be added without affecting items -->
<shipping>...</shipping>
</order>
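On the consumer side, extensibility mostly means reading the elements you need by path and ignoring everything else. A small sketch with Python's standard library, where the sample documents and the `read_items` helper are made up for illustration:

```python
import xml.etree.ElementTree as ET

# The same order before and after a <shipping> section was added.
ORDER_V1 = "<order><items><item>pen</item><item>ink</item></items></order>"
ORDER_V2 = """
<order>
  <items><item>pen</item><item>ink</item></items>
  <shipping><method>express</method></shipping>
</order>
"""


def read_items(xml_text: str) -> list[str]:
    root = ET.fromstring(xml_text)
    # Address the elements we need by path; unknown siblings are ignored.
    return [item.text or "" for item in root.findall("./items/item")]


print(read_items(ORDER_V1))  # ['pen', 'ink']
print(read_items(ORDER_V2))  # ['pen', 'ink']
```

A consumer written this way keeps working when the producer adds new sections, whereas one that iterates over every child of the root and assumes each is an item would break.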
4. Use Schemas for Validation
Always define an XSD schema for your XML formats, especially in multi-system integrations. Validate all incoming XML against the schema before processing:
# Python: Validate on every parse
from lxml import etree

schema_doc = etree.parse("order.xsd")
schema = etree.XMLSchema(schema_doc)

def parse_order(xml_string):
    doc = etree.fromstring(xml_string.encode())
    if not schema.validate(doc):
        errors = schema.error_log.filter_from_errors()
        raise ValueError(f"Invalid XML: {errors}")
    return doc
5. Handle Encoding Correctly
Always specify UTF-8 encoding in the XML declaration and in your code:
<?xml version="1.0" encoding="UTF-8"?>
When writing XML in code, ensure your output stream uses UTF-8 encoding. Mismatched encoding declarations are a common source of parsing errors.
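As one way to keep the declaration and the actual bytes in agreement, you can let the serializer produce both. A minimal sketch with Python's standard library (the helper name `to_utf8_document` and the sample text are illustrative):

```python
import io
import xml.etree.ElementTree as ET


def to_utf8_document(root: ET.Element) -> bytes:
    """Serialize to UTF-8 bytes, including a matching XML declaration."""
    buffer = io.BytesIO()
    ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
    return buffer.getvalue()


author = ET.Element("author")
author.text = "Zoë Müller"
data = to_utf8_document(author)

# The result is bytes, not str: write it to a file opened in binary mode
# so no second, possibly different, encoding step is applied.
assert ET.fromstring(data).text == "Zoë Müller"
```

Treating the serialized document as bytes from this point on avoids the common mistake of declaring UTF-8 while the output stream silently re-encodes the text in a platform default.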
6. Prefer Elements Over Attributes for Data
Use attributes for metadata (IDs, types, formats) and elements for actual data content. This makes the XML more extensible and easier to query:
<!-- Good: Data in elements, metadata in attributes -->
<product id="P001" status="active">
<name>Widget Pro</name>
<description>A professional-grade widget</description>
<price currency="USD">29.99</price>
</product>
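The split also maps cleanly onto parser APIs, which expose attributes and child elements through different calls. A short sketch that reads the product above with Python's `xml.etree.ElementTree` (the `record` dictionary layout is our own choice):

```python
import xml.etree.ElementTree as ET

PRODUCT_XML = """
<product id="P001" status="active">
  <name>Widget Pro</name>
  <description>A professional-grade widget</description>
  <price currency="USD">29.99</price>
</product>
"""

product = ET.fromstring(PRODUCT_XML)
price = product.find("price")

record = {
    # Metadata comes from attributes via get()
    "id": product.get("id"),
    "status": product.get("status"),
    "currency": price.get("currency"),
    # Data comes from element text via findtext() / .text
    "name": product.findtext("name"),
    "price": price.text,
}
print(record)
```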
7. Keep Documents Focused
Each XML document should represent one logical entity or collection. Avoid mixing unrelated data in a single document. Use separate documents or separate XML namespaces for distinct concerns.
8. Version Your XML Formats
Include a version attribute on the root element to support format evolution:
<api-response version="2.1">
<!-- Consumers can handle different versions -->
</api-response>
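A consumer can then check the version before touching the rest of the document. A minimal sketch in Python, assuming a "major.minor" version string and a hypothetical set of supported major versions:

```python
import xml.etree.ElementTree as ET

SUPPORTED_MAJOR_VERSIONS = {1, 2}  # hypothetical; depends on your consumer


def major_version(xml_text: str) -> int:
    """Return the major format version, or raise if it is unsupported."""
    root = ET.fromstring(xml_text)
    version = root.get("version")
    if version is None:
        raise ValueError("root element has no version attribute")
    major = int(version.split(".")[0])
    if major not in SUPPORTED_MAJOR_VERSIONS:
        raise ValueError(f"unsupported format version {version}")
    return major


print(major_version('<api-response version="2.1"/>'))  # 2
```

Failing fast on an unknown major version is usually safer than guessing at a layout the code was never written for.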
9. Use Indentation and Formatting
While whitespace is often insignificant in XML, proper indentation makes documents readable for debugging and maintenance. Use our XML Formatter to automatically indent your XML.
10. Document Your XML Schema
XSD supports annotations for documenting elements and types:
<xs:element name="price">
<xs:annotation>
<xs:documentation>
The retail price in the currency specified by the
currency attribute. Must be a positive decimal with
at most 2 fractional digits.
</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="xs:decimal">
<xs:minExclusive value="0" />
<xs:fractionDigits value="2" />
</xs:restriction>
</xs:simpleType>
</xs:element>
13. XML in APIs: SOAP vs REST
XML plays different roles in the two dominant API paradigms. Understanding these roles helps you make informed architectural decisions.
SOAP APIs
SOAP (Simple Object Access Protocol) is an XML-based messaging protocol that provides a standardized way for applications to communicate. SOAP APIs are defined by:
- WSDL (Web Services Description Language): An XML document describing the service's operations, parameters, and data types
- SOAP Envelope: The XML message format with Header and Body sections
- WS-* Standards: Extensions for security (WS-Security), reliability (WS-ReliableMessaging), transactions (WS-AtomicTransaction), and more
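<!-- WSDL Service Definition -->
<definitions name="Calculator"
    targetNamespace="http://example.com/calculator"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://example.com/calculator">
  <message name="AddRequest">
    <part name="a" type="xsd:integer" />
    <part name="b" type="xsd:integer" />
  </message>
  <message name="AddResponse">
    <part name="result" type="xsd:integer" />
  </message>
  <portType name="CalculatorPortType">
    <operation name="Add">
      <input message="tns:AddRequest" />
      <output message="tns:AddResponse" />
    </operation>
  </portType>
  <!-- rpc style, because the message parts reference types rather than elements -->
  <binding name="CalculatorBinding" type="tns:CalculatorPortType">
    <soap:binding style="rpc"
        transport="http://schemas.xmlsoap.org/soap/http" />
    <operation name="Add">
      <soap:operation soapAction="http://example.com/Add" />
      <input><soap:body use="literal" namespace="http://example.com/calculator" /></input>
      <output><soap:body use="literal" namespace="http://example.com/calculator" /></output>
    </operation>
  </binding>
  <!-- The service element with the endpoint address is omitted for brevity -->
</definitions>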
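To make the Envelope and Body structure concrete, here is a rough sketch that builds and reads a SOAP 1.1 request for the Add operation with Python's standard library. The function names are ours, and wrapping the two parts in an element named after the operation is an assumption of this sketch; real projects normally generate this code from the WSDL with a SOAP toolkit:

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace
CALC_NS = "http://example.com/calculator"               # tns in the WSDL above


def build_add_request(a: int, b: int) -> bytes:
    # Prefixes are cosmetic; only the namespace URIs matter to the receiver.
    ET.register_namespace("soap", SOAP_ENV)
    ET.register_namespace("calc", CALC_NS)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    add = ET.SubElement(body, f"{{{CALC_NS}}}Add")
    ET.SubElement(add, "a").text = str(a)
    ET.SubElement(add, "b").text = str(b)
    return ET.tostring(envelope, encoding="utf-8")


def read_add_request(message: bytes) -> tuple[int, int]:
    ns = {"soap": SOAP_ENV, "calc": CALC_NS}
    add = ET.fromstring(message).find("soap:Body/calc:Add", ns)
    if add is None:
        raise ValueError("message does not contain an Add request")
    return int(add.findtext("a")), int(add.findtext("b"))


message = build_add_request(2, 3)
print(read_add_request(message))  # (2, 3)
```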
SOAP Strengths
- Formal contracts: WSDL provides machine-readable API definitions
- Built-in error handling: SOAP Fault elements provide standardized error reporting
- Transport independence: Can work over HTTP, SMTP, JMS, and other protocols
- Enterprise features: WS-Security, WS-ReliableMessaging, WS-AtomicTransaction
- Strong typing: XSD schemas enforce data types on both ends
REST APIs with XML
REST APIs typically use JSON, but many support XML as an alternative response format. Clients request XML using the Accept: application/xml header:
GET /api/users/42 HTTP/1.1
Host: api.example.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml

<user>
  <id>42</id>
  <name>Alice Johnson</name>
  <email>alice@example.com</email>
  <role>admin</role>
  <createdAt>2026-01-15T10:30:00Z</createdAt>
</user>
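On the client side, a response body like this is usually mapped onto a typed object straight away. A minimal sketch in Python, where the `User` dataclass and `parse_user` helper are hypothetical names chosen for this example:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:
    id: int
    name: str
    email: str
    role: str
    created_at: datetime


def parse_user(body: str) -> User:
    root = ET.fromstring(body)
    return User(
        id=int(root.findtext("id")),
        name=root.findtext("name"),
        email=root.findtext("email"),
        role=root.findtext("role"),
        # fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
        # so normalize it to an explicit UTC offset first.
        created_at=datetime.fromisoformat(
            root.findtext("createdAt").replace("Z", "+00:00")
        ),
    )


RESPONSE_BODY = """
<user>
  <id>42</id>
  <name>Alice Johnson</name>
  <email>alice@example.com</email>
  <role>admin</role>
  <createdAt>2026-01-15T10:30:00Z</createdAt>
</user>
"""
user = parse_user(RESPONSE_BODY)
print(user.id, user.name, user.created_at.isoformat())
```

Unlike JSON, every value arrives as text, so the conversions to `int` and `datetime` have to be explicit.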
SOAP vs REST Comparison
| Aspect | SOAP | REST |
|---|---|---|
| Format | XML only | JSON, XML, or others |
| Protocol | Protocol-agnostic (HTTP, SMTP, JMS) | HTTP in practice |
| Contract | WSDL (strict, machine-readable) | OpenAPI/Swagger (optional) |
| Error Handling | SOAP Fault (standardized) | HTTP status codes + body |
| Security | WS-Security (message-level) | HTTPS (transport-level) + OAuth/JWT |
| State | Can be stateful (session data in SOAP headers) | Stateless by design |
| Complexity | High (tooling required) | Low (curl-friendly) |
| Performance | Higher overhead (XML parsing, envelope) | Lower overhead (JSON is usually smaller) |
| Use Cases | Banking, healthcare, government, ERP | Web apps, mobile, microservices |
When to Use SOAP in 2026
- Enterprise integrations: When interoperating with legacy systems that expose SOAP endpoints
- Regulated industries: Banking (ISO 20022), healthcare (HL7), and government systems often standardize on XML messaging, and many of their existing interfaces are exposed over SOAP
- Complex transactions: When you need distributed transactions, reliable messaging, or message-level security
- Strong contracts: When both parties need strict, machine-verifiable API contracts
When to Use REST with JSON
- Web and mobile apps: Lighter weight, easier to consume in JavaScript
- Microservices: Simpler inter-service communication
- Public APIs: Lower barrier to entry for third-party developers
- New projects: Unless you have a specific need for SOAP features
Working with XML and JSON APIs?
Use our XML to JSON Converter to translate between API response formats. Need to validate your JSON output? Try the JSON Validator.
Conclusion
XML is far from obsolete. While JSON has become the dominant format for web APIs and lightweight data exchange, XML continues to power critical infrastructure across enterprise systems, document standards, build tools, mobile development, and the web itself. Its strengths in schema validation, namespace management, mixed content, and transformation make it the right choice for many use cases that simpler formats cannot address.
As a developer in 2026, you will encounter XML whether you seek it out or not. The key is knowing when to use it (document markup, strict validation, enterprise integration) and when to prefer alternatives (REST APIs, simple configuration, data interchange). Understanding XML's full ecosystem, from XPath queries to XSLT transformations to XXE prevention, makes you more effective when you do need to work with it.
Whether you are parsing SOAP responses, generating SVG graphics, configuring a Maven build, or converting XML data to JSON, the tools and patterns covered in this guide will serve you well.
Start Working with XML
Put your XML knowledge into practice. Convert XML to JSON with our XML to JSON Converter, format messy XML with our XML Formatter, or validate your JSON output with the JSON Validator - all free, all in your browser.
14. Frequently Asked Questions
Is XML still used in 2026?
Yes, XML remains widely used in 2026. It powers SOAP web services, RSS/Atom feeds, SVG graphics, XHTML documents, Android layouts, Maven build files (pom.xml), .NET configuration, Microsoft Office formats (OOXML), and many enterprise integration systems. While JSON dominates REST APIs, XML's schema validation, namespace support, and mixed-content capabilities make it hard to replace in many domains.
What is the difference between XML and JSON?
XML uses opening and closing tags to structure data and supports attributes, namespaces, comments, and mixed content. JSON uses key-value pairs with curly braces and square brackets, supporting native data types like numbers, booleans, and null. XML is more verbose but offers schema validation (XSD), transformation (XSLT), and querying (XPath). JSON is lighter, easier to parse in JavaScript, and dominant in REST APIs. Choose XML when you need document markup, strict validation, or enterprise integration; choose JSON for web APIs and lightweight data exchange.
What is an XXE attack and how do I prevent it?
An XML External Entity (XXE) attack exploits XML parsers that process external entity declarations. Attackers can read local files, perform server-side request forgery (SSRF), or cause denial of service. Prevent XXE by disabling external entity processing and DTD loading in your XML parser. In Python, use defusedxml. In Java, set XMLConstants.FEATURE_SECURE_PROCESSING and disable DOCTYPE declarations. In JavaScript, most modern parsers like DOMParser are safe by default. Always validate and sanitize XML input from untrusted sources.
What is XPath and when should I use it?
XPath (XML Path Language) is a query language for selecting nodes from XML documents. It uses path expressions similar to file system paths: /bookstore/book/title selects every title element that is a child of a book element directly under the root bookstore element. Use XPath when you need to extract specific data from XML documents, navigate complex XML trees, or write XSLT transformations. XPath supports predicates for filtering, functions for string manipulation, and axes for traversing node relationships.
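As a quick illustration, Python's standard `xml.etree.ElementTree` supports a limited subset of XPath, which is enough for path steps and simple predicates (full XPath 1.0 needs a library such as lxml). The bookstore data below is made up for this sketch:

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """
<bookstore>
  <book category="web"><title>Learning XML</title><price>39.95</price></book>
  <book category="cooking"><title>Everyday Italian</title><price>30.00</price></book>
</bookstore>
"""
root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# Equivalent of /bookstore/book/title, written relative to the root element.
titles = [t.text for t in root.findall("./book/title")]

# Predicate on an attribute: //book[@category='web']/title
web_titles = [t.text for t in root.findall(".//book[@category='web']/title")]

# Predicate on child text: books whose <price> text is exactly "30.00"
thirty = [b.findtext("title") for b in root.findall("./book[price='30.00']")]

print(titles)      # ['Learning XML', 'Everyday Italian']
print(web_titles)  # ['Learning XML']
print(thirty)      # ['Everyday Italian']
```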
Should I use DTD or XML Schema (XSD) for validation?
XML Schema (XSD) is preferred over DTD for most use cases. XSD supports data types (integers, dates, decimals), complex type definitions, namespaces, and is itself written in XML. DTD uses a non-XML syntax, lacks data type support, and has limited expressiveness. However, DTD is simpler for basic validation needs and is still used in legacy systems. For new projects, always choose XSD. For even more expressive validation, consider RELAX NG or Schematron.
How do I convert XML to JSON?
You can convert XML to JSON using libraries like xmltodict in Python, xml2js or fast-xml-parser in Node.js, or online tools like the DevToolbox XML to JSON Converter. The conversion maps XML elements to JSON objects and attributes to special keys (often prefixed with @). Be aware that XML features like attributes, namespaces, mixed content, and repeated elements with the same name require conventions to represent in JSON, since JSON has no direct equivalent for these constructs.
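To show what such conventions look like in practice, here is a minimal converter sketch using only the Python standard library. It is one possible mapping, not how any particular tool works: attributes get an "@" prefix, text next to attributes or children goes under a "#text" key (a convention also seen in some converters), repeated sibling names become lists, and namespaces and text that follows a child element are ignored:

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    # Attributes become keys prefixed with "@".
    node = {f"@{key}": value for key, value in element.attrib.items()}

    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated element name: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value

    text = (element.text or "").strip()
    if not node:
        return text           # leaf element: plain string
    if text:
        node["#text"] = text  # text alongside attributes or children
    return node


def xml_to_json(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


print(xml_to_json(
    '<catalog><book id="1"><title>A</title></book>'
    '<book id="2"><title>B</title></book></catalog>'
))
```

Note the ambiguity this leaves: a catalog with a single book yields an object rather than a one-element list, so consumers either need to handle both shapes or the converter needs a list of element names that are always arrays.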