Phase 1: FOUNDATION — What is XML, What is a Parser, and What Happens Before XXE Exists


To understand XML External Entity (XXE), you must first understand what XML is, what a parser does, and why external entities were introduced. Everything about XXE is a consequence of how XML is structured and how it’s interpreted. We’re not touching any vulnerabilities yet. First, we study the foundation from first principles.


🧱 I. What is XML?


XML (eXtensible Markup Language) is a way to store and transport data in a structured, human-readable format.

It is not a programming language. It is a markup format — like HTML — but instead of defining how things look, it defines what things mean.

Example:

<user>
    <name>Isaac</name>
    <role>admin</role>
</user>

This describes a user with two fields, name and role, each stored as a child element of user. There is no logic here, just structure.
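
To make "just structure" concrete, here is a minimal sketch that reads the same document with Python's standard-library xml.etree.ElementTree module. The variable names (document, root, fields) are illustrative choices, not part of any standard. Nothing in the XML runs: the program only walks the element tree produced from the text.

```python
import xml.etree.ElementTree as ET

# The same document as above, held in a plain string.
document = """
<user>
    <name>Isaac</name>
    <role>admin</role>
</user>
""".strip()

# fromstring() turns the text into a tree of Element objects.
root = ET.fromstring(document)

# Iterating over an element yields its direct children.
# Each child has a tag (its name) and text (its content).
fields = {child.tag: child.text for child in root}

print(root.tag)   # name of the outermost element
print(fields)     # child element names mapped to their text
```

The XML only carries names and values. Any meaning attached to "admin" comes from the program that consumes the data, not from the document.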


🧠 II. Why Does XML Exist?


First drafted in 1996 and published as a W3C standard in 1998, XML was built to do the job JSON often does now: exchange data between different systems (e.g., banks, servers, APIs, devices).

It was adopted everywhere: SOAP APIs, SAML, SVG images, RSS feeds, the internals of Office formats such as DOCX and XLSX, older mobile configuration files, and more.