Full Report
multipart/form-data is used for forms that include binary data, and the body can be broken into multiple parts. The parts are separated by a boundary string (declared in the request's headers), and each part carries its own headers. The Content-Disposition sub-header defines the parameter name and the filename for the part. Content-Type specifies the media type of the part's content, just like the normal header. While creating a WAF in Lua, they noticed it was super easy to bypass the parsers for this format. In the end, no parser follows the RFC as well as it should. File validation is important for file extension checks, MIME type validation, size limitations and several other things. The rest of the article covers bypasses for tricking WAFs and servers.

In practice, application/x-www-form-urlencoded can be used as the content of a multipart/form-data part. Many WAFs do not support this within multipart/form-data and will effectively ignore it. Since the WAF can't handle it but the server can, URL-encoded data is decoded on the backend but not by the WAF itself, creating a gap between check and use. This was true of HAProxy, AWS WAF and AWS Lambda.

The next bypass is the handling of duplicate parameters. Both name and filename can be duplicated; when that happens, one parser may take the first value while another takes the second. The same is true of duplicated full headers. CRLF parsing seems inadequate in many cases too. Some parsers only delimit on \r\n\r\n while others will accept just \r. Using single quotes instead of double quotes around parameter values such as the name causes a similar effect. In PHP, a request with the closing boundary string missing parses fine, while other parsers do not accept it.

The final bypass, and the author's favorite, has to do with an RFC update. With the update, filename* parameters allow special characters and the ability to specify an encoding. For instance, filename*=utf-8''filename.pdf is a valid parameter. In practice, this allows URL-encoding the filename, which most WAFs are not going to decode. They give an example of PHP file validation.

The second half of the article covers bypasses that they found in various engines. When running this on OpenResty with Lua in front, literally all of the bypasses above worked. The Node.js library Busboy is super permissive. Its priority between the filename parameter with and without the encoding differed from that of most servers, creating an easy bypass. In Flask, the trick for Busboy also works. Additionally, if the semicolon separator is left out of a header (leaving only a space as the delimiter), Flask won't parse it either. Finally, duplicate disposition headers will use the first instead of the expected second. This was also true on FortiWeb WAF.

The issue isn't a single implementation; the issue is that all of the different implementations do different things. By combining these parsers, similar to HTTP smuggling, we can bypass security protections. There are going to be more and more bug classes like this in the future for formats that have tens of parsers and are complicated. Good write up!
Analysis Summary
# Tool/Technique: Multipart Parser Differential Bypasses
## Overview
This technique exploits the lack of RFC compliance and logic inconsistencies across various `multipart/form-data` parsers. By crafting malformed or ambiguous HTTP requests, an attacker can cause a Web Application Firewall (WAF) to interpret the data differently than the backend server. This "impedance mismatch" allows malicious files (e.g., webshells) or exploit payloads to bypass security inspections.
## Technical Details
- **Type:** Technique (Protocol Manipulation / Filter Evasion)
- **Platform:** Web Applications (PHP, Node.js/Busboy, Python/Flask/Werkzeug, OpenResty/Lua)
- **Capabilities:** WAF evasion, file validation bypass, smuggling of malicious parameters.
- **First Seen:** Documented extensively in research published Nov 2024.
## MITRE ATT&CK Mapping
- **[TA0001 - Initial Access]**
  - **[T1190 - Exploit Public-Facing Application]**
- **[TA0005 - Defense Evasion]**
  - **[T1562 - Impair Defenses]** (Specifically bypassing WAF/IPS)
  - **[T1027 - Obfuscated Files or Information]**
## Functionality
### Core Capabilities
- **Content-Type Confusion:** Using `application/x-www-form-urlencoded` data inside a `multipart/form-data` body. Some WAFs do not decode URL-encoded content found inside multipart parts, while the backend does, so the payload is checked in one form and used in another. The source reports this behavior for HAProxy, AWS WAF, and AWS Lambda.
- **Parameter Duplication:** Providing multiple `name` or `filename` parameters within a single `Content-Disposition` header. One parser may take the first instance, while another takes the last.
- **Boundary Malformation:** Intentionally omitting closing boundary strings. PHP may parse these successfully while many WAFs drop the connection or ignore the "incomplete" part.
- **CRLF/LF Inconsistency:** Using non-standard line endings (just `\r` or mixed `\r\n`) to delimit headers, which causes some parsers to fail to identify the start of a file part.
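
The parameter duplication case can be illustrated with a minimal sketch. It assumes two hypothetical parsers, one keeping the first occurrence of a parameter and one keeping the last; the header value, the regex, and the function names are illustrative and are not taken from any specific WAF or framework.

```python
import re

# Hypothetical Content-Disposition value carrying two filename parameters.
HEADER = 'form-data; name="upload"; filename="report.pdf"; filename="shell.php"'

# Simplified parameter matcher (assumption: double-quoted values only).
PARAM_RE = re.compile(r';\s*([A-Za-z*]+)="([^"]*)"')


def parse_first_wins(header: str) -> dict:
    """Keep the first value seen for each parameter name."""
    params = {}
    for key, value in PARAM_RE.findall(header):
        params.setdefault(key.lower(), value)
    return params


def parse_last_wins(header: str) -> dict:
    """Keep the last value seen for each parameter name."""
    params = {}
    for key, value in PARAM_RE.findall(header):
        params[key.lower()] = value
    return params


# Stand-ins for the inspecting layer and the consuming layer.
waf_view = parse_first_wins(HEADER)
backend_view = parse_last_wins(HEADER)
print(waf_view["filename"], backend_view["filename"])
```

If the inspecting layer behaves like `parse_first_wins` and the backend like `parse_last_wins`, the extension check runs against one filename and the file is stored under another.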
### Advanced Features
- **RFC 2231/5987 Encoding Exploitation:** Using the `filename*` parameter (e.g., `filename*=utf-8''file.php`). Attackers can URL-encode the filename, which most WAFs fail to decode during inspection, bypassing extension blocklists.
- **Delimiter Omission:** In Flask/Python environments, a header whose semicolon separator is removed (leaving only a space) is not parsed. Any parser in the chain that does accept the space-delimited form will therefore see a parameter that Flask does not, creating another differential.
- **Quoting Variations:** Using single quotes or no quotes for parameter values where double quotes are standard, causing regex-based WAFs to fail to capture the value.
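
The `filename*` case can be sketched as follows. This is an illustration under assumptions: `naive_extension_check` stands in for a filter that searches the raw header for a blocked extension, and `decode_ext_filename` is a simplified decoder for the `charset'language'value` form; neither reproduces a real product's logic.

```python
import re
from urllib.parse import unquote

# Hypothetical header using the extended parameter with a percent-encoded dot.
HEADER = "form-data; name=\"upload\"; filename*=utf-8''shell%2Ephp"

# Simplified matcher for filename*=charset'language'value.
EXT_RE = re.compile(r"filename\*=([\w-]+)'([\w-]*)'([^;\s]+)", re.IGNORECASE)


def naive_extension_check(header: str) -> bool:
    """Return True if the raw header contains the literal blocked extension."""
    return ".php" in header.lower()


def decode_ext_filename(header: str):
    """Return the percent-decoded filename* value, or None if absent."""
    match = EXT_RE.search(header)
    if match is None:
        return None
    charset, _language, value = match.groups()
    return unquote(value, encoding=charset)


print(naive_extension_check(HEADER))  # raw-string check
print(decode_ext_filename(HEADER))    # what a decoding backend would see
```

A filter that inspects only the raw header never sees the literal extension, while a backend that decodes the value does.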
## Indicators of Compromise
- **Network Indicators:**
  - HTTP requests with `Content-Type: multipart/form-data` containing headers with `filename*=` (RFC 5987).
  - Unexpected `application/x-www-form-urlencoded` strings inside multipart boundaries.
- **Behavioral Indicators:**
  - Presence of multiple `Content-Disposition` headers in a single MIME part.
  - Incomplete multipart bodies (missing the final `--boundary--` string).
  - Headers using single quotes (`'`) for `name` or `filename` parameters.
## Associated Threat Actors
- This is a general exploitation class used by a wide range of actors, from script kiddies to APTs, for initial access via file upload vulnerabilities.
## Detection Methods
- **Protocol Validation:** Use WAF rules (like ModSecurity Rule 200003/200004) that alert on "Multipart parser detected a possible unmatched boundary" or "Strict Multipart Error."
- **Comparison Logic:** Monitor for requests where the number of parameters detected by the WAF differs from the number recorded in the application logs.
- **Signature Detection:** Look for URL-encoded characters within the `filename*` parameter of a multipart request.
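
Several of the indicators listed in this document can be checked with a small scanner. The following is a rough sketch, not a production detector: `find_indicators`, the sample bodies, and the boundary `XyZ` are hypothetical, and the patterns are deliberately simple heuristics.

```python
import re

# Blank line that ends a part's headers (standard CRLF plus bare LF/CR forms).
HEADER_END = re.compile(rb"\r\n\r\n|\n\n|\r\r")
ENCODED_EXT = re.compile(rb"filename\*=[^;\r\n]*%[0-9a-f]{2}")
SINGLE_QUOTED = re.compile(rb"\b(?:name|filename)='")


def find_indicators(body: bytes, boundary: str) -> list:
    """Return a list of suspicious traits found in a multipart body."""
    findings = []
    delim = b"--" + boundary.encode("ascii")
    if delim + b"--" not in body:
        findings.append("missing closing boundary")
    for part in body.split(delim)[1:]:
        if part.startswith(b"--"):
            break  # reached the closing delimiter
        # Lowercase the header block so the checks are case-insensitive.
        head = HEADER_END.split(part, maxsplit=1)[0].lower()
        if head.count(b"content-disposition:") > 1:
            findings.append("duplicate Content-Disposition header")
        if b"filename*=" in head:
            findings.append("filename* parameter present")
        if ENCODED_EXT.search(head):
            findings.append("percent-encoded filename* value")
        if SINGLE_QUOTED.search(head):
            findings.append("single-quoted name or filename")
    return findings


# Hypothetical request bodies for illustration.
CLEAN = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="a.txt"\r\n'
    b"\r\n"
    b"data\r\n"
    b"--XyZ--\r\n"
)
SUSPICIOUS = (
    b"--XyZ\r\n"
    b"Content-Disposition: form-data; name='upload'; filename*=utf-8''shell%2Ephp\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="a.txt"\r\n'
    b"\r\n"
    b"data\r\n"
)

print(find_indicators(CLEAN, "XyZ"))
print(find_indicators(SUSPICIOUS, "XyZ"))
```

Because `filename*` is valid per the RFC update described in the report, a hit on that check alone is a signal to decode and re-inspect rather than proof of an attack.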
## Mitigation Strategies
- **Strict RFC Enforcement:** Configure WAFs and load balancers to reject any multipart request that does not strictly adhere to RFC 7578.
- **Normalization:** Ensure the security layer and the application layer use the same parsing engine or library.
- **Deep Inspection:** Implement "Magic Byte" (file signature) validation on the backend rather than relying on the `Content-Type` header or file extension.
- **Input Sanitization:** Rename all uploaded files to a randomly generated UUID on the server and strip any user-supplied extensions until they are validated.
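
The last two mitigations can be combined in a short sketch. It assumes a small illustrative allowlist of file signatures (`MAGIC`) and a hypothetical helper `safe_stored_name`; a real deployment would choose its own allowlist and storage layout. Signature checks can be defeated by polyglot files, so this is one layer rather than a complete defense.

```python
import uuid

# Illustrative allowlist: leading file signature -> extension assigned by the server.
MAGIC = {
    b"%PDF-": ".pdf",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
}


def safe_stored_name(content: bytes):
    """Return a random server-side filename, or None if the type is not allowed.

    The user-supplied filename and Content-Type are ignored entirely; the
    extension is derived only from the leading bytes of the upload.
    """
    for signature, extension in MAGIC.items():
        if content.startswith(signature):
            return uuid.uuid4().hex + extension
    return None


print(safe_stored_name(b"%PDF-1.7 example"))
print(safe_stored_name(b"<?php echo 1; ?>"))
```

Since the stored name never derives from `filename` or `filename*`, a disagreement between parsers about those parameters no longer decides which extension ends up on disk.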
## Related Tools/Techniques
- **HTTP Request Smuggling:** Similar concept applied to the `Content-Length` and `Transfer-Encoding` headers.
- **HPP (HTTP Parameter Pollution):** Exploiting how different engines handle duplicate parameters in GET/POST requests.
- **Busboy / Werkzeug Parsers:** Specific libraries mentioned as being highly permissive and vulnerable to these differentials.