Full Report
Last year, a web cache deception vulnerability was discovered in ChatGPT. Anything ending in a particular file extension was cached, and path resolution was fuzzy. Combining the two made it possible to get a credential response cached for a particular user just by having them click a link. This was fixed, but how well?

The author was testing a new share feature that lets users share their chats with others. They noticed that a shared chat wouldn't update when the underlying chat changed. A caching issue, maybe? They saw a Cf-Cache-Status: HIT header in the response, which suggested that something was off. After playing around with the requests, they realized that anything under /share/ was being cached by Cloudflare. The CDN sits in front of the web server, so it processes each request and applies its caching rules before the origin ever sees it.

From messing around with it, they noticed that a URL-encoded path was NOT being decoded by Cloudflare but WAS being decoded by the server. This desync in processing allowed for some weirdness to happen. What if Cloudflare could be made to think that a response belonged under /share/, and so cache it, when it really came from a different path? This is what the author came up with:

https://chat.openai.com/share/%2F..%2Fapi/auth/session?cachebuster=123

The %2F..%2F after /share/ is treated as a path traversal by the backend server but NOT by the CDN. So the CDN thinks the request is under /share/ while the backend thinks it is for a different path. Pointing the traversal at /api/auth/session recreates the same web cache deception vulnerability as before. The author has a really good image describing the flow of the attack: the session information is stolen by forcing something to be cached that shouldn't be.

Overall, I absolutely love this bug! It's interesting to see how things are cached. I wonder if there's a tool to figure out where caches live that could be added to Burp Suite?
Analysis Summary
# Vulnerability: Wildcard Web Cache Deception via Path Traversal URL Parser Confusion
## CVE Details
- **CVE ID**: Not explicitly assigned in the report (OpenAI typically coordinates via Bugcrowd).
- **CVSS Score**: Estimated 8.8 (High); no official score was published.
- **CWE**: CWE-444: Inconsistent Interpretation of HTTP Requests ("HTTP Request/Response Smuggling") / CWE-524: Use of Cache Containing Sensitive Information.
## Affected Systems
- **Products**: OpenAI ChatGPT.
- **Versions**: Production environment as of February 2024 (now patched).
- **Configurations**: Systems utilizing a CDN (Cloudflare) with aggressive caching rules (e.g., prefix-based caching `/share/*`) and a backend server that handles URL-encoded path normalization differently than the CDN.
## Vulnerability Description
The flaw stems from a **path normalization desynchronization** between Cloudflare (the CDN) and the ChatGPT origin server.
1. **CDN Behavior**: Cloudflare was configured to cache all responses under the hxxps[://]chat[.]openai[.]com/share/* path. Crucially, Cloudflare did not decode or normalize URL-encoded characters like `%2F` (forward slash) in the path when checking cache rules.
2. **Origin Server Behavior**: The backend web server *did* decode and normalize `%2F..%2F` as a path traversal (`/../`).
3. **The Core Issue**: By requesting `/share/%2F..%2Fapi/auth/session`, the CDN treats the request as a resource located within the "safe-to-cache" `/share/` directory. However, the origin server interprets the path traversal, moving up one level and serving the sensitive `/api/auth/session` endpoint.
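The disagreement between the two parsers can be reproduced offline. The following is a minimal sketch, assuming the CDN does a plain prefix match on the raw path and the origin percent-decodes and then collapses dot segments. The function names are hypothetical, and neither function reflects Cloudflare's or OpenAI's actual code.

```python
import posixpath
from urllib.parse import unquote, urlsplit

CACHE_PREFIX = "/share/"  # assumed prefix rule, per the description above


def cdn_is_cacheable(url: str) -> bool:
    """Model of the CDN: match the raw, still-encoded path against the prefix."""
    raw_path = urlsplit(url).path  # urlsplit does not percent-decode
    return raw_path.startswith(CACHE_PREFIX)


def origin_resolved_path(url: str) -> str:
    """Model of the origin: percent-decode, then collapse '..' segments."""
    decoded = unquote(urlsplit(url).path)
    return posixpath.normpath(decoded)


url = "https://chat.openai.com/share/%2F..%2Fapi/auth/session?cachebuster=123"
print(cdn_is_cacheable(url))      # the CDN model sees a /share/ resource
print(origin_resolved_path(url))  # the origin model routes elsewhere
```

Under these assumptions the same URL is cacheable from the CDN's point of view and resolves to `/api/auth/session` at the origin, which is the mismatch the attack relies on.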
## Exploitation
- **Status**: PoC available/Reported and remediated (Bug Bounty awarded).
- **Complexity**: Low.
- **Attack Vector**: Network.
- **Mechanism**: An attacker entices a logged-in victim to click a specially crafted link containing a cache-buster (e.g., `hxxps[://]chat[.]openai[.]com/share/%2F..%2Fapi/auth/session?cachebuster=123`). This forces the victim's session token into the CDN cache. The attacker then visits the same URL to retrieve the cached session data.
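The victim-then-attacker sequence can be illustrated with a toy, in-memory model that makes no network requests. `ToyCdn`, `origin_response`, and the token values are hypothetical. The sketch assumes the cache key is the full URL including the query string, which is why a unique cache-buster guarantees a fresh entry that the victim populates.

```python
import posixpath
from typing import Dict, Optional, Tuple
from urllib.parse import unquote, urlsplit


def origin_response(url: str, session_token: Optional[str]) -> str:
    """Toy origin: decodes and normalizes the path before routing."""
    path = posixpath.normpath(unquote(urlsplit(url).path))
    if path == "/api/auth/session":
        return f"session={session_token}"
    return "shared chat page"


class ToyCdn:
    """Toy CDN: caches any response whose raw path starts with /share/."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}

    def fetch(self, url: str, session_token: Optional[str] = None) -> Tuple[str, str]:
        if url in self.cache:
            return "HIT", self.cache[url]
        body = origin_response(url, session_token)
        if urlsplit(url).path.startswith("/share/"):
            self.cache[url] = body  # keyed on the full URL, query included
        return "MISS", body


cdn = ToyCdn()
link = "https://chat.openai.com/share/%2F..%2Fapi/auth/session?cachebuster=123"
victim_view = cdn.fetch(link, session_token="victim-token")  # logged-in victim
attacker_view = cdn.fetch(link)                              # no credentials
print(victim_view, attacker_view)
```

In this model the attacker's unauthenticated request is served the victim's cached session body, mirroring the flow described above.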
## Impact
- **Confidentiality**: High (Exposure of session tokens, chat history, and billing information).
- **Integrity**: High (Full account takeover possible).
- **Availability**: Low.
## Remediation
### Patches
- **OpenAI Fix**: OpenAI updated the caching configuration and path normalization handling at the edge. The caching rules for the "share" feature were restricted so that traversal-style paths no longer qualify for caching.
### Workarounds
- **Strict Normalization**: Ensure CDNs and Origin servers use identical normalization logic for URL-encoded characters.
- **Cache-Control Headers**: Explicitly set `Cache-Control: no-store, private` on all sensitive API endpoints. CDNs generally honor these headers, but an edge rule that forces its own TTL can override them, so verify the edge configuration as well.
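Both workarounds can be sketched as origin-side helpers. This is an illustrative sketch, not OpenAI's fix: `should_reject`, `cache_headers_for`, and the `SENSITIVE_PREFIXES` list are hypothetical, and rejecting every encoded slash, backslash, or dot is a deliberately strict policy that may need loosening for applications that legitimately use such paths.

```python
import re
from typing import Dict
from urllib.parse import unquote

# Encoded slash, backslash, or dot can hide path structure from a
# prefix-matching edge, so this sketch treats them as suspicious.
_ENCODED_SEPARATOR = re.compile(r"%(2f|5c|2e)", re.IGNORECASE)

SENSITIVE_PREFIXES = ("/api/auth/",)  # hypothetical list of sensitive routes


def should_reject(raw_path: str) -> bool:
    """Return True for paths the edge and origin might interpret differently."""
    if _ENCODED_SEPARATOR.search(raw_path):
        return True
    # Also refuse literal dot segments once the path is decoded.
    return any(seg in (".", "..") for seg in unquote(raw_path).split("/"))


def cache_headers_for(resolved_path: str) -> Dict[str, str]:
    """Headers the origin would attach after routing to resolved_path."""
    if resolved_path.startswith(SENSITIVE_PREFIXES):
        return {"Cache-Control": "no-store, private"}
    return {}


print(should_reject("/share/%2F..%2Fapi/auth/session"))
print(cache_headers_for("/api/auth/session"))
```

Refusing ambiguous paths outright avoids having to prove that two independent parsers normalize identically, and the header acts as a second layer if a crafted path still reaches a sensitive route.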
## Detection
- **Indicators of Compromise**:
  - Unusual requests in logs containing `%2F..%2F` or encoded traversals within paths meant for static or shared content.
  - Multiple different IP addresses accessing the same unique `/share/` URL (one victim, one attacker).
- **Detection Methods**: Monitor for `Cf-Cache-Status: HIT` on sensitive API paths or paths containing traversal characters.
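The indicators above can be combined into a simple log review. The sketch below assumes log entries have already been parsed into `(client_ip, raw_url, cf_cache_status)` tuples. That tuple layout, the `review_log` name, and the regular expression are assumptions for illustration and would need adapting to the real log format. Because legitimate share links are also opened from many IPs, the multi-IP check is applied only to URLs that contain a traversal.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# A separator, two dots, and another separator, each plain or percent-encoded.
ENCODED_TRAVERSAL = re.compile(r"(%2f|/)(\.|%2e){2}(%2f|/)", re.IGNORECASE)


def review_log(
    entries: Iterable[Tuple[str, str, str]]
) -> Tuple[List[str], Dict[str, Set[str]]]:
    """Return (traversal URLs served from cache, traversal URLs seen from >1 IP)."""
    cached_traversal: List[str] = []
    ips_by_url: Dict[str, Set[str]] = defaultdict(set)
    for client_ip, raw_url, cache_status in entries:
        if not ENCODED_TRAVERSAL.search(raw_url):
            continue
        ips_by_url[raw_url].add(client_ip)
        if cache_status.upper() == "HIT":
            cached_traversal.append(raw_url)
    multi_ip = {url: ips for url, ips in ips_by_url.items() if len(ips) > 1}
    return cached_traversal, multi_ip


log = [
    ("203.0.113.5", "/share/%2F..%2Fapi/auth/session?cachebuster=123", "MISS"),
    ("198.51.100.7", "/share/%2F..%2Fapi/auth/session?cachebuster=123", "HIT"),
    ("203.0.113.5", "/share/abc123", "HIT"),
]
print(review_log(log))
```

With this sample input, only the crafted URL is flagged: it was served from cache and was requested by two distinct addresses, while the ordinary share link is ignored.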
## References
- [Harel Security Research Original Writeup](https://nokline.github.io/bugbounty/2024/02/04/ChatGPT-Cache-Deception.html)
- [Nagli’s Initial Discovery (Related Vulnerability)](https://www.shockwave.cloud/blog/shockwave-works-with-openai-to-fix-critical-chatgpt-vulnerability)