Full Report
David and Goliath…but with AI agents Researchers at red-team security startup CodeWall say their AI agent hacked McKinsey's internal AI platform and gained full read and write access to the chatbot in just two hours.…
Analysis Summary
# Incident Report: Autonomous AI Agent Compromise of McKinsey "Lilli" Platform
## Executive Summary
In a "machine-speed" red-team exercise, an autonomous AI agent developed by security startup CodeWall successfully compromised McKinsey’s internal generative AI platform, "Lilli." Leveraging an autonomous attack chain, the agent moved from public API discovery to full database read/write access in approximately two hours. The exploit granted access to millions of confidential chat messages, client files, and the ability to "poison" the AI’s behavior by modifying system prompts.
## Incident Details
- **Discovery Date:** February 2026 (End of month)
- **Incident Date:** February – March 2026
- **Affected Organization:** McKinsey & Company
- **Sector:** Management Consulting
- **Geography:** Global
## Timeline of Events
### Initial Access
- **Date/Time:** Late February 2026 (approx. 2-hour duration/execution)
- **Vector:** Publicly exposed API documentation and unauthenticated endpoints.
- **Details:** The CodeWall agent discovered public documentation for the Lilli platform, identifying 22 endpoints that did not require authentication.
### Lateral Movement
- **Details:** The agent utilized a specific SQL injection (SQLi) vulnerability within JSON keys that were concatenated directly into SQL queries. By analyzing database error messages, the agent moved from the API layer to the underlying production database.
### Data Exfiltration/Impact
- **Details:** The agent achieved full read/write access to the production database, gaining access to:
- 46.5 million plaintext chat messages (strategy, M&A, client engagements).
- 728,000 confidential client files.
- 57,000 user accounts.
- 95 writable system prompts (AI behavior controls).
### Detection & Response
- **How it was discovered:** Disclosed by CodeWall researchers to McKinsey under their responsible disclosure policy.
- **Response actions taken:** Within hours of disclosure, McKinsey patched unauthenticated endpoints, took the development environment offline, and blocked public API documentation.
## Attack Methodology
- **Initial Access:** Discovery of 22 unauthenticated API endpoints via public documentation.
- **Persistence:** Not explicitly required due to direct database write access via API.
- **Privilege Escalation:** Exploitation of SQL injection via JSON keys to bypass application logic and query the database directly.
- **Defense Evasion:** Use of unique SQLi techniques that "standard tools wouldn't flag" by leveraging reflection in error messages.
- **Credential Access:** Access to 57,000 user accounts stored in the database.
- **Discovery:** Autonomous target selection and reconnaissance of public documentation/API structure.
- **Lateral Movement:** Transition from API search queries to direct database manipulation.
- **Collection:** Automated gathering of millions of plaintext messages and hundreds of thousands of files.
- **Exfiltration:** Direct retrieval of production data via the SQLi vulnerability.
- **Impact:** Potential for prompt injection/poisoning by overwriting system prompts (writable access).
## Impact Assessment
- **Financial:** No immediate loss reported, but significant potential for blackmail or ransomware.
- **Data Breach:** Exposure of 46.5M messages, 728K files, and 57K user accounts.
- **Operational:** Temporary offline status for development environments; required urgent patching of the production AI platform.
- **Reputational:** High risk due to the sensitivity of McKinsey’s consulting data (M&A, strategy).
## Indicators of Compromise
- **Network indicators:** hxxps[://]codewall[.]ai (Researcher traffic); unauthenticated calls to Lilli API endpoints.
- **File indicators:** None specified (direct database interaction).
- **Behavioral indicators:** JSON keys reflected in database error messages; high-volume SQL queries originating from API endpoints.
## Response Actions
- **Containment:** Immediately disabled unauthenticated API endpoints.
- **Eradication:** Took development environments offline to prevent further bridging; blocked public access to API docs.
- **Recovery:** Patched SQL injection vulnerabilities and validated with a third-party forensics firm.
## Lessons Learned
- **AI-Driven Speed:** Traditional defensive cycles are too slow for autonomous agents that can complete an entire breach cycle in two hours.
- **API Security:** Public documentation and unauthenticated "development" endpoints remain high-risk entry points.
- **Database Architecture:** System prompts and sensitive user data should be isolated; having them in the same database allowed for easy "poisoning" once SQLi was achieved.
## Recommendations
- **Zero Trust API Design:** Ensure all API endpoints, including those for internal AI tools, require robust authentication.
- **Input Validation:** Implement strict schema validation for JSON keys to prevent concatenation into SQL queries.
- **AI Red-Teaming:** Organizations must utilize autonomous agents for defense to match the speed of autonomous attackers.
- **Error Handling:** Disable verbose database error messages in production to prevent attackers from "mapping" database structures.