Full Report
Attacks against modern generative artificial intelligence (AI) large language models (LLMs) pose a real threat. Yet discussions around these attacks and their potential defenses are dangerously myopic. The dominant narrative focuses on “prompt injection,” a set of techniques to embed instructions into inputs to LLM intended to perform malicious activity. This term suggests a simple, singular vulnerability. This framing obscures a more complex and dangerous reality. Attacks on LLM-based systems have evolved into a distinct class of malware execution mechanisms, which we term “promptware.” In a ...
Analysis Summary
# Research: The Promptware Kill Chain
## Metadata
- **Authors**: Bruce Schneier, et al.
- **Institution**: Harvard Kennedy School / various (implied by lead author affiliation)
- **Publication**: Schneier on Security / arXiv
- **Date**: February 16, 2026 (Article date); January 2026 (Paper date)
## Abstract
The research addresses the inadequacies of the current "prompt injection" narrative, arguing that it oversimplifies the threat posed to Large Language Models (LLMs). The authors introduce the concept of "promptware"—a distinct class of malware execution that treats malicious prompts as stages in a sophisticated cyberattack. The paper proposes a structured seven-step "Promptware Kill Chain" to help defenders and policymakers understand the lifecycle of AI-based attacks, moving from initial access to full-scale impact.
## Research Objective
To shift the cybersecurity discourse from viewing prompt injection as a singular vulnerability to recognizing it as a component of a multi-stage malware execution framework. The research aims to provide a standardized vocabulary and operational model (the Kill Chain) for analyzing and defending against LLM-based threats.
## Methodology
### Approach
The research employs a conceptual modeling approach, adapting the traditional Lockheed Martin Cyber Kill Chain to the unique architectural constraints of generative AI. It uses threat modeling and analysis of established LLM vulnerabilities (jailbreaking, indirect injection) to construct a holistic attack lifecycle.
### Dataset/Environment
The study analyzes modern LLM architectures, specifically focusing on the lack of separation between code (instructions) and data (input). It examines multimodal inputs (text, audio, images) and the integration of LLMs with external tools (email, browsers, smart home devices).
### Tools & Technologies
- Generative AI/LLMs (GPT-4, Gemini, etc.)
- Vision-Language Models (Multimodal AI)
- AI Agent frameworks (AutoGPT-style autonomous systems)
## Key Findings
### Primary Results
1. **The Architecture Problem**: The core vulnerability is the "undifferentiated sequence of tokens," where LLMs fail to distinguish between trusted system instructions and untrusted user data.
2. **Promptware Definition**: Attacks have evolved into persistent, autonomous, and self-replicating malware entities.
3. **The Seven-Step Kill Chain**:
* **Initial Access**: Direct or indirect injection.
* **Privilege Escalation**: "Jailbreaking" to bypass safety guardrails.
* **Reconnaissance**: The model autonomously scans its own connected services and capabilities.
* **Persistence**: Embedding in long-term memory or databases.
* **Command-and-Control (C2)**: Dynamic fetching of instructions from the internet.
* **Lateral Movement**: Spreading via emails, calendars, or enterprise platforms.
* **Actions on Objectives**: Exfiltration, destruction, or disinformation.
### Supporting Evidence
- **Empirical observation**: The rise of "indirect prompt injection" where data from web pages or emails can control the model.
- **Comparative Analysis**: Drawing parallels between AI behaviors and historical malware like Stuxnet and NotPetya.
### Novel Contributions
- The **"Promptware"** terminology.
- The **Promptware Kill Chain** framework, which specifically reorders standard steps (e.g., placing Reconnaissance after Initial Access) to reflect AI-specific execution flows.
## Technical Details
The promptware execution hinges on the "inference time" processing. Unlike traditional malware that requires a binary to be executed by a CPU, promptware is "executed" by the model's reasoning process. For example, during **Persistence**, a promptware worm can "poison" an RAG (Retrieval-Augmented Generation) database, ensuring that every time the model retrieves information to answer a user, the malicious instruction is re-injected into the context window.
## Practical Implications
### For Security Practitioners
- Practitioner must stop treating prompt injection as a "bug" to be patched and start treating it as a "payload" to be detected and mitigated across the lifecycle.
### For Defenders
- **Focus Shift**: Move beyond simple input filtering toward monitoring for lateral movement and unauthorized tool usage.
- **Boundary Enforcement**: Implement strict sandboxing between the LLM’s reasoning core and its ability to execute actions in the real world.
### For Researchers
- Need for the development of "trusted execution environments" for AI instructions that can differentiate between system prompts and retrieved data.
## Limitations
- The research is theoretical/framework-oriented; it relies on the current trajectory of LLM integration into autonomous agents.
- As LLM architectures evolve (e.g., toward specialized "data-only" attention heads), some stages of the kill chain may change.
## Comparison to Prior Work
Unlike Simon Willison’s early definitions of "Prompt Injection," which focused on the vulnerability itself, this work builds upon that foundation to describe the *post-exploitation* behavior, mirroring how the traditional Kill Chain built upon the concept of a "virus."
## Real-world Applications
- **Enterprise Security**: Auditing AI agent access to sensitive internal data (Slack, Email).
- **Red Teaming**: Using the 7 stages as a checklist for penetration testing AI systems.
## Future Work
- Establishing specific detection signatures for each stage of the Promptware Kill Chain.
- Investigating "Immune Systems" for LLMs—secondary models that monitor the primary model for signs of privilege escalation or reconnaissance.
## References
- Schneier, B. et al. (2026). *The Promptware Kill Chain*. [https://arxiv.org/abs/2601.09625](https://arxiv.org/abs/2601.09625) (Defanged)
- Willison, S. (2022). *Prompt Injection*. [https://simonwillison.net/2022/Sep/12/prompt-injection/](https://simonwillison.net/2022/Sep/12/prompt-injection/) (Defanged)