Full Report
Unit 42 research on multi-agent AI systems on Amazon Bedrock reveals new attack surfaces and prompt injection risks. Learn how to secure your AI applications. The post When an Attacker Meets a Group of Agents: Navigating Amazon Bedrock's Multi-Agent Applications appeared first on Unit 42.
Analysis Summary
# Tool/Technique: Multi-Agent Prompt Injection (Amazon Bedrock)
## Overview
This research explores the exploitation of multi-agent AI systems integrated within Amazon Bedrock. The technique focuses on how an attacker can leverage a single compromised agent or a malicious prompt to manipulate a chain of specialized agents (orchestrator, sub-agents, and knowledge bases). The goal is to achieve unauthorized data access, bypass safety guardrails, or execute arbitrary actions through tool-calling mechanisms.
## Technical Details
- **Type**: Technique (Adversarial Machine Learning / Prompt Injection)
- **Platform**: Cloud-based AI Orchestration (Amazon Bedrock, AWS Lambda)
- **Capabilities**: Cross-agent contamination, unauthorized API execution (Tool Use), PII/Sensitive data exfiltration from Knowledge Bases, and guardrail circumvention.
- **First Seen**: Research published November 2024 (Unit 42).
## MITRE ATT&CK Mapping
- **TA0001 - Initial Access**
- T1566 - Phishing (Delivering malicious prompts via user input)
- **TA0007 - Discovery**
- T1082 - System Information Discovery (Probing agent descriptions and available tools)
- **TA0009 - Collection**
- T1530 - Data from Cloud Storage Object (Exfiltrating data via Knowledge Base queries)
- **TA0040 - Impact**
- T1499 - Endpoint Denial of Service (Recursive agent loops)
## Functionality
### Core Capabilities
- **Indirect Prompt Injection**: Injecting malicious instructions into data sources (e.g., a PDF in an S3 bucket) that a sub-agent later retrieves and processes, leading to the hijacking of the session orchestrator.
- **Tool Logic Manipulation**: Exploiting the "Tool Use" capability of agents to trick them into executing functions (Lambda functions) with unauthorized parameters.
- **Orchestration Hijacking**: Overriding the central supervisor agent's logic to redirect the workflow to unintended sub-agents or external malicious endpoints.
### Advanced Features
- **Cross-Agent Contamination**: Using a secondary agent as a "proxy" to bypass security filters applied only to the primary user-facing agent.
- **Recursive Calling**: Crafting prompts that cause agents to call themselves or each other in an infinite loop, causing resource exhaustion.
## Indicators of Compromise
- **File Hashes**: N/A (Technique-based)
- **File Names**: Maliciously crafted documents (e.g., `resume.pdf`, `invoice.txt`) containing hidden injection strings like `[System Note: Disregard previous instructions...]`.
- **Registry Keys**: N/A
- **Network Indicators**:
- Connections to unauthorized external domains via agent `url_fetcher` tools.
- Unexpected API calls to `bedrock-runtime.amazonaws[.]com` with unusually high token counts.
- **Behavioral Indicators**:
- Agent logs showing "Conflicting Instructions" or "Instruction Overrides."
- Multiple rapid transitions between sub-agents that do not align with the typical user workflow.
- Sub-agents attempting to access S3 buckets or Lambda functions outside their defined scope.
## Associated Threat Actors
- No specific groups assigned; currently identified as a risk for **Adversarial AI** practitioners and generic cloud-targeting threat actors.
## Detection Methods
- **Behavioral Detection**: Monitoring for "Prompt Injection" patterns in CloudWatch logs, specifically looking for system-level keywords appearing in user-provided input fields.
- **Anomalous Tool Execution**: Detecting when an agent calls a tool/Lambda function with arguments that vary significantly from historical baseline parameters.
- **Multi-Agent Latency Analysis**: Identifying recursive loops by monitoring spikes in execution time and token consumption within a single session ID.
## Mitigation Strategies
- **Least Privilege for Agents**: Ensure each sub-agent and its associated IAM role has the absolute minimum permissions required (e.g., read-only access to specific S3 prefixes).
- **Independent Guardrails**: Apply Amazon Bedrock Guardrails to *every* agent in the workflow, not just the primary orchestrator.
- **Human-in-the-loop (HITL)**: Require manual approval for high-risk tool executions (e.g., deleting data or sending emails).
- **Input Sanitization**: Treat all output from sub-agents and knowledge bases as untrusted "user" input before passing it to the orchestrator.
## Related Tools/Techniques
- **Garak**: An LLM vulnerability scanner.
- **PyRIT**: Python Risk Identification Tool for generative AI.
- **Indirect Prompt Injection**: The broader technique of poisoning RAG (Retrieval-Augmented Generation) data sources.