Full Report
Agentic browsing appears to be the future of Chrome and other web browsers. Unlike other types of attacks, prompt injection is not something that can be fully "solved" in the traditional sense. This article details how the Chrome browser is attempting to prevent indirect prompt injection from hijacking the user's browser. After reviewing built-in protections from Gemini and other agent security principles, they are adding a new feature called user alignment critic and better origin isolation. The main planning model in Gemini uses page content in Chrome to determine the next action. Naturally, this is a great place for prompt injection because it may contain attacker-controlled content. They use spotlighting and train Gemini against attacks, but this still isn't enough. The user alignment critic is a separate model that evaluates the output of each action. Notably, it must serve the user's end goal. So, if the user is trying to view a store's address and the planning model attempts to initiate a bank transfer, that will obviously be rejected. The critic model is only allowed to see the metadata of the result and not have any unfiltered content. In practice, this makes the critic module immune to prompt injection. This helps prevent both goal hijacking and data exfiltration. The next protection is around site isolation. Agents can operate across websites, which violates this key principle. So, a prompt injection from site A could compromise site B. To address this, they are adding Agent Origin Sets, which limit the domains an Agent can access to those strictly required for the task. For each task, there is a gating function that is used to decide whether domains by the planner are relevant to the task or not. The design has two types: read-only origins and read-write origins. As with the alignment critic, the gating functions are not exposed to prompt-injection risks. Users can add origins as needed to complete the task as well. Part of the security belongs to the user. If you give a bot access to your bank and they steal your money, that's on you. The origins being used still need to be verified by the user. Some domains require explicit approval, such as banks and Google Password Manager, while others only require permission for the gating functions. On the reactive side, they have realtime scanning of pages to detect prompt injection attacks. There's an additional classifier that detects prompt injections and will reject the page if it's usable. They even have persistent red-team bots that try to derail the agentic browser. This article is great and echoes a great principle: design with security in mind. By having site isolation and the built-in critic alignment checker, derailing the Agent to perform malicious actions will be much harder. Great post!
Analysis Summary
# Best Practices: Architecting Security for Agentic Browsing
## Overview
As web browsers transition from static viewers to "agentic" platforms where AI (like Gemini) executes actions on behalf of users, a new attack surface emerges: **Indirect Prompt Injection**. These practices address the risk of malicious website content hijacking an AI agent’s planning process to perform unauthorized actions, such as data exfiltration or fraudulent transactions.
## Key Recommendations
### Immediate Actions
1. **Enable "Spotlighting" and Isolation:** If developing or deploying agentic tools, ensure the LLM can distinguish between system instructions and untrusted page content (input/output separation).
2. **User-in-the-Loop for Sensitive Origins:** Require manual, explicit approval before an AI agent can interact with high-value domains (e.g., banking portals, internal HR systems, or password managers).
3. **Implement Real-time Injection Scanning:** Deploy classifiers to scan active page content for known prompt injection patterns and derail the agent if a threat is detected.
### Short-term Improvements (1-3 months)
1. **Deploy a "User Alignment Critic":** Implement a secondary, isolated model that evaluates the *intent* of the AI’s proposed action against the user’s original goal before execution.
2. **Establish Agent Origin Sets (AOS):** Define a "least privilege" list of domains an agent is permitted to access for specific tasks.
3. **Define Read/Write Permissions:** Categorize allowed domains into "Read-Only" vs. "Read-Write" to prevent an agent from inadvertently submitting forms or deleting data on sensitive sites.
### Long-term Strategy (3+ months)
1. **Continuous Red-Teaming:** Deploy persistent, automated "red-team bots" designed to mimic attackers and attempt to derail agentic logic in a sandbox environment.
2. **Architectural Site Isolation:** Ensure the agentic framework maintains strict origin boundaries so that a compromise in "Site A" cannot programmatically move to "Site B" without a new gating evaluation.
## Implementation Guidance
### For Small Organizations
- **Focus on User Controls:** Rely on built-in browser protections and train staff to never "auto-approve" AI requests for sensitive site access.
- **Limit Agent Scope:** Use agentic features only for information gathering (Read-Only) rather than automated task execution.
### For Medium Organizations
- **Policy Configuration:** Use Enterprise Browser policies to whitelist specific "Agent Origin Sets" for corporate tasks.
- **Gating Functions:** Implement a manual review step for any AI action that involves transferring data between internal and external domains.
### For Large Enterprises
- **Custom Critic Models:** Develop a proprietary "User Alignment Critic" that understands company-specific workflows to flag business-logic deviations.
- **Automated Scanning:** Integrate the agentic browser logs into a Security Operations Center (SOC) to monitor for mass exfiltration attempts via AI-driven browsing.
## Configuration Examples
### Agent Origin Set (Logic Structure)
json
{
"Task": "Travel Booking",
"Allowed_Read_Write": ["company-travel-portal.com", "approved-airline.com"],
"Allowed_Read_Only": ["google.com/maps", "weather.com"],
"Gating_Criteria": "Action must align with 'Book flight' intent"
}
### Critic Model Metadata Filter
*To prevent the Critic from being injected, it should only see metadata:*
- **Input to Critic:** `Action: POST /transfer; Destination: Unknown_Site; User_Goal: View_News`
- **Logic:** `Goal (View_News) != Action (Transfer) -> REJECT`
## Compliance Alignment
- **NIST AI RMF:** Aligns with the "Govern" and "Protect" functions by managing risks of third-party content influence.
- **OWASP Top 10 for LLMs:** Directly addresses **LLM01: Prompt Injection** and **LLM02: Insecure Output Handling**.
- **CIS Controls:** Aligns with Control 9 (Browser Information Protection).
## Common Pitfalls to Avoid
- **Over-Reliance on a Single Model:** Never allow the same model that reads the page to also authorize the action.
- **Ignoring Data Exfiltration:** Attackers may not just "act" but simply "read" sensitive data and send it to an attacker-controlled origin via a legitimate-looking search query.
- **The "Auto-Pilot" Fallacy:** Assuming that because an AI is "smart," it can handle security decisions. Security decisions must remain aligned with user intent.
## Resources
- **Google Online Security Blog:** security[.]googleblog[.]com
- **OWASP LLM Security Project:** owasp[.]org/www-project-top-10-for-large-language-model-applications/
- **NIST Artificial Intelligence Risk Management Framework:** nist[.]gov/itl/ai-rmf