Architecting Security for Agentic Capabilities in Chrome

Full Report

Agentic browsing appears to be the future of Chrome and other web browsers. Unlike other types of attacks, prompt injection is not something that can be fully "solved" in the traditional sense. This article details how the Chrome browser is attempting to prevent indirect prompt injection from hijacking the user's browser. After reviewing built-in protections from Gemini and other agent security principles, they are adding a new feature called user alignment critic and better origin isolation. The main planning model in Gemini uses page content in Chrome to determine the next action. Naturally, this is a great place for prompt injection because it may contain attacker-controlled content. They use spotlighting and train Gemini against attacks, but this still isn't enough. The user alignment critic is a separate model that evaluates the output of each action. Notably, it must serve the user's end goal. So, if the user is trying to view a store's address and the planning model attempts to initiate a bank transfer, that will obviously be rejected. The critic model is only allowed to see the metadata of the result and not have any unfiltered content. In practice, this makes the critic module immune to prompt injection. This helps prevent both goal hijacking and data exfiltration. The next protection is around site isolation. Agents can operate across websites, which violates this key principle. So, a prompt injection from site A could compromise site B. To address this, they are adding Agent Origin Sets, which limit the domains an Agent can access to those strictly required for the task. For each task, there is a gating function that is used to decide whether domains by the planner are relevant to the task or not. The design has two types: read-only origins and read-write origins. As with the alignment critic, the gating functions are not exposed to prompt-injection risks. Users can add origins as needed to complete the task as well. Part of the security belongs to the user. If you give a bot access to your bank and they steal your money, that's on you. The origins being used still need to be verified by the user. Some domains require explicit approval, such as banks and Google Password Manager, while others only require permission for the gating functions. On the reactive side, they have realtime scanning of pages to detect prompt injection attacks. There's an additional classifier that detects prompt injections and will reject the page if it's usable. They even have persistent red-team bots that try to derail the agentic browser. This article is great and echoes a great principle: design with security in mind. By having site isolation and the built-in critic alignment checker, derailing the Agent to perform malicious actions will be much harder. Great post!

Analysis Summary