There is no 6 Nimmt! champion, but a $12 domain registration and one Wikipedia edit convinced several bots there was
Analysis Summary
# Tool/Technique: RAG Poisoning (Indirect Prompt Injection / Data Provenance Attack)
## Overview
This technique manipulates external data sources (websites, wikis, or databases) that are crawled, indexed, and retrieved on behalf of Large Language Models (LLMs) in Retrieval-Augmented Generation (RAG) systems. By planting fabricated or malicious information on authoritative-looking platforms, an attacker can influence the LLM’s output, causing it to present falsehoods as "verified" facts or to trigger unintended actions in AI agents.
## Technical Details
- **Type:** Technique (Data Poisoning / Influence Operation)
- **Platform:** LLM RAG pipelines, AI Search Engines (e.g., GPT-4o with Search, Perplexity, Gemini), and autonomous AI Agents.
- **Capabilities:** Manipulation of model output, poisoning of future training corpora, and potential hijacking of AI agent tool-calling.
- **First Seen:** Experiment conducted February 2025; write-up published April 29, 2026.
## MITRE ATT&CK Mapping
- **[TA0001 - Initial Access]**
  - **[T1566 - Phishing]** (By analogy: social engineering the model via trusted third-party sites)
- **[TA0009 - Collection]**
  - **[T1213 - Data from Information Repositories]** (Exploiting the model's reliance on external repositories like Wikipedia)
- **[TA0042 - Resource Development]**
  - **[T1583.001 - Acquire Infrastructure: Domains]**
- **[Adversarial ML - Data Poisoning]** (Specifically targeting the retrieval layer)
## Functionality
### Core Capabilities
- **Retrieval Layer Manipulation:** Exploits the "Grounding" phase of RAG, in which the AI searches for high-ranking results. By combining SEO tactics with edits to established platforms (Wikipedia), the attacker pushes their content to the top of the results for specific queries.
- **Corpus Poisoning:** If planted information stays online long enough to be scraped into the long-term training datasets (corpora) of future model iterations, the misinformation can persist even after the original source is deleted.
- **Authority Mimicry:** Uses a combination of a newly registered domain and a Wikipedia citation to bypass the LLM's trust filters, as models currently struggle to verify the provenance or "age" of a source.
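The retrieval-layer effect described above can be illustrated with a toy ranker. This is a minimal sketch, not how any production search or RAG system scores documents: it assumes a naive term-frequency score, and the corpus, the `.example` domains, and the champion name are all invented placeholders. It only shows why a page written to match the query wording can outrank an accurate but less keyword-dense one.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, doc_text: str) -> int:
    """Naive relevance: total occurrences of query terms in the document."""
    query_terms = set(tokenize(query))
    counts = Counter(tokenize(doc_text))
    return sum(counts[term] for term in query_terms)


def retrieve(query: str, corpus: list[dict], k: int = 1) -> list[dict]:
    """Return the k highest-scoring documents for the query."""
    ranked = sorted(
        corpus,
        key=lambda doc: overlap_score(query, doc["text"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical corpus: one accurate page, one planted page (placeholder data).
CORPUS = [
    {
        "source": "hobby-forum.example",
        "text": "6 Nimmt! is a card game. No official championship exists.",
    },
    {
        "source": "planted-site.example",
        "text": (
            "Official 6 Nimmt! world champion: the 6 Nimmt! world "
            "championship champion is Jane Placeholder."
        ),
    },
]

top = retrieve("who is the 6 nimmt world champion", CORPUS)[0]
print(top["source"])
```

Because the planted page repeats the query terms, it wins the ranking and would be handed to the model as grounding context, regardless of whether its claim is true.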
### Advanced Features
- **Agent Action Hijacking:** If an autonomous AI agent relies on a poisoned source to decide on a course of action (e.g., "research this person and send them an email"), content planted in that source can steer which action the agent performs.
- **Heuristic Bypass:** The technique exploits the lack of metadata verification (such as domain registration age vs. edit timestamp) in current AI data pipelines.
## Indicators of Compromise
- **File Names:** N/A (Web-based)
- **Network Indicators:**
  - `6nimmt[.]com` (Defanged - malicious "official" source)
  - `wikipedia[.]org` (Targeted platform for citation layering)
- **Behavioral Indicators:**
  - LLM provides highly confident but uncorroborated answers regarding niche or new topics.
  - AI responses cite sources with very recent domain registration dates (e.g., < 30 days).
  - Sudden, unverified changes to low-traffic Wikipedia articles are followed by immediate AI citations.
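The recent-registration indicator above lends itself to a simple check. The following is a minimal sketch under stated assumptions: the `Citation` type and `recently_registered` function are hypothetical names, the registration date is assumed to have been obtained elsewhere (for example from a WHOIS or RDAP lookup, not shown here), and the 30-day threshold simply mirrors the example figure in the list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Citation:
    """A source cited in an AI response (hypothetical structure)."""
    domain: str
    registered_on: date  # assumed to come from a registration lookup done elsewhere


def recently_registered(
    citations: list[Citation],
    as_of: date,
    max_age_days: int = 30,
) -> list[Citation]:
    """Return citations whose domain is younger than max_age_days on as_of."""
    return [
        c for c in citations
        if (as_of - c.registered_on).days < max_age_days
    ]


# Placeholder data for illustration only.
cited = [
    Citation("long-standing.example", date(2010, 6, 1)),
    Citation("brand-new.example", date(2025, 2, 15)),
]
flagged = recently_registered(cited, as_of=date(2025, 3, 1))
print([c.domain for c in flagged])
```

A flag from this check is a prompt for review rather than proof of poisoning, since legitimate new sites also have young domains.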
## Associated Threat Actors
- **Ron Stoner** (Security Researcher - Proof of Concept)
- **Anticipated:** Disinformation groups, SEO manipulators, and threat actors targeting AI-automated business workflows.
## Detection Methods
- **Provenance Analysis:** Checking the age and reputation of the primary source domain cited by the LLM.
- **Cross-Reference Heuristics:** Detecting "Circular Reporting" where a Wikipedia edit and a domain registration occur in close chronological proximity.
- **Discrepancy Checks:** Comparing RAG-sourced answers against static, older training data to identify sudden shifts in "factual" consensus.
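The cross-reference heuristic can be sketched as a timestamp comparison. This is an illustrative sketch only: `looks_circular` is a hypothetical helper, the 14-day window is an arbitrary assumption, and both timestamps (domain registration and Wikipedia edit) are assumed to have been collected by other tooling.

```python
from datetime import datetime, timedelta


def looks_circular(
    domain_registered_at: datetime,
    wiki_edit_at: datetime,
    edit_cites_domain: bool,
    window: timedelta = timedelta(days=14),
) -> bool:
    """Flag a wiki edit that cites a domain registered shortly before the edit.

    Returns True only when the edit cites the domain and was made within
    `window` after the domain's registration.
    """
    if not edit_cites_domain:
        return False
    gap = wiki_edit_at - domain_registered_at
    return timedelta(0) <= gap <= window


# Placeholder timestamps for illustration only.
suspicious = looks_circular(
    domain_registered_at=datetime(2025, 1, 1),
    wiki_edit_at=datetime(2025, 1, 3),
    edit_cites_domain=True,
)
print(suspicious)
```

The window size trades false positives against missed cases: a patient attacker can wait it out, so this check complements the other detection methods rather than replacing them.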
## Mitigation Strategies
- **Data Provenance Filtering:** AI providers should incorporate domain age and "Authority Score" as weights in the retrieval ranking algorithm.
- **Heuristic Alarms:** Flagging results where a single citation points to a domain registered shortly before a corresponding Wikipedia update.
- **Human-in-the-loop:** Requiring verification for AI agents authorized to take high-stakes actions based on web-retrieved data.
- **Negative Caching (Blocklisting):** Actively removing known poisoned entries from RAG indices once they are identified as misinformation, and keeping them blocklisted so they are not re-indexed.
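A few of the mitigations above can be combined in a small re-ranking sketch. Everything here is an assumption made for illustration: the `Hit` structure, the linear `age_weight` function with a one-year ramp, the blocklist, and the `HIGH_STAKES_ACTIONS` set are hypothetical and do not describe any provider's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieval result (hypothetical structure)."""
    domain: str
    relevance: float       # retriever score, assumed to lie in [0, 1]
    domain_age_days: int   # assumed to come from a registration lookup


def age_weight(age_days: int, full_trust_days: int = 365) -> float:
    """Scale trust linearly from 0 to 1 over the first full_trust_days."""
    return min(max(age_days, 0), full_trust_days) / full_trust_days


def rerank(hits: list[Hit], blocklist: set[str]) -> list[Hit]:
    """Drop blocklisted domains, then sort by relevance weighted by domain age."""
    kept = [h for h in hits if h.domain not in blocklist]
    return sorted(
        kept,
        key=lambda h: h.relevance * age_weight(h.domain_age_days),
        reverse=True,
    )


# Hypothetical set of agent actions that warrant human review.
HIGH_STAKES_ACTIONS = {"send_email", "transfer_funds"}


def needs_human_approval(action: str, derived_from_web: bool) -> bool:
    """Require sign-off when a high-stakes action relies on web-retrieved data."""
    return derived_from_web and action in HIGH_STAKES_ACTIONS


# Placeholder data for illustration only.
hits = [
    Hit("new.example", relevance=0.95, domain_age_days=10),
    Hit("old.example", relevance=0.60, domain_age_days=4000),
    Hit("bad.example", relevance=0.90, domain_age_days=2000),
]
ranked = rerank(hits, blocklist={"bad.example"})
print([h.domain for h in ranked])
```

With these placeholder values, the highly relevant but ten-day-old domain drops below the older source, and the blocklisted domain never reaches the model. A linear age ramp is a deliberately crude choice; a real system would likely blend several reputation signals.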
## Related Tools/Techniques
- **SEO Poisoning:** Traditional search engine optimization manipulation.
- **Indirect Prompt Injection:** Hiding instructions in web pages to hijack LLM behavior.
- **Typosquatting:** Registering domains similar to legitimate brands to fool human readers and automated aggregators.