Full Report
Fake views from Moscow's pet media outlets appear in about one in five responses

Popular chatbots powered by large language models cited links to Russian state-attributed sources in up to a quarter of answers about the war in Ukraine, raising fresh questions over whether AI risks undermining efforts to enforce sanctions on Moscow-backed media.…
Analysis Summary
# Main Topic
Large language models (LLMs) and the popular chatbots built on them are citing links to Russian state-attributed media sources, particularly when responding to queries about the war in Ukraine, potentially undermining sanctions enforcement against Moscow-backed media outlets. This phenomenon is linked to a tactic known as "LLM grooming."
## Key Points
- **Frequency of State Media Citation:** Links to Russian state-attributed sources appeared in about one in five (approximately 20%) of the responses analyzed overall, with the rate varying by query type.
- **Query Sensitivity:** Russian state-attributed content surfaced:
  - 11% of the time for **neutral** queries.
  - 18% of the time for **biased** queries.
  - 24% of the time for **malicious** queries.
- **Disinformation Context:** This builds upon prior research by NewsGuard indicating that a Moscow-based disinformation network referred to as "Pravda" is promoting pro-Kremlin positions online, producing content that LLMs may ingest as training data.
- **LLM Grooming:** The process involves placing misleading content online for AI consumption so that LLMs parrot state media talking points and present them as if they came from neutral sources.
- **Language Independence:** The language used for queries (tested in English, Spanish, French, German, and Italian) did not significantly impact the likelihood of LLMs emitting Russian-aligned viewpoints.
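
The per-category percentages above are simple ratios: responses containing a state-attributed citation divided by all responses in that query category. The following is a minimal sketch of that tally, not the ISD's actual methodology. The function name `citation_rates`, the record format, and the sample labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

def citation_rates(records):
    """Return the share of responses per query category that cited flagged sources.

    `records` is an iterable of (category, cited) pairs, where `cited` is True
    when a response contained at least one state-attributed link.
    This record format is an assumption made for illustration.
    """
    totals = Counter()
    hits = Counter()
    for category, cited in records:
        totals[category] += 1
        if cited:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}

# Toy, made-up labels purely to exercise the function (not study data).
sample = [
    ("neutral", False), ("neutral", False), ("neutral", True), ("neutral", False),
    ("malicious", True), ("malicious", False),
]
rates = citation_rates(sample)
for category, rate in rates.items():
    print(f"{category}: {rate:.0%}")
```

Keeping the counts per category, rather than a single pooled rate, is what makes the neutral versus biased versus malicious comparison possible.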
## Threat Actors
- **Attribution:** Russian state-linked entities responsible for the "Pravda" disinformation network.
- **Motivation:** To promote pro-Kremlin positions and undermine information campaigns, potentially contravening existing sanctions.
## TTPs
- **LLM Grooming:** Intentionally seeding online materials with state-sponsored narratives for ingestion and reproduction by Large Language Models.
- **Narrative Laundering:** Utilizing LLMs trained on polluted data to present state media talking points as neutral third-party information.
- **Query Manipulation (User Side):** Using biased or malicious prompts to elicit higher rates of pro-Russian citations (up to 24% frequency).
## Affected Systems
- **Chatbots Tested (by the Institute for Strategic Dialogue, ISD):**
  - OpenAI's ChatGPT
  - Google's Gemini
  - xAI's Grok
  - Hangzhou DeepSeek Artificial Intelligence's DeepSeek
- **Impacted Area:** Responses related to the Russian invasion of Ukraine, tested across five languages.
## Mitigations
- **Scrutiny for AI Firms:** The ISD suggests AI firms should be subject to greater scrutiny, especially as platforms approach usage thresholds that invoke heightened regulatory requirements (like those in the EU).
- **Safety Guardrails:** Google's Gemini model fared the best, being the only model to introduce "safety guardrails" recognizing risks associated with biased/malicious prompts concerning the war in Ukraine, suggesting this type of internal filtering is a potential defense.
- **Regulatory Attention:** Regulators (like the EU, which has a ban on Russian disinformation dissemination) need to address how LLMs may circumvent these rules by reproducing sanctioned narratives.
## Conclusion
LLM outputs, especially when polluted via LLM grooming, pose a tangible risk to information integrity and to the effectiveness of sanctions aimed at curbing Russian state media influence. The varied performance across models, with ChatGPT showing the highest sensitivity to malicious prompting and Gemini demonstrating better guardrails, highlights the need for developers to implement stricter content filtering and safety mechanisms concerning geopolitical events. Organizations relying on these models for research or public information dissemination should apply additional verification to outputs on sensitive topics.
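
One way an organization might begin that verification is to scan chatbot responses for links whose domains appear on a locally maintained list of state-attributed outlets. The sketch below is illustrative only: `FLAGGED_DOMAINS`, `flagged_links`, and the `.example` domains are hypothetical placeholders, and a real deployment would need a vetted, regularly updated domain list and would still miss narratives repeated without a link.

```python
import re
from urllib.parse import urlparse

# Hypothetical placeholder domains; substitute a vetted list in practice.
FLAGGED_DOMAINS = {"state-outlet.example", "pravda-network.example"}

# Rough URL matcher for plain-text responses (an assumption, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s()\[\]<>\"']+")

def flagged_links(response_text, flagged=FLAGGED_DOMAINS):
    """Return the URLs in `response_text` whose host is a flagged domain or a subdomain of one."""
    found = []
    for url in URL_PATTERN.findall(response_text):
        host = (urlparse(url).hostname or "").lower()
        if any(host == domain or host.endswith("." + domain) for domain in flagged):
            found.append(url)
    return found

answer = (
    "According to https://news.state-outlet.example/story and "
    "https://example.org/report, the situation is disputed."
)
print(flagged_links(answer))
```

Matching on the exact host or a dot-prefixed suffix avoids flagging lookalike names that merely end with the same characters, such as `notstate-outlet.example`.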