Full Report
Interesting research: “Humans expect rationality and cooperation from LLM opponents in strategic games.” Abstract: As Large Language Models (LLMs) integrate into our social and economic interactions, we need to deepen our understanding of how humans respond to LLMs opponents in strategic settings. We present the results of the first controlled monetarily-incentivised laboratory experiment looking at differences in human behaviour in a multi-player p-beauty contest against other humans and LLMs. We use a within-subject design in order to compare behaviour at the individual level. We show that, in this environment, human subjects choose significantly lower numbers when playing against LLMs than humans, which is mainly driven by the increased prevalence of ‘zero’ Nash-equilibrium choices. This shift is mainly driven by subjects with high strategic reasoning ability. Subjects who play the zero Nash-equilibrium choice motivate their strategy by appealing to perceived LLM’s reasoning ability and, unexpectedly, propensity towards cooperation. Our findings provide foundational insights into the multi-player human-LLM interaction in simultaneous choice games, uncover heterogeneities in both subjects’ behaviour and beliefs about LLM’s play when playing against them, and suggest important implications for mechanism design in mixed human-LLM systems...
Analysis Summary
# Research: Humans expect rationality and cooperation from LLM opponents in strategic games
## Metadata
- **Authors:** Not explicitly listed in text (Source: arXiv:2505.11011)
- **Institution:** Information not provided in snippet
- **Publication:** arXiv (Pre-print) via Schneier on Security
- **Date:** April 16, 2026 (Blog Date) / May 2025 (Paper ID)
## Abstract
This research explores the intersection of behavioral economics and Artificial Intelligence by conducting a monetarily-incentivized laboratory experiment. The study focuses on how humans adjust their strategic behavior when they perceive their opponent as a Large Language Model (LLM) rather than another human. Using a "p-beauty contest" game, the study finds that humans tend to treat LLMs as more rational and cooperative agents, leading to a significant increase in Nash-equilibrium choices (choosing 'zero') among strategically proficient human players.
## Research Objective
The primary objective is to determine whether humans change their strategic decision-making when interacting with LLMs in a multi-player, simultaneous-choice environment. Specifically, do human players perceive the LLM as a more rational actor than a human opponent, and how does that perception change the numbers they choose?
## Methodology
### Approach
- **Within-Subject Design:** Participants played against both humans and LLMs, allowing researchers to observe changes in behavior at the individual level.
- **Monetary Incentivization:** Participants received real financial rewards based on their performance to ensure decisions reflected genuine strategic intent.
- **Strategic Game:** The "p-beauty contest" (or "Guess the Number" game), where the goal is to guess a number closest to a fraction (p) of the average of all guesses.
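The rule of the game can be made concrete with a small sketch of a single round. The value `p = 2/3`, the player names, and the guesses are illustrative assumptions, not parameters or data from the paper.

```python
from statistics import mean


def beauty_contest_round(guesses, p=2 / 3):
    """Score one round of a p-beauty contest.

    guesses: dict mapping a player name to a guess in [0, 100].
    p: fraction of the average that defines the target.
       (2/3 is a common textbook value, assumed here.)
    Returns (target, winners); winners lists every player tied
    for the smallest distance to the target.
    """
    if not guesses:
        raise ValueError("need at least one guess")
    target = p * mean(guesses.values())
    best = min(abs(g - target) for g in guesses.values())
    winners = sorted(name for name, g in guesses.items()
                     if abs(g - target) == best)
    return target, winners


# Hypothetical guesses, for illustration only.
target, winners = beauty_contest_round({"A": 50, "B": 33, "C": 22, "D": 0})
print(round(target, 2), winners)  # 17.5 ['C']
```

In this example the player who guessed 0 loses: the equilibrium choice only wins when the other players also guess low.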
### Dataset/Environment
- A controlled laboratory setting involving multi-player human-LLM interactions.
- Comparison of choices (0-100) across two treatment conditions, Human-Human and Human-LLM, with each participant playing in both.
### Tools & Technologies
- **LLMs:** Used as active opponents in the game.
- **Game-Theoretic Framework:** The Nash equilibrium, the strategy profile in which no player can improve their outcome by unilaterally changing strategy. In this game, with p < 1, that is every player choosing 0.
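The equilibrium property can be checked by brute force. The sketch below assumes `p = 2/3`, four players, integer guesses from 0 to 100, and a winner-take-all prize split equally among ties. None of these are confirmed details of the experiment's payment rule, and the function names are hypothetical.

```python
from fractions import Fraction


def win_share(profile, i, p):
    """Share of the prize player i receives (ties split equally)."""
    target = p * Fraction(sum(profile), len(profile))
    dists = [abs(x - target) for x in profile]
    best = min(dists)
    if dists[i] != best:
        return Fraction(0)
    return Fraction(1, dists.count(best))


def is_symmetric_equilibrium(choice, n_players, p=Fraction(2, 3),
                             grid=range(101)):
    """True if no single player gains by deviating from `choice`
    when every other player keeps playing `choice`."""
    profile = [choice] * n_players
    base = win_share(profile, 0, p)
    for deviation in grid:
        deviated = [deviation] + profile[1:]
        if win_share(deviated, 0, p) > base:
            return False  # a profitable unilateral deviation exists
    return True


print(is_symmetric_equilibrium(0, 4))   # True
print(is_symmetric_equilibrium(50, 4))  # False
```

When everyone plays 50, a single player can win outright by undercutting (for example, by guessing 33), so that profile is not stable. When everyone plays 0, any deviation moves the deviator further from the target than the others.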
## Key Findings
### Primary Results
1. **Lower Average Numbers:** Human subjects chose significantly lower numbers when playing against LLMs compared to when they played against other humans.
2. **Frequency of Nash-Equilibrium Choices:** Subjects chose "zero" (the Nash-equilibrium choice, which pays off only if opponents are also expected to choose very low numbers) markedly more often when facing LLMs.
3. **Strategic Reasoning Gap:** This shift toward the "zero" choice was driven primarily by subjects with high strategic reasoning abilities.
4. **Perception of Cooperation:** Subjects who chose zero justified it by citing the LLMs' perceived reasoning ability and, unexpectedly, a propensity towards cooperation.
### Supporting Evidence
- Statistical significance in the shift toward lower numbers and zero choices when the human knew their opponent was an LLM.
- Qualitative motivations from subjects who explicitly cited the LLM's perceived "reasoning ability" as the reason for their strategy.
### Novel Contributions
- This is the first controlled, incentivized laboratory experiment to compare human behavior against human vs. LLM opponents in a multi-player p-beauty contest.
- It highlights a shift in beliefs: subjects who played the equilibrium choice attributed greater reasoning ability, and a propensity towards cooperation, to LLM opponents.
## Technical Details
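In game theory, the **p-beauty contest** is a standard game for measuring "iterated reasoning." To win, a player must not only work out the equilibrium answer but also predict how many steps of reasoning the *other* players will apply. In the level-k framework, a level-0 player guesses without strategic thought, a level-1 player best-responds to level-0 players, and so on; for p < 1, each additional step lowers the guess, and the limit is 0. If rationality is common knowledge, everyone chooses 0. The shift toward zero against LLMs is consistent with subjects expecting LLMs to reason through more steps than human opponents do; subjects who chose zero cited the LLMs' perceived reasoning ability and propensity towards cooperation.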
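A minimal sketch of level-k guesses follows. It assumes `p = 2/3` and a level-0 anchor of 50 (the midpoint of the 0-100 range, a common modelling convention), and it uses the simplification that a level-k player guesses p times the level-(k-1) guess. These are illustrative assumptions rather than the paper's specification.

```python
def level_k_guess(k, p=2 / 3, level0=50.0):
    """Guess of a level-k player under the simplified rule
    guess_k = p * guess_(k-1), with guess_0 = level0."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return level0 * p ** k


guesses = [round(level_k_guess(k), 2) for k in range(5)]
print(guesses)  # [50.0, 33.33, 22.22, 14.81, 9.88]
```

Each extra step of reasoning multiplies the guess by p, so the sequence shrinks geometrically toward the equilibrium value of 0 without reaching it at any finite level.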
## Practical Implications
### For Security Practitioners
- **Trust Modeling:** Humans may over-trust AI to act "by the book," making them vulnerable to deceptive AI that has been fine-tuned to exploit human expectations of rationality.
- **Social Engineering:** Adversaries could leverage the "halo effect" of LLM rationality to convince humans to follow specific (suboptimal or dangerous) protocols under the guise of "optimized" AI logic.
### For Defenders
- **System Design:** When building Human-AI systems, practitioners must account for the fact that humans may adopt different strategies (in this study, more equilibrium-like play) when they know an AI is involved.
- **Incentive Alignment:** Defenses must be designed knowing that human operators may defer to AI judgment even when the AI’s "rationality" is hallucinated or manipulated.
### For Researchers
- **Mixed Systems:** There is a need to study "Mechanism Design" for systems where LLMs and humans act concurrently, especially in financial or defensive settings where "rational" moves by AI could trigger "rational" (but potentially destabilizing) human reactions.
## Limitations
- **Pre-print Status:** The arXiv version is a pre-print. The References also list a ScienceDirect entry, so the peer-review status should be checked against that version.
- **Specific Game Context:** The results are derived from a p-beauty contest; behavior might differ in non-mathematical or high-stress environments.
- **LLM Consistency:** The study does not necessarily account for the variety of LLM "personalities" or fine-tuned biases which could drastically shift human trust.
## Comparison to Prior Work
While previous work has studied LLM performance *in* games, this study shifts the focus to the human's *reaction* to the LLM. It builds on classical behavioral economics (Allen et al., 2006) by introducing a non-human agent into the social calculation.
## Real-world Applications
- **Financial Markets:** Understanding how "algo-trading" and LLM-involved trading influence human volatility.
- **Negotiation Bots:** Designing AI agents for procurement or settlement that capitalize on (or correct for) human expectations of cooperation.
## Future Work
- **Heterogeneity of Beliefs:** Investigating why the shift toward zero is concentrated among subjects with high strategic reasoning ability, and what beliefs about LLM play drive the choices of other subjects.
- **Adversarial Rationality:** Exploring what happens when an LLM is intentionally programmed to be "irrational" to exploit human expectations.
## References
- Schneier, B. (2026). "Human Trust of AI Agents." Schneier on Security.
- [https://arxiv.org/pdf/2505.11011](https://arxiv.org/pdf/2505.11011)
- [https://www.sciencedirect.com/science/article/abs/pii/S0167268125004470](https://www.sciencedirect.com/science/article/abs/pii/S0167268125004470)