AI agents promise speed, but at what cost to trust? Dreadnode’s Wendiggensen & Palm unpack this dilemma through a hands-on study of leaked Russian data.
# Research: Auto-Poking The Bear: Analytical Tradecraft In The AI Age (LABScon25 Replay Summary)
## Metadata
- Authors: Martin Wendiggensen (Dreadnode/JHU AIST), Brad Palm (Dreadnode)
- Institution: Dreadnode; presented at LABScon 2025, hosted by SentinelOne (SentinelLABS)
- Publication: LABScon 2025 Presentation Replay Summary
- Date: October 9, 2025
## Abstract
This research explores the disruptive impact of Large Language Model (LLM)-driven agentic systems on established Cyber Threat Intelligence (CTI) analytical tradecraft. The authors investigate the inherent reliability and transparency costs introduced when analysts outsource data preparation, analysis, and workflow execution to AI assistants. They propose a framework for assessing these AI-assisted processes and communicating their limitations to maintain accountability in collaborative CTI research.
## Research Objective
To analyze how the integration of LLM-driven agentic systems into CTI workflows disrupts traditional collaborative research standards, and to develop a methodology for rigorously assessing the reliability, transparency, and limitations of AI-assisted analytical outputs.
## Methodology
### Approach
The research employed a comparative case study methodology centered around the deployment of an LLM-driven agentic system. This system was tasked with performing various analytical functions relevant to threat intelligence, ranging from simple data collation to complex analytical pipelines for adversary tracking.
### Dataset/Environment
The system was applied to analyze Russian internet content that had been leaked by Ukrainian cyber activists.
### Tools & Technologies
The core technology investigated was an **LLM-driven agentic system**, designed to automate parts of the CTI analysis lifecycle.
## Key Findings
### Primary Results
1. **Productivity Boost vs. Trust Deficit:** AI assistants significantly enhance productivity in CTI tasks, but their use introduces a critical dependency on the reliability of the underlying prompts and agentic workflow design.
2. **Erosion of Shared Understanding:** Reliance on opaque, AI-assisted methods threatens the collaborative foundation of CTI, as analysts must now question the methodological rigor underpinning a peer's AI-generated results.
3. **Need for New Assessment Standards:** The CTI community requires a new joint understanding and set of standards to evaluate the promises, pitfalls, and probabilities associated with integrating agentic systems into analytical pipelines.
### Supporting Evidence
The findings are illustrated through a detailed case study that presents the architecture of the authors' LLM-driven agentic system and its performance across analytical tasks, from data collation to complex adversary tracking.
### Novel Contributions
- A framework, delivered as a conference presentation, for assessing the strengths and limits of AI systems used in threat intelligence analysis.
- Emphasis on the critical requirement to **communicate judgments** regarding AI-assisted work clearly to peers and wider audiences to preserve accountability and transparency.
## Technical Details
The authors detailed the **architecture of their LLM-driven agentic system**. This system was iteratively tested across a spectrum of CTI tasks:
1. **Data Collation (Straightforward Tasks):** Basic retrieval and aggregation of information.
2. **Complex Analytical Pipelines:** Tasks requiring multi-step reasoning, such as tracking adversarial activities across disparate data sources.
The core technical challenge addressed was ensuring that the analysis produced by the agentic system maintained verifiable reliability equivalent to traditional tradecraft.
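
One way to make "reliability equivalent to traditional tradecraft" checkable is to score the agent's output against an analyst-produced baseline on the same items. The sketch below is an illustration only, not the authors' method. The function name `agreement_report`, the label values, and the document IDs are all hypothetical.

```python
def agreement_report(agent_labels: dict[str, str],
                     analyst_labels: dict[str, str]) -> dict:
    """Compare agent labels with an analyst baseline on the items both labeled."""
    # Only items labeled by both sides can be compared.
    shared = sorted(agent_labels.keys() & analyst_labels.keys())
    if not shared:
        raise ValueError("no overlapping items to compare")
    disagreements = [k for k in shared if agent_labels[k] != analyst_labels[k]]
    matches = len(shared) - len(disagreements)
    return {
        "n": len(shared),
        "agreement": matches / len(shared),
        "disagreements": disagreements,  # items an analyst should re-examine
    }


# Hypothetical labels for four leaked documents (illustrative data only).
agent = {"doc1": "phishing", "doc2": "c2", "doc3": "benign", "doc4": "c2"}
analyst = {"doc1": "phishing", "doc2": "c2", "doc3": "c2", "doc5": "benign"}
report = agreement_report(agent, analyst)
```

Returning the list of disagreeing items, rather than only a score, gives a reviewer a concrete starting point for an audit.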
## Practical Implications
### For Security Practitioners
Practitioners must recognize that increased automation via AI does not inherently equate to increased confidence in the final intelligence product unless the process itself is validated. Skillsets must evolve to include prompt engineering, agentic workflow validation, and result auditing.
### For Defenders
The insights are crucial for understanding how adversary nation-states might also leverage similar agentic systems, potentially accelerating their intelligence gathering and operational planning cycles. Understanding AI's analytical strengths and weaknesses is key to anticipating advanced threat behaviors.
### For Researchers
Researchers must design validation frameworks that explicitly test the robustness of AI methodology. Future collaborative work should focus on developing shared evaluation standards for agentic systems in security research.
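
A minimal sketch of one such robustness test: ask the same analytical question in several paraphrased forms, repeat each run, and measure how often the answers agree with the most common answer. This is an assumption about what a validation check could look like, not something described in the talk. `run_agent` is a stand-in callable, and `stub_agent` is a deterministic toy in place of a real agentic system.

```python
from collections import Counter
from typing import Callable, Sequence


def consistency_rate(run_agent: Callable[[str], str],
                     prompts: Sequence[str],
                     runs_per_prompt: int = 3) -> float:
    """Fraction of all runs whose answer equals the most common (modal) answer."""
    answers = [run_agent(p) for p in prompts for _ in range(runs_per_prompt)]
    if not answers:
        raise ValueError("no prompts supplied")
    _, modal_count = Counter(answers).most_common(1)[0]
    return modal_count / len(answers)


def stub_agent(prompt: str) -> str:
    """Toy stand-in: the answer depends on wording, mimicking prompt sensitivity."""
    return "cluster-A" if "attribut" in prompt.lower() else "unknown"


paraphrases = [
    "Attribute this infrastructure to a cluster.",
    "Which cluster does this infrastructure belong to? Give an attribution.",
    "Name the operator of this infrastructure.",
]
rate = consistency_rate(stub_agent, paraphrases)
```

A low rate does not show which answer is wrong, only that the result depends on phrasing and should be reported with that caveat.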
## Limitations
Based on this summary, the presentation appears to focus on the *application* and *communication* of AI results rather than on adversarial robustness testing (e.g., prompt injection against the analysis itself). The emphasis on reliability assessment suggests some internal validation, but its extent is not described.
## Comparison to Prior Work
This work builds on established CTI analytical tradecraft standards while confronting the shift introduced by generative AI and agentic systems. Earlier tradecraft standards were written without AI in mind and may not adequately address the provenance and systemic reliability of AI-generated analysis.
## Real-world Applications
- **CTI Production:** Automating large-scale data ingestion and initial hypothesis generation while flagging results for human validation.
- **Cross-Organizational Collaboration:** Establishing a clear baseline for when and how to trust AI-generated data shared between collaborating intelligence teams.
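
The "flag for human validation" step in the first item could be sketched as a simple triage rule. Everything here is hypothetical: the `Finding` fields, the confidence score (whose origin is left unspecified), and both thresholds are illustrative choices rather than values from the research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    claim: str
    confidence: float          # assumed 0..1 score attached to the claim
    sources: tuple[str, ...]   # identifiers of the supporting source files


def needs_priority_review(f: Finding,
                          min_confidence: float = 0.8,
                          min_sources: int = 2) -> bool:
    """Flag findings that are low-confidence or rest on a single source."""
    return f.confidence < min_confidence or len(f.sources) < min_sources


def triage(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (routine, priority) review queues."""
    routine = [f for f in findings if not needs_priority_review(f)]
    priority = [f for f in findings if needs_priority_review(f)]
    return routine, priority


findings = [
    Finding("Domain X hosts an admin panel", 0.92, ("leak_a.csv", "leak_b.json")),
    Finding("Actor Y operates domain X", 0.55, ("leak_a.csv", "leak_b.json")),
    Finding("Server Z is reused across campaigns", 0.95, ("leak_c.txt",)),
]
routine, priority = triage(findings)
```

Both queues still go to a human. The split only orders the work, so that weakly supported hypotheses are examined first.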
## Future Work
- Defining concrete, measurable metrics for AI-assisted CTI assessment.
- Developing formal protocols for documenting agentic runs (e.g., "AI Artifact Reports") analogous to traditional analyst reports.
- Investigating the security implications of using external, commercial LLMs for sensitive intelligence analysis.