LLMs can turn CTI narratives into structured intelligence at scale, but speed-accuracy trade-offs demand careful design for operational defense workflows.
# Research: From Narrative to Knowledge Graph | LLM-Driven Information Extraction in Cyber Threat Intelligence
## Metadata
- **Authors:** Aleksandar Milenkoski, Razvan Gabriel Cirstea
- **Institution:** SentinelOne (SentinelLABS)
- **Publication:** SentinelOne Labs Blog
- **Date:** March 9, 2026
## Abstract
This research explores the transition of Cyber Threat Intelligence (CTI) from unstructured narrative reports into structured, machine-readable knowledge graphs using Large Language Models (LLMs). The study evaluates how well LLMs perform selective information extraction and contextual inference, two tasks where regex and other pattern-matching tools fall short. By benchmarking general-purpose LLMs, the authors demonstrate how these models can automate the reconstruction of adversary playbooks and the identification of complex relationships between Indicators of Compromise (IOCs).
## Research Objective
The research addresses the bottleneck in CTI operations: the manual, slow, and inconsistent extraction of data from narrative reports. Specifically, it investigates:
1. How LLMs can perform **selective extraction** (distinguishing between malicious infrastructure and benign mentions).
2. How LLMs can infer **implicit relationships** and contextual metadata (e.g., victimology, TTP associations).
3. The feasibility of using LLMs to reconstruct **adversary activity sequences** with inferred chronology.
## Methodology
### Approach
The researchers employed an empirical evaluation of general-purpose LLMs in an "out-of-the-box" configuration. The framework focused on transforming text into structured data (JSON/STIX-like formats) and then into knowledge graphs. Key metrics included extraction correctness, coverage, and the model's ability to adhere to complex extraction constraints.
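The summary does not say how correctness and coverage were computed. As a minimal sketch, assuming correctness maps to precision and coverage maps to recall over a hand-labeled set of indicators, the two metrics could be scored like this (the function name and the sample values are hypothetical):

```python
def score_extraction(predicted, expected):
    """Return (precision, recall) for extracted indicators.

    Assumption: 'correctness' is treated as precision and 'coverage'
    as recall; the source does not define either metric formally.
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical model output versus a hand-labeled reference set.
model_output = ["update-cdn.example.net", "drive.google.com"]
ground_truth = ["update-cdn.example.net", "203.0.113.7"]
precision, recall = score_extraction(model_output, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A benign domain that slips through lowers precision, while a missed attacker IP lowers recall, which is why the two numbers are worth tracking separately.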
### Dataset/Environment
- **Input:** Unstructured CTI narrative reports describing adversary behavior, infrastructure, and intrusion chains.
- **Evaluation:** Testing models on their ability to filter "noise" (benign domains) and link "signals" (attacker-controlled infrastructure) to specific threat actors.
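
The noise-versus-signal filtering described above can be sketched as a post-processing step over a model's JSON output. The response below is invented for illustration, and the field names (`value`, `role`, `actor`) and role labels are assumptions, not the schema used in the study:

```python
import json

# Hypothetical model response; field names and role labels are assumptions.
RAW_RESPONSE = """
[
  {"value": "update-cdn.example.net", "type": "domain", "role": "c2", "actor": "ExampleActor"},
  {"value": "drive.google.com", "type": "domain", "role": "benign_reference", "actor": null},
  {"value": "203.0.113.7", "type": "ipv4", "role": "proxy", "actor": "ExampleActor"}
]
"""

BENIGN_ROLES = {"benign_reference"}


def select_signals(raw_json):
    """Keep only indicators that are non-benign and attributed to an actor."""
    records = json.loads(raw_json)
    return [
        r for r in records
        if r.get("role") not in BENIGN_ROLES and r.get("actor")
    ]


for record in select_signals(RAW_RESPONSE):
    print(record["actor"], record["role"], record["value"])
```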
### Tools & Technologies
- **LLMs:** General-purpose Large Language Models (unspecified versions, used as benchmarks).
- **Structured Formats:** Knowledge graphs and playbook-level sequences.
## Key Findings
### Primary Results
1. **Contextual Superiority:** LLMs significantly outperform pattern-matching in "context-aware" extraction, such as determining whether a domain is an adversary C2 server or merely a referenced benign site (e.g., a legitimate cloud service).
2. **Implicit Inference:** LLMs can successfully infer the "role" of an IOC within an intrusion chain, which is often not explicitly stated in the text.
3. **Automated Playbooks:** LLMs can reconstruct chronological sequences of adversary activity, turning a blog post into a structured defense playbook.
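
As a rough illustration of the playbook reconstruction in item 3, assume the model returns each event with an inferred `order` value even when the report narrates events out of sequence. The `Event` type, its fields, and the sample events are hypothetical; the technique IDs are ordinary MITRE ATT&CK identifiers used only as labels:

```python
from dataclasses import dataclass


@dataclass
class Event:
    order: int        # chronology inferred by the model (assumed field)
    technique: str    # ATT&CK technique ID used as a label
    description: str


def build_playbook(events):
    """Sort events by inferred order and render numbered playbook steps."""
    ordered = sorted(events, key=lambda e: e.order)
    return [
        f"{step}. [{e.technique}] {e.description}"
        for step, e in enumerate(ordered, start=1)
    ]


# Events listed in narrative order, not chronological order.
narrative_events = [
    Event(3, "T1071", "Implant beacons to C2 over HTTPS"),
    Event(1, "T1566", "Phishing email delivers a lure document"),
    Event(2, "T1059", "Script interpreter runs the first-stage loader"),
]

for line in build_playbook(narrative_events):
    print(line)
```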
### Supporting Evidence
- The research highlights that while simple IOCs (IPs and hashes) are easily caught by regex, LLMs are required to identify "infrastructure ownership" and "compromise state," which become more important as the shelf-life of atomic IOCs diminishes.
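
To make the regex limitation concrete, here is a small sketch of context-free extraction. The patterns and the sample sentence are illustrative assumptions, not the tooling evaluated in the study. The regex finds every domain and IP address but cannot tell attacker infrastructure from a legitimate service mentioned in passing:

```python
import re

# Deliberately simple patterns; production extractors are more thorough.
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_atomic_iocs(text):
    """Return every domain and IPv4 address found, with no notion of role."""
    return {
        "domains": sorted({m.lower() for m in DOMAIN_RE.findall(text)}),
        "ips": sorted(set(IPV4_RE.findall(text))),
    }


sample = (
    "The implant beaconed to update-cdn.example.net (203.0.113.7) "
    "and staged payloads on drive.google.com."
)
print(extract_atomic_iocs(sample))
```

Both domains come back in the same flat list, so deciding that one is a C2 endpoint and the other an abused legitimate service still needs the surrounding context.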
### Novel Contributions
- **Selective IOC Extraction:** Moving beyond bulk extraction to high-fidelity, high-relevance listing.
- **Chronological Reconstruction:** The ability to map narrative events to a temporal sequence automatically.
## Technical Details
The study emphasizes the use of LLMs to bridge the gap between "Atomic IOCs" and "Structured Intelligence." Traditional tools see an IP address; the LLM sees an IP address, identifies it as a SOCKS5 proxy used in "Phase 2" of an attack, and links it to a specific malware family mentioned three paragraphs away. This "linking" is what enables the creation of a Knowledge Graph where entities (Actors, Tools, Targets) are nodes and their interactions are edges.
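
The node-and-edge structure described here can be sketched with plain dictionaries. This is a minimal illustration under assumed names (`KnowledgeGraph`, the relation labels, and the sample entities are all hypothetical), not the representation used by the authors:

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny directed graph: entities are nodes, interactions are edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, source, relation, target):
        self.edges[source].append((relation, target))

    def neighbors(self, node_id, relation=None):
        """Return targets reachable from node_id, optionally by relation."""
        return [
            target
            for rel, target in self.edges.get(node_id, [])
            if relation is None or rel == relation
        ]


kg = KnowledgeGraph()
kg.add_node("ExampleActor", "actor")
kg.add_node("ExampleLoader", "malware")
kg.add_node("203.0.113.7", "infrastructure", role="SOCKS5 proxy", phase="Phase 2")
kg.add_edge("ExampleActor", "uses", "ExampleLoader")
kg.add_edge("ExampleLoader", "communicates_via", "203.0.113.7")

print(kg.neighbors("ExampleLoader", "communicates_via"))
print(kg.nodes["203.0.113.7"])
```

Storing the role and phase as node attributes keeps the context that a flat indicator list would lose.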
## Practical Implications
### For Security Practitioners
- **Quality Over Quantity:** LLMs allow teams to focus on "high-context" intelligence rather than drowning in flat lists of low-fidelity indicators.
### For Defenders
- **Faster Response:** Automating the extraction of TTPs allows for quicker updates to detection rules (e.g., Sigma or YARA) based on fresh narrative reports.
- **Improved Hunting:** Context-enriched data enables more sophisticated threat hunting queries that look for behaviors rather than just static strings.
### For Researchers
- **Reasoning Limits:** Future research should focus on improving the "long-range attention" of models to ensure consistency across very long reports.
## Limitations
- **Speed-Accuracy Trade-off:** High-accuracy extraction requires sophisticated prompts or larger models, which may add latency that affects real-time workflows.
- **Hallucination/Adherence:** General-purpose models may occasionally struggle with strict extraction constraints or omit subtle details (salience issues).
## Comparison to Prior Work
Unlike prior research that focuses on bulk extraction or basic STIX 2.1 bundling, this work emphasizes **operational utility**: selective extraction and the chronological ordering of playbooks, so that the output is immediately usable by defenders.
## Real-world Applications
- **TIP Integration:** Feeding structured data directly into Threat Intelligence Platforms (TIPs) without human intervention.
- **Automated Defensive Sequencing:** Generating response playbooks directly from newly published security research.
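
As one possible shape for the TIP integration above, a selected domain could be wrapped as a STIX 2.1 Indicator object before ingestion. This is a sketch using only the standard library; the function name and sample values are hypothetical, and the source does not state which format or platform the authors targeted:

```python
import json
import uuid
from datetime import datetime, timezone


def to_stix_indicator(domain, description):
    """Build a minimal STIX 2.1 Indicator dict for a domain name."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "valid_from": now,
        "description": description,
        "pattern": f"[domain-name:value = '{domain}']",
        "pattern_type": "stix",
    }


indicator = to_stix_indicator(
    "update-cdn.example.net",
    "C2 domain inferred from a narrative report (illustrative value)",
)
print(json.dumps(indicator, indent=2))
```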
## Future Work
- **Model Optimization:** Testing models with stronger reasoning and better "salience" (the ability to identify the most important parts of a text).
- **Scale:** Refining workflows to process thousands of historical reports to build a massive, retrospective knowledge graph.
## References
- SentinelOne Labs: [https://www.sentinelone.com/labs/](https://www.sentinelone.com/labs/)
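- Mentioned Research (Formato): [https://medium.com/@antonio.formato/from-unstructured-threat-intelligence-to-stix-2-1-bundles-with-generative-ai-1065ce399e63](https://medium.com/@antonio.formato/from-unstructured-threat-intelligence-to-stix-2-1-bundles-with-generative-ai-1065ce399e63)
- Related ArXiv Study: [https://arxiv.org/abs/2501.06239](https://arxiv.org/abs/2501.06239)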