Full Report
AskAI – Text to Security Graph Query
Analysis Summary
This article does not describe malware, attack tools, or traditional adversary TTPs in the context of cyberattacks. Instead, it details the development of a **text-to-query search engine** designed to simplify interaction with Wiz's proprietary cybersecurity graph database using Large Language Models (LLMs).
The summary below therefore describes a **software development project**. It maps the project's components to the requested structure where applicable and treats the system as a security *defense/analysis* tool, not an offensive one.
# Tool/Technique: Text-to-Query Search Engine for Wiz Security Graph
## Overview
A custom-built search engine that uses Large Language Models (LLMs) to translate natural language questions into structured queries for Wiz's proprietary graph database, the Wiz Security Graph, which lacks a fixed schema. The goal is to simplify complex data retrieval for security professionals.
## Technical Details
- Type: Application/Framework (Internal Development Tool)
- Platform: Cloud/Backend services interfacing with Wiz Security Graph (Implied: Cloud environments based on Wiz context)
- Capabilities: Natural Language Processing (NLP) to Query Language translation, Contextual awareness using RAG, Query example diversity enhancement.
- First Seen: Not specified in the article; the work builds on recent LLM developments.
## MITRE ATT&CK Mapping
*Note: Since this is a defensive tool development, direct offensive mapping is inapplicable. The system **enables** security analytics capabilities.*
- *No direct technique mapping is available for this development artifact.*
- **T1087 - Account Discovery** (conceptual only, not a direct mapping: improved data discovery helps security teams find misconfigurations faster)
## Functionality
### Core Capabilities
- Translating user natural language input (e.g., "show me all VMs with unpatched vulnerabilities") into Wiz Security Graph Query Language (JSON format).
- Utilizing **Zero-shot learning** and **Few-shot learning** prompt engineering techniques to guide LLM query generation.
- Employing **Retrieval-Augmented Generation (RAG)** to feed relevant metadata and query examples to the LLM.
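
The capabilities above (retrieving stored query examples and placing them in a few-shot prompt) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the example store, the JSON query shape, the bag-of-words similarity, and the function names `retrieve` and `build_prompt` are hypothetical and do not reflect Wiz's actual query language or retrieval pipeline, which the article does not specify at this level.

```python
import json
import math
from collections import Counter

# Hypothetical example store: (natural-language question, query as JSON).
# The JSON shape is invented for illustration; it is NOT Wiz's real schema.
EXAMPLES = [
    ("show me all VMs with unpatched vulnerabilities",
     {"entity": "VM", "where": {"has_vulnerability": True, "patched": False}}),
    ("list storage buckets that are publicly exposed",
     {"entity": "BUCKET", "where": {"public": True}}),
    ("find user accounts without MFA",
     {"entity": "USER", "where": {"mfa_enabled": False}}),
]

def bow(text):
    """Bag-of-words vector as a token -> count mapping (stand-in for real embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, k=2):
    """Return the k stored examples most similar to the question."""
    q = bow(question)
    ranked = sorted(EXAMPLES, key=lambda ex: cosine(q, bow(ex[0])), reverse=True)
    return ranked[:k]

def build_prompt(question, k=2):
    """Assemble instructions, k retrieved examples, and the new question.

    k=0 yields a zero-shot prompt; k>0 yields a few-shot prompt.
    """
    parts = ["Translate the question into a graph query expressed as JSON."]
    for text, query in retrieve(question, k):
        parts.append(f"Question: {text}\nQuery: {json.dumps(query)}")
    parts.append(f"Question: {question}\nQuery:")
    return "\n\n".join(parts)

prompt = build_prompt("show me VMs with critical vulnerabilities")
print(prompt)
```

The resulting string would be sent to the LLM, whose completion is expected to be the JSON query. A production system would presumably use dense embeddings and a vector store in place of the word-count similarity used here.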
### Advanced Features
- **Maximal Marginal Relevance (MMR):** Used to re-rank and select diverse, non-redundant query examples to include in the RAG context, balancing relevance and novelty.
- **Clustering (UMAP and DBSCAN):** Used on anonymized user inputs to cluster queries, identify underrepresented topics, and guide the improvement of the query examples database.
- **LLM Selection:** The system uses **Anthropic’s Claude 3.5 Sonnet**, accessed via AWS Bedrock and chosen for its balance of performance, cost, and latency.
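
To make the MMR step concrete, here is a minimal sketch of the standard greedy formulation: at each step, pick the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any example already selected. The vectors, the `lam` value, and the function names are illustrative assumptions; the article does not state the parameters or embedding model used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mmr(query_vec, cand_vecs, k, lam=0.5):
    """Greedily select up to k candidate indices by Maximal Marginal Relevance."""
    selected = []
    remaining = list(range(len(cand_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, cand_vecs[i])
            # Highest similarity to anything already chosen (0.0 if nothing chosen yet).
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy embeddings (hypothetical): candidate 1 is a near-duplicate of candidate 0.
query = [1.0, 1.0, 0.0]
candidates = [
    [1.0, 0.05, 0.0],  # 0: most relevant
    [1.0, 0.0, 0.0],   # 1: nearly identical to 0
    [0.0, 1.0, 0.3],   # 2: slightly less relevant, but covers a different direction
]

print(mmr(query, candidates, k=2, lam=0.5))  # -> [0, 2]
print(mmr(query, candidates, k=2, lam=1.0))  # -> [0, 1] (pure relevance keeps the duplicate)
```

With `lam=1.0` the selection reduces to plain relevance ranking; lowering `lam` trades relevance for diversity, which is the balance the MMR bullet describes.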
## Indicators of Compromise
- File Hashes: N/A (Software framework)
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: AWS Bedrock endpoints (Internal/Product specific)
- Behavioral Indicators: N/A
## Associated Threat Actors
- N/A (no threat actor is involved; the tool was developed by the Wiz Engineering and Research Teams)
## Detection Methods
- N/A (Internal development process/tooling)
## Mitigation Strategies
- N/A (Internal development process/tooling)
## Related Tools/Techniques
- Anthropic Claude 3.5 Sonnet
- AWS Bedrock
- Retrieval-Augmented Generation (RAG)
- Maximal Marginal Relevance (MMR)
- UMAP / DBSCAN (for data clustering)
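
The clustering step listed above can be sketched with a small, dependency-free DBSCAN. This is an illustration under stated assumptions: the 2-D points stand in for user-input embeddings that have already been reduced (the UMAP step is omitted here), and `eps`, `min_samples`, and the reading of small clusters or noise as "underrepresented topics" are hypothetical choices, not values or rules taken from the article.

```python
import math
from collections import Counter

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels

# Hypothetical reduced embeddings of anonymized user inputs.
points = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),  # a well-covered topic
    (5.0, 5.0), (5.0, 5.1), (5.1, 5.0),              # a second, smaller topic
    (10.0, 0.0),                                     # an isolated query
]
labels = dbscan(points, eps=0.3, min_samples=3)
print(labels)           # -> [0, 0, 0, 0, 1, 1, 1, -1]
print(Counter(labels))  # cluster sizes; noise is counted under -1
```

Comparing cluster sizes against the number of stored query examples per topic would be one way to spot areas where the examples database needs more coverage.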