Full Report
A closer look at Wiz’s data classification engine — including our new Novel Classifiers
Analysis Summary
The source article describes the data classification capabilities of the **Wiz security platform**. Because it covers a defensive product feature rather than malware or an adversarial attack tool, this summary is organized around the Wiz classification feature.
# Tool/Technique: Wiz Data Classification Engine
## Overview
The Wiz Data Classification Engine is a feature within the Wiz cloud security platform designed to identify, classify, and manage sensitive data across multi-cloud and hybrid environments. Its purpose is to move security teams from simple data discovery to prioritization and automated protection by defining what is sensitive to the business.
## Technical Details
- Type: Tool/Framework Feature (Data Security Platform)
- Platform: Cloud-native environments, multi-cloud, DBaaS (Implied scope)
- Capabilities: Built-in classification, custom classification, AI/ML-driven Novel Classification, tiered scanning methods, context enrichment, actionable output integration.
- First Seen: Not explicitly detailed, but Novel Classifiers are listed as "New in Wiz" and in Preview.
## MITRE ATT&CK Mapping
*Note: This is a defensive data security tool, not offensive malware. ATT&CK tactics and techniques describe adversary behavior, so no mapping applies. The feature's context is defensive: data discovery, identification, and protection.*
- **No Direct Offensive Mapping Applicable** (This is a defensive tool for data security and risk reduction)
## Functionality
### Core Capabilities
- **Built-in Classifiers:** Detects common sensitive data types (PII, financial data, credentials, healthcare identifiers) mapped to industry standards (GDPR, HIPAA, PCI DSS).
- **Custom Classifiers:** Allows users to define classification rules using keywords, patterns, or context-specific logic for proprietary data.
- **Tiered Classification Approach:** Balances speed and precision by choosing a method based on data structure and scale:
  - Metadata-only classification (for logs and other predictable data).
  - Sparse sampling (for structured, repetitive datasets such as backups).
  - Full content scanning (for unstructured and mixed-content files).
- **Actionable Output:** Integration with Data Findings View, Labels & Filters, and Ignore Rules for immediate remediation workflows.
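
The tiered approach above can be pictured as a small dispatch step that picks the cheapest adequate method for each data object. The following is a minimal sketch under stated assumptions, not Wiz's implementation: the `ScanTier`, `DataObject`, `choose_tier`, and `sparse_sample` names, the two boolean flags, and the sampling interval are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ScanTier(Enum):
    METADATA_ONLY = "metadata_only"
    SPARSE_SAMPLING = "sparse_sampling"
    FULL_CONTENT = "full_content"


@dataclass
class DataObject:
    # Hypothetical descriptor; a real engine would derive these traits itself.
    name: str
    is_structured: bool   # e.g. tables or backups with a repetitive schema
    is_predictable: bool  # e.g. logs whose content follows from metadata


def choose_tier(obj: DataObject) -> ScanTier:
    """Pick the cheapest tier that should still classify the object well."""
    if obj.is_predictable:
        return ScanTier.METADATA_ONLY
    if obj.is_structured:
        return ScanTier.SPARSE_SAMPLING
    return ScanTier.FULL_CONTENT


def sparse_sample(rows, every_nth=100):
    """Return every n-th row, a simple stand-in for sparse sampling."""
    return rows[::every_nth]
```

Ordering the checks from cheapest to most expensive reflects the trade-off the list describes: full content scanning is reserved for data where lighter methods would miss too much.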
### Advanced Features
- **Novel Classifiers (AI/ML Powered):** Uses unsupervised learning and large language models (LLMs) to interpret and label data clusters based on structural relationships and metadata context, discovering unique, proprietary patterns beyond static templates.
- **Context Enrichment:** Classification is grounded in context, leveraging dependency mapping, access metadata, and cloud-native usage signals to ensure accuracy and precision.
- **Continuous Learning:** The engine adapts based on user feedback (e.g., human validation of findings) to improve accuracy over time.
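
To give some intuition for discovering patterns "beyond static templates", the sketch below groups values by their character shape without any predefined rule. It is a deliberately simple stand-in, not the unsupervised learning or LLM labeling the article attributes to Novel Classifiers; the function names and the sample values are hypothetical.

```python
import re
from collections import defaultdict


def shape_signature(value: str) -> str:
    """Reduce a value to its shape: letters become 'A', digits become '9'."""
    sig = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", sig)


def group_by_shape(values):
    """Group values sharing a shape; large groups hint at a recurring format."""
    groups = defaultdict(list)
    for value in values:
        groups[shape_signature(value)].append(value)
    return dict(groups)


# Hypothetical sample: two values share an invoice-like shape, one does not.
sample = ["INV-2024-0001", "INV-2023-0417", "ab12"]
clusters = group_by_shape(sample)
```

A group such as `AAA-9999-9999` would then be a candidate for human review and labeling, which is where the feedback loop described above would come in.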
## Indicators of Compromise
*Note: As a defensive platform feature, there are no traditional Indicators of Compromise (IoCs) related to malware or threat activity.*
- File Hashes: N/A
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: N/A
- Behavioral Indicators: N/A
## Associated Threat Actors
- N/A (This describes a commercial data security product feature.)
## Detection Methods
*Note: These reflect methods to verify the product/feature is functioning correctly, rather than detecting external attacks.*
- Signature-based detection: N/A
- Behavioral detection: N/A
- YARA rules: N/A
## Mitigation Strategies
*Note: These are strategies for implementing and optimizing the classification tool, not external threat mitigation.*
- Enable and configure the Built-in Classifiers that correspond to the standards in scope (for example GDPR, HIPAA, PCI DSS).
- Develop and deploy Custom Classifiers for business-specific proprietary data.
- Use human validation and feedback loops to refine the engine’s accuracy, especially for Novel Classifiers.
- Configure Labels and Filters to enable policy-driven actions based on classification severity.
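
The custom classifier and severity-based filtering steps above can be sketched as follows. This is an illustrative model only and does not reflect Wiz's rule syntax or API: the `CustomClassifier` class, the `PRJ-` identifier format, the keyword list, and the severity names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}  # hypothetical levels


@dataclass(frozen=True)
class CustomClassifier:
    name: str
    pattern: str      # regular expression for the proprietary format
    keywords: tuple   # context words that must appear near the match
    severity: str

    def matches(self, text: str) -> bool:
        """Require both the pattern and at least one context keyword."""
        has_pattern = re.search(self.pattern, text) is not None
        lowered = text.lower()
        has_keyword = any(k.lower() in lowered for k in self.keywords)
        return has_pattern and has_keyword


def classify(text, classifiers):
    """Return every classifier that matches the text."""
    return [c for c in classifiers if c.matches(text)]


def filter_by_min_severity(findings, minimum):
    """Keep findings at or above the given severity, e.g. for a policy action."""
    floor = SEVERITY_ORDER[minimum]
    return [f for f in findings if SEVERITY_ORDER[f.severity] >= floor]


# Hypothetical rule for an internal project code such as PRJ-123456.
project_code = CustomClassifier(
    name="internal-project-code",
    pattern=r"\bPRJ-\d{6}\b",
    keywords=("project", "confidential"),
    severity="high",
)
```

Requiring a keyword alongside the pattern is one way to reduce false positives on strings that merely look like the proprietary format.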
## Related Tools/Techniques
- Wiz Sensitive Data Discovery mechanism (the step that precedes classification, as referenced in the source article).