Full Report
Patterson Cake // In part one of “Wrangling the M365 UAL,” we talked about acquiring, parsing, and querying UAL data using PowerShell and SOF-ELK. In part […] Originally published as “Wrangling the M365 UAL with SOF-ELK and CSV Data (Part 3 of 3)” by Black Hills Information Security, Inc.
Analysis Summary
# Tool/Technique: SOF-ELK (with M365 UAL Data Ingestion)
## Overview
This summary details the process of ingesting and analyzing Microsoft 365 Unified Audit Log (UAL) data using the SOF-ELK stack, particularly focusing on techniques to reformat exported CSV data (containing the "AuditData" blob) into a format ingestible by SOF-ELK for efficient querying and investigation.
## Technical Details
- Type: Framework/Tool (SOF-ELK), Technique (Data Reformatting/Ingestion)
- Platform: Linux (for data wrangling using `csvtool`, `sed`, `scp`), Target Data Source: Microsoft 365 UAL
- Capabilities: Ingesting raw UAL CSV exports, reformatting the "AuditData" blob into JSON, automatic parsing in SOF-ELK, geospatial enrichment, and platform updating.
- First Seen: Not stated. The source is the third part of a series and builds on the methodology from Parts 1 and 2.
## MITRE ATT&CK Mapping
This is primarily an analytical and preparatory step performed by the investigator, not an adversary technique, so the mapping below is a loose fit that applies only when the data handling itself is under review.
- **TA0009 - Collection** (Applicable if treating the data extraction/reformatting as a necessary precursor to evidence review)
- T1005 - Data from Local System (If data is collected locally before processing)
## Functionality
### Core Capabilities
- **CSV Parsing:** Utilizing `csvtool` to reliably extract specified columns (specifically column 6, the "AuditData" blob) from Microsoft 365 UAL CSV exports, even when the column data contains internal commas.
- **Data Cleaning:** Removing standard CSV quoting characters (double-quotes) from the extracted blob data.
- **Format Conversion:** Transforming the extracted, cleaned data into a JSON format suitable for ingestion by Logstash within the SOF-ELK stack.
- **Ingestion:** Securely transferring the reformatted JSON file to the SOF-ELK ingestion directory via `scp`.
- **Verification:** Checking Elasticsearch indices via command line (`sof-elk_clear.py -i list`) or the SOF-ELK web UI to confirm successful parsing.
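
The extraction, cleaning, and conversion steps above can be sketched in Python as a cross-check on the shell pipeline. This is a minimal sketch, not the source's method: it swaps in Python's `csv` module for `csvtool` and `sed`, assumes the export has a header row with a column named `AuditData`, and assumes one compact JSON object per line is the desired output. The sample header names and the helper name `audit_data_to_ndjson` are hypothetical.

```python
import csv
import io
import json

# AuditData cells can be large; raise the csv module's default field limit.
csv.field_size_limit(16 * 1024 * 1024)


def audit_data_to_ndjson(csv_text, column="AuditData"):
    """Return one compact JSON document per line for each AuditData cell.

    The csv module undoes CSV quoting (outer quotes, doubled inner quotes),
    so commas inside the blob do not split the column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for record_number, row in enumerate(reader, start=1):
        blob = (row.get(column) or "").strip()
        if not blob:
            continue  # skip rows that carry no audit payload
        try:
            record = json.loads(blob)
        except json.JSONDecodeError as exc:
            raise ValueError(
                f"record {record_number}: {column} is not valid JSON"
            ) from exc
        lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(lines) + ("\n" if lines else "")


# Illustrative export with hypothetical header names; AuditData is column 6.
SAMPLE = (
    'RecordId,CreationDate,RecordType,Operation,UserId,AuditData\n'
    '1,2024-01-01T00:00:00Z,15,UserLoggedIn,user@example.com,'
    '"{""Operation"":""UserLoggedIn"",""ClientIP"":""203.0.113.10"",""Note"":""a, b""}"\n'
)

if __name__ == "__main__":
    print(audit_data_to_ndjson(SAMPLE), end="")
```

Parsing each blob with `json.loads` before writing it out also serves as a pre-transfer check: a malformed row raises an error locally instead of failing silently at ingestion. The returned text could then be written to a file such as `pc-purview-audit-data.json` and transferred with `scp` as described above.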
### Advanced Features
- **Geolocation Enrichment:** Using MaxMind GeoLite2 data, updated via the `geoip_bootstrap.sh` script, to enrich log entries with location information.
- **Platform Maintenance:** Updating the SOF-ELK installation using the `sof-elk_update.sh` script.
- **Extensibility:** SOF-ELK supports ingestion of various log types beyond M365 audit data (`aws`, `azure`, `gcp`, etc.).
## Indicators of Compromise
*No malware is involved; the entries below are artifacts of the processing tools and example files used in the workflow.*
- File Hashes: N/A (Tools are standard Linux utilities or provided scripts)
- File Names: `your-csv-ual-data.csv`, `pc-purview-export.csv`, `pc-purview-audit-data.csv`, `pc-purview-audit-data.json`
- Registry Keys: N/A
- Network Indicators: `[email protected]` (example SOF-ELK `scp` destination; the user and IP address are obscured in the source text)
- Behavioral Indicators: Execution of `csvtool`, `sed`, `scp`, and internal SOF-ELK scripts (`geoip_bootstrap.sh`, `sof-elk_update.sh`).
## Associated Threat Actors
This technique is associated with security operations, forensics analysts, and penetration testers utilizing the SOF-ELK stack for log analysis, particularly when dealing with M365 environments. No specific threat actor group is mentioned as using this *data processing* methodology maliciously.
## Detection Methods
- **Signature-based detection:** Based on known behaviors of the tooling (e.g., invocation of `csvtool` or specific script executions on forensic platforms).
- **Behavioral detection:** Monitoring for unusually high-volume manipulation or transfer of structured compliance data.
- **YARA rules:** N/A
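
The signature-based idea above can be illustrated with a deliberately naive matcher over process command lines. This is a sketch under stated assumptions: the tool names come from this summary, while the function name `matched_tooling` and the notion of feeding it command lines from process-audit logs are hypothetical. Common utilities such as `sed` and `scp` will match constantly on most Linux hosts, so a real rule would need additional context.

```python
import os
import shlex

# Tool and script names taken from the indicators listed in this summary.
TOOLING = {
    "csvtool",
    "sed",
    "scp",
    "geoip_bootstrap.sh",
    "sof-elk_update.sh",
    "sof-elk_clear.py",
}


def matched_tooling(command_line):
    """Return the known tool names that appear as tokens in a command line.

    Naive by design: any token whose basename equals a tool name counts,
    including arguments that merely happen to share the name.
    """
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes: fall back to plain whitespace splitting.
        tokens = command_line.split()
    return {os.path.basename(token) for token in tokens} & TOOLING
```

Comparing basenames rather than full paths lets the matcher catch a script regardless of where it is installed, at the cost of more false positives.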
## Mitigation Strategies
- **Prevention measures:** For M365 exports, leverage PowerShell or API methods when possible to acquire data in a structured format (like JSON) that circumvents the need for extensive CSV wrangling.
- **Hardening recommendations:** Restrict access to systems running SOF-ELK analysis infrastructure and ensure access controls are strict on the source M365 UAL exports.
## Related Tools/Techniques
- PowerShell (Mentioned in Part 1 for initial data acquisition)
- SOF-ELK (The analysis platform)
- AWS EC2 (Used in Part 2 for flexible SOF-ELK deployment)
- `csvtool` (Utility used for column extraction)
- MaxMind GeoLite2 (Used for geolocation enrichment)