Full Report
Originally published at Arachne Digital.

A Familiar Hype Cycle

Artificial-intelligence agents embedded in security information and event management (SIEM) platforms promise to automate investigation and triage. Some claim that AI will replace human analysts. Yet the effectiveness of any analytic model, machine-learning or rule-based, remains constrained by the quality of the telemetry it receives. If essential events or log fields are missing, even the most sophisticated model will misclassify or overlook malicious activity.

Threat-informed defence offers a rigorous, repeatable framework for determining exactly which logs, and which log fields, are required to detect the tactics, techniques, and procedures (TTPs) of cyber threat actors (CTAs) that actively target your industry and geography. Continuous cyber threat intelligence (CTI) keeps those requirements aligned with the evolving threat landscape.

AI Agents Cannot Compensate for Missing Telemetry

Modern SIEMs can ingest billions of events per day, and AI agents excel at correlating and prioritising this volume. However, even the most advanced AI analytics cannot overcome fundamental telemetry gaps. An effective detection pipeline still depends on three conditions:

- The required event types must be collected.
- The specific fields needed for analysis must be present.
- The data must arrive clean, consistently formatted, and in near-real time.

When any of these prerequisites fails, for example when critical command-line parameters are excluded to save storage or endpoint logs arrive hours late, true positives become false negatives, alerts lack sufficient context, and investigations stall.

On the flip side, attempting to “ingest everything” is neither sustainable nor cost-effective.
Only a disciplined, intelligence-driven log-onboarding strategy ensures that AI is working with evidence strong enough to justify automated decisions.

Threat-Informed Defence: A Four-Step Method

1. Identify Relevant Adversaries: Use curated CTI to determine which actors are actively targeting organisations with your industry profile and geographic footprint.
2. Enumerate Their TTPs: Map each adversary’s behaviours to MITRE ATT&CK techniques. This creates a formalised threat model grounded in evidence.
3. Link Techniques to Detection Data Sources: ATT&CK provides data-source–to-technique mappings (Data Source IDs). Translate these into specific log sources. For example, DS0017: Command Execution maps to Windows Event ID 4688 plus parent-process correlation in EDR telemetry.
4. Validate Log Coverage and Field Completeness: Build a matrix indicating whether each required source and field is present (green), partially present (yellow), or absent (red). The matrix becomes both a roadmap for engineering work and an audit artefact for regulators and executives.

Once established, this process should be repeated on a defined cadence or when major technology changes occur.

Case Study: Financial Institutions in South America

Key findings from Arachne Digital CTI for financial institutions across South America, taken on 21 June 2025, included:

Primary adversaries: FIN7, TA549, Battery Elf, Gold Lagoon, and APT44

High-frequency techniques:

- T1059.001 PowerShell
- T1105 Ingress Tool Transfer
- T1555.003 Credentials from Web Browsers
- T1190 Exploit Public-Facing Application
- T1005 Data from Local System

Using these techniques, the data-source requirements include:

- T1059.001 PowerShell
  - Essential data sources include: Command and Process logs
  - Critical data components include: Command Execution and Process Creation
- T1105 Ingress Tool Transfer
  - Essential data sources include: File and Network Traffic logs
  - Critical data components include: File Creation and Network Traffic Content
- T1555.003 Credentials from Web Browsers
  - Essential data sources include: File and Process logs
  - Critical data components include: File Access and OS API Execution
- T1190 Exploit Public-Facing Application
  - Essential data sources include: Application Log and Network Traffic logs
  - Critical data components include: Application Log Content and Network Traffic Content
- T1005 Data from Local System
  - Essential data sources include: Process and Script logs
  - Critical data components include: Process Creation and Script Execution

Often, required fields are absent or inconsistently collected, primarily because default configurations suppress “verbose” logging categories.

If you work with a SOC in any role, from CISO to tier-one analyst, can you say that you know all the CTAs relevant to your organisation and their current TTPs, and that all the required logs are ingested into your SIEM with all the required fields? And do you have a way to ensure you stay up to date as the CTAs and TTPs shift?

If you can’t, an AI agent won’t solve your fundamental issue.

To prepare for deploying AI agents, maintain a configuration-management baseline that specifies the event ID, logging channel, and policy setting for each ATT&CK data component. Automate compliance checks via PowerShell, Ansible, or your preferred configuration-management tool.

Cost-Efficiency: Log More Where It Matters, Less Where It Doesn’t

Strengthening telemetry does not have to equal runaway storage bills.
The same ATT&CK-aligned matrix that highlights missing data sources also exposes over-collected ones: logs and fields that contribute little or nothing to detections relevant to your threat model.

For each log source:

1. Tag the ATT&CK techniques it enables and assign a rough business value: high (critical detection gap), medium (useful enrichment), or low (no mapped techniques).
2. Pull ingestion metrics from your SIEM or data-lake billing dashboard to calculate daily gigabytes and monthly cost.
3. Create a simple 3×3 heat map (value on one axis, cost on the other). Anything “low value / high cost” is a candidate for optimisation.

Based on your findings, you can decide to:

- Retain but Tier: Move low-value logs to chilled or object storage with longer query latency but a fraction of the price.
- Sample or Filter: Keep only events that include fields tied to medium or high ATT&CK value. For example, you could look at dropping firewall allows, but retaining denies.
- Shorten Retention: Regulatory requirements rarely mandate 365-day hot storage for every log type. Right-size retention based on compliance need plus investigative usefulness.

Dollars freed by pruning low-value telemetry can bankroll onboarding of high-value sources, such as extended EDR fields, detailed SaaS audit logs, or container runtime events, without increasing the overall budget line.

Also, track the before-and-after cost curve alongside detection coverage metrics.
This evidence helps justify future security spend to finance and the board.

Threat-informed defence is not just a security win; it’s a budget optimisation tool that ensures every gigabyte you keep is pulling its weight.

Continuous Intelligence Keeps the Matrix Current

Threat landscapes are dynamic:

- New or re-emerging groups (e.g., FIN6, applicable to our case study above) may adopt techniques that demand additional telemetry.
- Shifts in tooling (PowerShell downgraded, WMI upgraded) alter the priority of data sources.
- Emerging vulnerabilities introduce detection requirements for previously irrelevant platforms.

Arachne Digital’s feeds deliver sector-specific intelligence as machine-readable JSON, including ATT&CK mappings and first-/last-seen dates. Integrating this feed with your log-coverage matrix allows automatic creation of engineering tickets whenever a new technique enters the scope of relevant threats, or when there are possible cost savings to be made.

By contrast, deploying AI on incomplete data often increases workload, as analysts chase poorly prioritised or context-deficient alerts.

Implementation Roadmap

1. Acquire an Industry-Specific Intelligence Baseline: Free introductory reports and API trials are available from Arachne Digital.
2. Construct or Update the ATT&CK Log-Coverage Matrix: Include source, event ID, and critical fields. Mark gaps clearly.
3. Remediate Gaps: Prioritise high-impact techniques and low-effort fixes. Align storage budgets with security value.
4. Automate Continuous Validation: Combine configuration-management tools with CTI updates to keep the matrix evergreen.
5. Deploy or Enhance AI Analytics: Once telemetry quality is verified, AI agents can work to their full potential.

How Arachne Digital Accelerates the Process

- Thread & Tracery: Automatically map threat-report text to ATT&CK techniques, providing machine-readable context suitable for log engineering workflows.
- Sector-Focused Intelligence Feeds: Deliver only the adversary activity relevant to your environment, reducing analysis overhead.
- Human-Curated Accuracy: Experienced analysts validate each mapping, ensuring false data does not contaminate automated pipelines.

Customers who adopt this threat-informed-defence methodology typically realise measurable gains within one quarter, including a reduction in false positives as redundant or missing telemetry is corrected, and faster incident triage due to richer context in each alert. Threat-informed defence will also set your SOC up for success come audit time, through a maintained ATT&CK-aligned evidence trail.

Are You Ready?

AI agents offer genuine value in security operations, but they cannot transcend fundamental telemetry limitations. Threat-informed defence, anchored by current, high-fidelity CTI, remains the most efficient path to ensuring that the “right logs with the right fields” reach your SIEM.
Only when that foundation is secure can AI reliably assume analytic tasks and allow your human teams to focus on higher-order tasks.

If you would like to review a complimentary, sector-specific ATT&CK coverage report, or to explore how Arachne Digital can integrate continuous intelligence directly into your log engineering workflows, contact us at [email protected]

Logs for Smarter SOCs: Threat-Informed Telemetry That Powers AI Agents and Cuts Costs was originally published in MeetCyber on Medium.
Analysis Summary
# Best Practices: Threat-Informed Log Telemetry for Effective SOCs and AI Agents
## Overview
These practices focus on ensuring that Security Operations Centers (SOCs), particularly those utilizing AI agents within SIEM platforms, receive high-quality, relevant, and complete telemetry. The core principle is shifting from logging everything to adopting a **Threat-Informed Defense** strategy to maximize detection efficacy, reduce noise, and control costs.
## Key Recommendations
### Immediate Actions
1. **Assess Criticality of Current Logs:** Immediately review current log ingestion pipelines to identify fields that may have been intentionally excluded (e.g., command-line parameters) solely to reduce storage costs, as these omissions directly weaken detection capabilities.
2. **Establish Initial Threat Model:** Determine the key Cyber Threat Actors (CTAs) currently targeting your specific industry and geographic location using curated Cyber Threat Intelligence (CTI).
3. **Validate Real-Time Data Flow:** Verify that essential security logs are arriving in the SIEM in near real-time. Flag any sources consistently experiencing significant delays (hours late).
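
The latency check in the last action can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event dictionaries, their `source`, `event_time`, and `ingested_at` keys, and the 15-minute threshold are all hypothetical, and in practice the timestamps would come from whatever your SIEM exposes for event time and index time.

```python
from datetime import datetime, timedelta, timezone


def late_sources_by_worst_lag(events, threshold=timedelta(minutes=15)):
    """Return {source: worst_lag} for sources whose worst lag exceeds threshold."""
    worst = {}
    for event in events:
        # Lag = time the SIEM received the event minus time it happened.
        lag = event["ingested_at"] - event["event_time"]
        source = event["source"]
        if source not in worst or lag > worst[source]:
            worst[source] = lag
    return {source: lag for source, lag in worst.items() if lag > threshold}


# Hypothetical sample data; source names and timings are invented.
t0 = datetime(2025, 6, 21, 12, 0, tzinfo=timezone.utc)
sample_events = [
    {"source": "edr", "event_time": t0, "ingested_at": t0 + timedelta(seconds=40)},
    {"source": "endpoint-winlog", "event_time": t0, "ingested_at": t0 + timedelta(hours=3)},
    {"source": "firewall", "event_time": t0, "ingested_at": t0 + timedelta(minutes=2)},
]

late_sources = late_sources_by_worst_lag(sample_events)
for source, lag in late_sources.items():
    print(f"{source}: worst lag {lag}")
```

Tracking the worst lag per source, rather than the average, is a deliberate choice here: a source that is usually prompt but occasionally hours late can still stall an investigation.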
### Short-term Improvements (1-3 months)
1. **Map TTPs to Data Sources:** Systematically map the identified high-priority adversary Tactics, Techniques, and Procedures (TTPs) to specific MITRE ATT&CK techniques and their corresponding required data sources and data components (e.g., Command Execution under the Command data source, DS0017; Process Creation under the Process data source, DS0009).
2. **Create a Coverage Matrix:** Develop a validation matrix that explicitly lists every required log source and its specific necessary fields against its current availability (Green: Present, Yellow: Partially Present, Red: Absent).
3. **Prioritize Log Engineering for Gaps:** Begin remediation work targeting "Red" (Absent) and "Yellow" (Partially Present) entries in the coverage matrix, focusing first on telemetry required to detect the most frequent and relevant TTPs.
4. **Enforce Data Quality Standards:** Ensure that required log data, once collected, is delivered consistently formatted, clean, and complete so that AI agents can process it reliably.
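
A minimal sketch of the coverage matrix described above, assuming that required and observed fields are available as plain sets. The field names (`command_line`, `parent_process`, and so on) are hypothetical placeholders, and treating a source with none of its required fields as "red" is an assumption of this sketch rather than a rule from the report.

```python
# Required fields per ATT&CK data component (field names are hypothetical).
REQUIRED = {
    "Process Creation": {"process_name", "command_line", "parent_process"},
    "Network Traffic Content": {"src_ip", "dest_ip", "url"},
    "File Creation": {"file_path", "process_name"},
}

# Fields actually observed in the SIEM; a missing key means the source is absent.
OBSERVED = {
    "Process Creation": {"process_name", "parent_process"},
    "Network Traffic Content": {"src_ip", "dest_ip", "url", "bytes_out"},
}


def build_matrix(required, observed):
    """Classify each component as green, yellow, or red and list missing fields."""
    matrix = {}
    for component, fields in required.items():
        seen = observed.get(component, set())
        present = fields & seen
        if present == fields:
            status = "green"   # every required field is present
        elif present:
            status = "yellow"  # some required fields are present
        else:
            status = "red"     # source absent or no required field present
        matrix[component] = {"status": status, "missing": sorted(fields - seen)}
    return matrix


coverage_matrix = build_matrix(REQUIRED, OBSERVED)
for component, row in coverage_matrix.items():
    print(f"{component:<26} {row['status']:<7} missing={row['missing']}")
```

Keeping the list of missing fields next to each status means the same structure can serve as both the engineering backlog and the audit artefact.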
### Long-term Strategy (3+ months)
1. **Implement Continuous CTI Integration:** Establish a formal feedback loop to continuously update the threat model and associated telemetry requirements based on evolving CTI regarding emerging TTPs and CTA actions.
2. **Schedule Periodic Coverage Validation:** Embed the process of validating log coverage and field completeness into a defined, recurring cadence (e.g., quarterly or semi-annually) or trigger it immediately upon major infrastructure or technology changes.
3. **Optimize Ingestion based on Intelligence:** Refine the log onboarding strategy to be strictly intelligence-driven, justifying the ingestion of every log source by its capability to detect a known, relevant threat, thereby controlling costs effectively.
4. **Formalize Security Requirements Documentation:** Utilize the completed coverage matrix as a formal audit artifact for regulatory compliance and executive reporting on detection posture maturity.
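
One way to make the intelligence-driven ingestion review concrete is the value-versus-cost heat map from the full report. The sketch below is illustrative only: the price per gigabyte, the cost cut-offs, the source names, and the volumes are invented, and the value rating is assumed to have been assigned by an analyst beforehand.

```python
PRICE_PER_GB = 0.50  # hypothetical ingestion price in dollars per GB

# Hypothetical log sources with analyst-assigned value and measured volume.
sources = [
    {"name": "edr-process", "techniques": ["T1059.001", "T1005"], "value": "high", "gb_per_day": 40},
    {"name": "firewall-allow", "techniques": [], "value": "low", "gb_per_day": 300},
    {"name": "proxy", "techniques": ["T1105"], "value": "medium", "gb_per_day": 60},
]


def monthly_cost(gb_per_day, price_per_gb=PRICE_PER_GB, days=30):
    """Approximate monthly ingestion cost from daily volume."""
    return gb_per_day * days * price_per_gb


def cost_band(cost, low_cutoff=500.0, high_cutoff=2000.0):
    """Bucket a monthly cost into low, medium, or high (cut-offs are assumptions)."""
    if cost >= high_cutoff:
        return "high"
    if cost >= low_cutoff:
        return "medium"
    return "low"


def heat_map_cell(source):
    """Return the (value, cost) cell a source falls into on the 3x3 heat map."""
    return source["value"], cost_band(monthly_cost(source["gb_per_day"]))


# Sources in the "low value / high cost" cell are candidates for optimisation.
optimisation_candidates = [s["name"] for s in sources if heat_map_cell(s) == ("low", "high")]
print(optimisation_candidates)
```

A candidate on this list is a prompt for review, not an automatic deletion: tiering, filtering, or shorter retention may be more appropriate than dropping the source outright.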
## Implementation Guidance
### For Small Organizations
- **Focus Triage:** Concentrate initial Threat-Informed Defence efforts on the top 5 TTPs most likely to impact your sector, rather than attempting full ATT&CK coverage immediately.
- **Leverage Native Tools:** Ensure that built-in logging features present in existing security tools (like EDR) are fully enabled to gather contextual data (e.g., parent-process correlation for PowerShell execution).
### For Medium Organizations
- **Formalize Mapping Process:** Officially adopt the MITRE ATT&CK framework as the standard language for defining detection requirements between the CTI team and the SOC engineering team.
- **Automate Gap Detection (Internal):** Use existing SIEM/analytics capabilities to periodically audit ingested data against the required fields matrix to automatically generate engineering tickets for incomplete logs.
### For Large Enterprises
- **Establish Data Governance:** Implement formal data governance policies dictating field inclusion/exclusion rules, ensuring security telemetry integrity overrides general storage optimization efforts for critical sources.
- **Create Federated Intelligence Requirements:** Develop a mechanism that consumes continuous CTI feeds and translates them directly into structured telemetry requirements for log-onboarding workflows, potentially integrating with specialized log engineering platforms.
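
Translating a CTI feed into telemetry requirements could look roughly like the sketch below. The feed schema (`actor`, `technique_id`, `first_seen`, `last_seen`) is an assumption made for illustration and is not a documented vendor format; the actors and entries are made up and are not real intelligence. The "tickets" are plain dictionaries, so no ticketing API is implied.

```python
import json

# Illustrative feed only: schema and entries are invented for this sketch.
FEED_JSON = """
[
  {"actor": "example-actor-a", "technique_id": "T1059.001",
   "first_seen": "2025-01-10", "last_seen": "2025-06-18"},
  {"actor": "example-actor-b", "technique_id": "T1047",
   "first_seen": "2025-06-01", "last_seen": "2025-06-20"},
  {"actor": "example-actor-c", "technique_id": "T1047",
   "first_seen": "2025-06-05", "last_seen": "2025-06-19"}
]
"""

# Techniques the current coverage matrix already addresses.
COVERED_TECHNIQUES = {"T1059.001", "T1105", "T1555.003", "T1190", "T1005"}


def draft_tickets(feed_text, covered):
    """Group feed entries for uncovered techniques into one draft ticket each."""
    tickets = {}
    for entry in json.loads(feed_text):
        technique = entry["technique_id"]
        if technique in covered:
            continue  # already in the matrix, nothing to raise
        ticket = tickets.setdefault(
            technique,
            {"technique_id": technique, "actors": [], "first_seen": entry["first_seen"]},
        )
        ticket["actors"].append(entry["actor"])
        # ISO 8601 dates compare correctly as strings.
        ticket["first_seen"] = min(ticket["first_seen"], entry["first_seen"])
    return list(tickets.values())


tickets = draft_tickets(FEED_JSON, COVERED_TECHNIQUES)
for ticket in tickets:
    print(ticket)
```

Grouping by technique rather than by actor keeps the output aligned with the coverage matrix, which is also keyed on techniques and their data components.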
## Configuration Examples
* **PowerShell Detection:** To meet the requirement for **T1059.001 PowerShell**, ensure collection of:
  * **Command Execution Logs (DS0017):** e.g., Windows Event ID 4688, specifically capturing the full command line.
  * **Process Creation Logs (DS0009):** Critical for parent-process correlation, enabling analysts to trace the execution chain leading to the PowerShell command.
* **Ingress Tool Transfer Detection (T1105):** This requires sufficient context from:
  * **File Logs (DS0022):** For creation/modification of transferred files.
  * **Network Traffic Logs (DS0029):** To identify outbound connections to known malicious hosts or common cloud storage locations used for tool staging.
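
To decide which of these collections to configure first, one option is to count how many in-scope techniques each data component supports. The mapping below is taken from the case study in the full report; ranking by technique count is an assumption of this sketch, and a real prioritisation would likely also weigh technique frequency and remediation effort.

```python
from collections import Counter

# Technique -> critical data components, as listed in the case study.
TECHNIQUE_COMPONENTS = {
    "T1059.001": ["Command Execution", "Process Creation"],
    "T1105": ["File Creation", "Network Traffic Content"],
    "T1555.003": ["File Access", "OS API Execution"],
    "T1190": ["Application Log Content", "Network Traffic Content"],
    "T1005": ["Process Creation", "Script Execution"],
}

# Count how many techniques depend on each data component.
component_demand = Counter(
    component
    for components in TECHNIQUE_COMPONENTS.values()
    for component in components
)

# Highest demand first; ties are broken alphabetically for stable output.
priority_order = sorted(component_demand.items(), key=lambda item: (-item[1], item[0]))

for component, technique_count in priority_order:
    print(f"{technique_count} technique(s): {component}")
```

For this case study, Network Traffic Content and Process Creation each support two techniques, so they would sit at the top of the configuration baseline.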
## Compliance Alignment
The adoption of Threat-Informed Defence practices strongly aligns with several industry standards by ensuring data quality and relevance:
* **NIST Cybersecurity Framework (CSF):** Directly supports the **Detect** function by ensuring comprehensive visibility necessary for identifying anomalies.
* **ISO/IEC 27002:** Supports requirements related to logging and monitoring (clause 12.4 "Logging and monitoring" in the 2013 edition; controls 8.15 "Logging" and 8.16 "Monitoring activities" in the 2022 edition).
* **CIS Critical Security Controls:** Aligns with controls focused on continuous monitoring and data collection integrity.
## Common Pitfalls to Avoid
1. **The "Ingest Everything" Trap:** Ingesting everything is neither sustainable nor cost-effective, but the opposite reflex is just as harmful: blindly excluding unknown or uncategorized log fields to save costs introduces "blind spots" that CTAs will exploit. Let the threat model decide what to keep and what to drop.
2. **Ignoring Contextual Fields:** Do not collect only the event ID. Contextual fields, such as command-line arguments or parent/child process relationships, are mandatory for high-fidelity detection.
3. **Stale Threat Models:** Relying on requirements derived from outdated threat intelligence leads to collecting irrelevant data while missing coverage for current TTPs.
4. **Treating Data Quality as Secondary:** Do not assume AI agents can "clean up" messy or delayed logs. Poor data quality will result in false negatives, regardless of AI sophistication.
## Resources
- **MITRE ATT&CK Framework:** For mapping techniques to required data sources (e.g., specific Data Source IDs like DS0017).
- **Continuous CTI Sources:** Utilize external curated CTI feeds relevant to your organization's sector and geography to drive initial adversary identification.
- **SIEM Configuration Documentation:** Refer to specific vendor documentation for enabling detailed logging features (e.g., PowerShell Script Block Logging or enhanced process auditing).