Full Report
Three seconds of audio is all it takes to clone a voice for fraud. Adaptive Security shows how deepfake calls trick employees into sending real money—and why most defenses don't catch them. [...]
Analysis Summary
# Tool/Technique: AI-Generated Deepfake Voice & Video Impersonation
## Overview
This technique involves the use of artificial intelligence to create high-fidelity synthetic media (audio and video) that clones the identity of a trusted individual. In a corporate environment, this is primarily used for **Business Email Compromise (BEC) 3.0** or "Executive Impersonation" to deceive employees into authorizing fraudulent financial transfers or divulging sensitive credentials.
## Technical Details
- **Type:** Technique / Social Engineering Enhancement
- **Platform:** Multimedia communication channels (Zoom, Microsoft Teams, PSTN/Phone, WhatsApp)
- **Capabilities:**
  - Real-time voice cloning from as little as 3 seconds of source audio.
  - Synchronized video deepfakes (face-swapping/lip-syncing) for live meetings.
  - Offline, moderation-free model execution on consumer-grade hardware.
- **First Seen:** Incidents spiked significantly in 2024; high-profile $25.6M Arup heist reported early 2024.
## MITRE ATT&CK Mapping
- **[TA0043 - Reconnaissance]**
  - [T1591 - Gather Victim Org Information (mapping org charts and approval workflows)]
- **[TA0001 - Initial Access]**
  - [T1566 - Phishing]
    - [T1566.004 - Spearphishing Voice (vishing)]
- **[TA0011 - Command and Control]**
  - [T1102 - Web Service (legitimate video conferencing software used for delivery; a loose fit, since ATT&CK scopes T1102 to C2 traffic)]
- **[TA0040 - Impact]**
  - [T1657 - Financial Theft (fraudulent wire transfers authorized by the deceived employee)]
## Functionality
### Core Capabilities
- **Voice Synthesis:** Utilizing Text-to-Speech (TTS) or Speech-to-Speech (STS) models to replicate timbre, pitch, and cadence of a target.
- **Video Manipulation:** Real-time overlay of an executive’s likeness onto an attacker’s face during live video calls.
- **Low Barrier to Entry:** Publicly available models (e.g., GitHub-hosted repositories) run locally without the safety filters that commercial services enforce.
### Advanced Features
- **Reconnaissance Integration:** Attackers map financial approval workflows and "standard operating procedures" (SOPs) to ensure the deepfake request mimics legitimate business urgency.
- **Interactive Personas:** Deepfaked candidates are used in hiring pipelines to pass video interviews and place an "Insider Threat" inside the organization.
## Indicators of Compromise
*Note: As this is a social engineering technique, traditional file-based IOCs are often absent.*
- **Behavioral Indicators:**
  - Visual artifacts in video calls (blurring around the mouth, unnatural eye blinking, glitches when the face moves to a profile view).
  - Unnatural prosody or robotic inflection in otherwise high-fidelity audio.
  - Unusual urgency combined with a request to bypass standard out-of-band verification.
  - Requests for "hiring" or "credential resets" from high-level executives via atypical channels.
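A minimal sketch of how the behavioral indicators above could be turned into a triage checklist. The indicator names, weights, and escalation threshold are illustrative assumptions, not values from the report; they would need tuning against an organization's own incident data.

```python
# Hypothetical weights for the behavioral indicators listed above.
# The numbers are assumptions for illustration only.
INDICATOR_WEIGHTS = {
    "visual_artifacts": 2,       # blurring near the mouth, profile-view glitches
    "unnatural_prosody": 2,      # robotic inflection in otherwise clean audio
    "urgent_bypass_request": 3,  # urgency plus a request to skip verification
    "atypical_channel": 1,       # executive request arriving via an unusual channel
}


def score_call(observed: set) -> int:
    """Sum the weights of every indicator observed on a call."""
    unknown = observed - set(INDICATOR_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    return sum(INDICATOR_WEIGHTS[name] for name in observed)


def triage(observed: set, escalate_at: int = 3) -> str:
    """Return 'escalate' when the combined score reaches the threshold."""
    return "escalate" if score_call(observed) >= escalate_at else "monitor"


if __name__ == "__main__":
    print(triage({"urgent_bypass_request"}))
    print(triage({"atypical_channel"}))
```

The bypass request is weighted highest here because it is the one indicator that does not depend on the fake being visibly or audibly flawed.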
## Associated Threat Actors
- **Commercial/Cyber-mercenaries:** Groups leveraging "Deepfake-as-a-Service" platforms.
- **Unspecified Fraud Syndicates:** Globally distributed groups targeting multinational finance departments (e.g., Singapore and Hong Kong incidents).
## Detection Methods
- **Behavioral Detection:** Monitoring for atypical communication patterns or high-value transfers initiated via video/voice call without a secondary digital "paper trail."
- **AI-Liveness Testing:** Implementing challenges during calls (e.g., asking the speaker to turn sideways or hold an object in front of their face, which often breaks deepfake renders).
- **Audio Analysis:** Using specialized software to detect spectral artifacts or unnatural silence patterns typical of synthetic speech.
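The behavioral detection rule described above can be sketched as a simple predicate: flag any high-value transfer that was initiated over a voice or video call and has no written record. The `TransferRequest` fields, channel labels, and the 10,000 USD default are hypothetical choices for illustration, not part of any specific product.

```python
from dataclasses import dataclass

# Channels treated as "call-initiated" in this sketch (an assumption).
CALL_CHANNELS = {"voice", "video"}


@dataclass(frozen=True)
class TransferRequest:
    amount_usd: float
    initiated_via: str        # e.g. "voice", "video", "email", "ticket"
    has_written_record: bool  # a secondary digital paper trail exists


def needs_review(req: TransferRequest, high_value_usd: float = 10_000.0) -> bool:
    """Flag high-value transfers requested on a call with no paper trail."""
    return (
        req.initiated_via in CALL_CHANNELS
        and req.amount_usd >= high_value_usd
        and not req.has_written_record
    )


if __name__ == "__main__":
    suspicious = TransferRequest(250_000.0, "video", has_written_record=False)
    print(needs_review(suspicious))
```

A rule like this only surfaces candidates for human review; it does not decide on its own whether a call was synthetic.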
## Mitigation Strategies
- **Verification Protocols:** Implementing mandatory "Out-of-Band" (OOB) verification for any financial transaction (e.g., calling the executive back on a known-good number).
- **Shared Secret Phrases:** Establishing non-digital "safe words" for high-stakes internal authorizations.
- **Security Awareness Training:** Specific modules focused on AI-voice phishing and deepfake recognition.
- **Hardening Procurement:** Establishing multi-signature requirements for any transfer above a specific threshold, regardless of who authorizes it on a call.
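As a rough illustration of how the OOB verification and multi-signature controls above might combine into one release policy, consider the sketch below. The `Transfer` type, the 50,000 USD threshold, and the two-approver requirement are assumptions chosen for the example, not recommendations from the report.

```python
from dataclasses import dataclass, field


@dataclass
class Transfer:
    amount_usd: float
    requested_by: str
    oob_verified: bool = False  # callback on a known-good number succeeded
    approvers: set = field(default_factory=set)


def may_release(
    transfer: Transfer,
    multisig_threshold_usd: float = 50_000.0,  # hypothetical threshold
    required_approvers: int = 2,               # hypothetical signer count
) -> bool:
    """Apply OOB verification to every transfer and multi-signature above the threshold."""
    if not transfer.oob_verified:
        return False
    if transfer.amount_usd < multisig_threshold_usd:
        return True
    # The requester never counts as one of their own approvers.
    independent = transfer.approvers - {transfer.requested_by}
    return len(independent) >= required_approvers


if __name__ == "__main__":
    t = Transfer(200_000.0, "cfo", oob_verified=True, approvers={"cfo", "controller"})
    print(may_release(t))
```

Excluding the requester from the approver set is the key design choice: a convincing impersonation of one executive on a call cannot, by itself, satisfy the policy.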
## Related Tools/Techniques
- **Vishing (Voice Phishing):** The non-AI predecessor to this technique.
- **BEC (Business Email Compromise):** Often used in tandem with deepfakes to provide "written" follow-up to a fake call.
- **Social Engineering:** The broader tactic of psychological manipulation.