Full Report
Google Threat Intelligence Group (GTIG) has published a new report warning about AI model extraction/distillation attacks, in which private-sector firms and researchers use legitimate API access to systematically probe models and replicate their logic and reasoning. [...]
Analysis Summary
# Tool/Technique: HonestCue
## Overview
HonestCue is a proof-of-concept malware framework observed leveraging the Gemini API to generate C# code for second-stage malware, which is then compiled and executed in memory. It represents an example of threat actors integrating LLM capabilities into their malware development lifecycle.
## Technical Details
- Type: Malware Framework
- Platform: Unknown (second-stage payloads are written in C#, which suggests Windows or another .NET-capable platform)
- Capabilities: Generates second-stage malicious C# code via an LLM API, compiles payloads in memory, executes payloads.
- First Seen: Late 2025
## MITRE ATT&CK Mapping
The framework's LLM usage touches several attack stages, from capability development through execution.
- **TA0042 - Resource Development**
  - **T1588 - Obtain Capabilities** (if used to research or generate exploits or tools)
- **TA0002 - Execution**
  - **T1059 - Command and Scripting Interpreter** (when executing the compiled C# in memory)
- **TA0005 - Defense Evasion**
  - **T1027 - Obfuscated Files or Information** (potentially T1027.004 - Compile After Delivery, since in-memory compilation and execution leave little for static analysis)
## Functionality
### Core Capabilities
- Utilizes the Gemini API to automate the generation of C# code for subsequent malicious payloads.
- Compiles the generated C# code.
- Executes the compiled payloads directly in memory (fileless execution of the second stage).
### Advanced Features
- Calls the Gemini API directly from the malware workflow, so second-stage code is generated on demand and development is accelerated.
## Indicators of Compromise
- File Hashes: N/A (Focus is on the framework/methodology)
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: Traffic to Gemini API endpoints (the public Gemini API is served from `generativelanguage.googleapis.com`; the specific endpoint abused is not detailed).
- Behavioral Indicators: Processes dynamically compiling and executing C# code generated from external API interactions.
## Associated Threat Actors
- No specific threat actor is clearly attributed. HonestCue is discussed in the general context of threat actors integrating LLM abuse into existing malware families, and attribution is less clear than for CoinBait.
## Detection Methods
Detection focuses on anomalous LLM API use in conjunction with code compilation:
- Identify network traffic to legitimate LLM providers from potentially compromised hosts that subsequently exhibit code compilation or execution behavior.
- Apply signature-based detection to LLM-generated C# code snippets that match known malicious patterns.
- Use behavioral detection to monitor for in-memory compilation chains originating from unexpected processes.
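The correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a production detection: the `Event` record, its field names, the `correlate` function, and the five-minute window are all hypothetical, and the host list assumes the public Gemini API hostname. Real telemetry (EDR, DNS logs, proxy logs) would need to be mapped into this shape first.

```python
from dataclasses import dataclass

# Assumption: hostnames treated as LLM API endpoints; extend for your environment.
LLM_API_HOSTS = {"generativelanguage.googleapis.com"}


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified telemetry record."""
    host: str         # endpoint that produced the record
    process: str      # process name
    kind: str         # "dns" or "compile"
    detail: str       # queried domain for "dns"; free text for "compile"
    timestamp: float  # seconds since epoch


def correlate(events, window_seconds=300.0):
    """Return (host, process) pairs where a process contacted an LLM API
    and the same process compiled code within window_seconds afterwards."""
    alerts = []
    last_llm_contact = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        key = (ev.host, ev.process)
        if ev.kind == "dns" and ev.detail in LLM_API_HOSTS:
            # Remember the most recent LLM API lookup for this process.
            last_llm_contact[key] = ev.timestamp
        elif ev.kind == "compile" and key in last_llm_contact:
            if ev.timestamp - last_llm_contact[key] <= window_seconds:
                alerts.append(key)
    return alerts


sample = [
    Event("ws-01", "updater.exe", "dns", "generativelanguage.googleapis.com", 100.0),
    Event("ws-01", "updater.exe", "compile", "in-memory C# compilation", 130.0),
    Event("ws-02", "devenv.exe", "compile", "normal developer build", 140.0),
]
print(correlate(sample))
```

Developer workstations legitimately do both things, so a rule like this is better treated as a triage signal scoped to hosts and processes where compilation is unexpected.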
## Mitigation Strategies
- **API Usage Monitoring:** Implement strict monitoring and rate limiting on API key usage, especially if keys are potentially exposed or misused by internal tooling.
- **Code Execution Hardening:** Employ application control solutions to restrict or heavily scrutinize in-memory compilation and execution of dynamically generated code.
- **Secure Development Lifecycle:** Ensure development environments are segregated from production/sensitive zones to prevent potential API credential leakage.
## Related Tools/Techniques
- CoinBait phishing kit (also noted to use AI code generation tools like Lovable AI).
- General use of LLMs for reconnaissance, phishing lure creation, C2 development, and vulnerability testing.
# Tool/Technique: CoinBait Phishing Kit
## Overview
CoinBait is a phishing kit designed to masquerade as a cryptocurrency exchange. Analysis indicates its development was significantly accelerated using AI code generation tools, evidenced by specific artifacts found in the source code.
## Technical Details
- Type: Phishing Kit / Tool
- Platform: Web-based (React SPA wrapper suggests client-side rendering)
- Capabilities: Credential harvesting, masquerading as a cryptocurrency exchange. Indicators suggest LLM assistance in development.
- First Seen: Unknown, but tied to recent AI abuse trends.
## MITRE ATT&CK Mapping
- **TA0001 - Initial Access**
  - **T1566 - Phishing**
    - **T1566.001 - Spearphishing Attachment** (if distributed via attachment) or **T1566.002 - Spearphishing Link** (if sent as a link via email or message)
## Functionality
### Core Capabilities
- Mimics cryptocurrency exchange interfaces to trick users into providing credentials.
- Wraps the malicious interface in a React Single Page Application (SPA) structure.
### Advanced Features
- The code base contains artifacts indicating development via AI platforms, notably logging messages prefixed with "Analytics:". Researchers believe these messages can help defenders track the kit's data exfiltration processes.
- The kit is strongly believed to have been built with the **Lovable AI platform**, as evidenced by a Lovable Supabase client and lovable.app artifacts.
## Indicators of Compromise
- File Hashes: N/A
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: N/A (Standard phishing infrastructure expected)
- Behavioral Indicators: Presence of "Analytics:" prefixed logging messages in the source code of web assets, development artifacts pointing to Lovable AI services.
## Associated Threat Actors
- General cybercriminals leveraging AI assistance.
## Detection Methods
- **Source Code Analysis:** YARA or tooling to scan web assets for known AI-generated artifacts, such as the "Analytics:" prefix in code comments or log statements.
- **Infrastructure Detection:** Create rules that alert on or block infrastructure hosting known CoinBait domains.
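As a rough illustration of the source code analysis approach, the Python sketch below scans the text of a web asset for the two artifact types named in this section. The pattern names and the `find_artifacts` function are hypothetical, and the regular expressions are assumptions derived from the description above rather than published signatures. Legitimate sites built with the same tooling may contain the same strings, so a hit is a reason to look closer, not a verdict.

```python
import re

# Assumption: patterns derived from the artifacts described in this report;
# they are illustrative, not vendor-published signatures.
ARTIFACT_PATTERNS = {
    # A string literal that starts with the "Analytics:" log prefix.
    "analytics_log_prefix": re.compile(r"""["'`]Analytics:"""),
    # Any reference to the lovable.app domain.
    "lovable_app_reference": re.compile(r"lovable\.app", re.IGNORECASE),
}


def find_artifacts(source: str) -> list[str]:
    """Return the sorted names of all artifact patterns found in source."""
    return sorted(
        name for name, pattern in ARTIFACT_PATTERNS.items()
        if pattern.search(source)
    )


suspicious = (
    'console.log("Analytics: credentials submitted");\n'
    'const api = "https://example.lovable.app/functions";\n'
)
print(find_artifacts(suspicious))
print(find_artifacts('console.log("hello");'))
```

The same patterns could be expressed as YARA string rules; Python is used here only to keep the example self-contained.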
## Mitigation Strategies
- **User Education:** Train users to verify the legitimacy of cryptocurrency exchange sites via known, verified channels rather than clicking unsolicited links.
- **WAF/Email Filtering:** Enhanced inspection for known phishing patterns associated with cryptocurrency themes.
## Related Tools/Techniques
- LLM-assisted development generally.
- ClickFix campaigns (another example of social engineering aided by generative AI).
# Technique: AI Model Extraction / Knowledge Distillation
## Overview
This technique involves systematically probing a target Large Language Model (LLM), such as Google's Gemini, using authorized API access to replicate its logic, decision-making processes, and performance characteristics. The resulting data is then used to train a new, smaller model via knowledge distillation.
## Technical Details
- Type: Technique
- Platform: Cloud/API-based LLMs
- Capabilities: Cheap and fast replication of proprietary model functionality, intellectual property theft, model architecture discovery.
- First Seen: Described in a recent GTIG report (Feb 2026 context).
## MITRE ATT&CK Mapping
This technique maps primarily to adversary efforts to steal intellectual property and to develop capabilities from it. The mappings are loose analogies rather than exact fits.
- **TA0010 - Exfiltration**
  - **T1041 - Exfiltration Over C2 Channel** (by analogy: data is extracted via prompt responses)
- **TA0005 - Defense Evasion** (by creating a functionally equivalent, but unmonitored, internal model)
- **TA0042 - Resource Development**
  - **T1587 - Develop Capabilities** (not a strict fit; this represents developing a competing model asset from stolen outputs)
## Functionality
### Core Capabilities
- Systematically queries the target LLM with thousands or millions of prompts to gather input-output pairs.
- Significantly shortens the attacker's own model development timeline by building on the work done by the target provider.
- Significantly reduces the cost of training new models.
### Advanced Features
- **Knowledge Distillation:** The process of transferring the "knowledge" (the learned mapping functions) from the large, complex "teacher" model (Gemini) to a smaller, faster "student" model.
- **Scalability:** Demonstrated via a large-scale attack involving 100,000 prompts aimed at replicating reasoning across non-English languages.
## Indicators of Compromise
- File Hashes: N/A
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: High-volume, structured programmatic API calls to LLM endpoints that show patterned probing sequences rather than natural language queries.
- Behavioral Indicators: Automated, large-scale querying patterns designed to test comprehensive input space against the model.
## Associated Threat Actors
- Private-sector firms and researchers (commercial/competitive motivation).
- State-backed actors (for general capability enhancement).
## Detection Methods
- **API Rate Limiting & Anomaly Detection:** Detecting sudden, massive spikes in prompt submissions from a single API key, particularly if the prompts follow structured testing formats (as opposed to conversational dialogue).
- **Defense Implementation:** Google implemented customized defenses in Gemini’s classifiers to specifically make this abuse harder.
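A per-key volume check of the kind described above might look like the following Python sketch. It is an assumption-laden illustration: the function name, the `(api_key, timestamp)` input shape, and the window and threshold values are hypothetical, and it says nothing about how Google's own defenses work. It only shows the sliding-window idea of counting requests per API key.

```python
from collections import defaultdict, deque


def flag_high_volume_keys(requests, window_seconds=3600.0, max_requests=1000):
    """Flag API keys that exceed max_requests within any sliding window.

    requests: iterable of (api_key, timestamp) tuples, timestamp in seconds.
    Returns the set of flagged API keys.
    """
    recent = defaultdict(deque)  # api_key -> timestamps inside the window
    flagged = set()
    for api_key, ts in sorted(requests, key=lambda r: r[1]):
        window = recent[api_key]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(api_key)
    return flagged


# Hypothetical traffic: key-a sends a burst, key-b sends occasional requests.
traffic = [("key-a", float(i)) for i in range(50)]
traffic += [("key-b", i * 1000.0) for i in range(50)]
print(flag_high_volume_keys(traffic, window_seconds=60.0, max_requests=20))
```

Volume alone will also catch legitimate batch workloads, which is why the text pairs it with a check on whether prompts follow structured testing formats.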
## Mitigation Strategies
- **Strict API Access Control:** Implementing robust authentication and compartmentalization for API keys.
- **Rate Limiting:** Enforcing strict rate limits based on usage patterns that deviate from standard usage profiles.
- **Prompt Classification:** Enhancing classifiers not just for dangerous content, but for content indicative of extraction attempts (e.g., structured, repetitive, or comparative queries).
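One simple way to approximate the "structured, repetitive" signal is to reduce each prompt to a template and measure how much of a key's traffic shares the most common one. The Python sketch below does this with a deliberately crude normalization. The function names and normalization rules are hypothetical, and a real classifier would be far more robust than this heuristic.

```python
import re
from collections import Counter


def template_of(prompt: str) -> str:
    """Reduce a prompt to a rough template by masking variable parts."""
    text = prompt.lower()
    text = re.sub(r'"[^"]*"', "<str>", text)  # mask double-quoted payloads
    text = re.sub(r"\d+", "<num>", text)      # mask numbers
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace


def repetition_ratio(prompts: list[str]) -> float:
    """Fraction of prompts that share the single most common template."""
    if not prompts:
        return 0.0
    counts = Counter(template_of(p) for p in prompts)
    return counts.most_common(1)[0][1] / len(prompts)


# Hypothetical examples: templated probing versus ordinary conversation.
probing = [
    f'Translate "{word}" and explain step {i}'
    for i, word in enumerate(["cat", "dog", "bird", "fish"])
]
chat = [
    "How do I bake bread?",
    "What is the capital of France?",
    "Summarize this email for me",
    "Why is the sky blue?",
]
print(repetition_ratio(probing), repetition_ratio(chat))
```

A ratio near 1.0 over a large sample suggests programmatic templating. Combined with the volume signal, it could feed a rate-limit decision, though the cutoff would have to be tuned against legitimate batch usage.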
## Related Tools/Techniques
- Other specialized AI tools or scripts used to automate prompt generation and result parsing for distillation purposes.