Full Report
Cyber reports exposed major security flaws in DeepSeek’s R1 LLM
Analysis Summary
# Vulnerability: DeepSeek R1 LLM Demonstrates High Susceptibility to Prompt Injection and Jailbreaking Attacks
## CVE Details
- CVE ID: Not applicable (Described as generalized security weaknesses/benchmark failures, not single static vulnerabilities requiring CVE assignment at this time.)
- CVSS Score: N/A
- CWE: Analogous to CWE-116 (Improper Encoding or Escaping of Output) or CWE-749 (Exposed Unintended Loophole) in the context of LLM defenses.
## Affected Systems
- Products: DeepSeek-R1 (Latest Large Language Model)
- Versions: Specific version details are not provided, but the model under scrutiny is the primary reasoning LLM released by DeepSeek.
- Configurations: Vulnerable across standard use cases, especially when used in isolation or without robust system messaging/data markers. Vulnerabilities were noted in both bare prompting and configurations with system messages.
## Vulnerability Description
DeepSeek-R1 exhibits significant security weaknesses across multiple dimensions:
1. **High Susceptibility to Prompt Injection:** Tested against the WithSecure Spikee benchmark, R1 showed an Attack Success Rate (ASR) of 77% when used in isolation (Bare Prompt) and 55% even when provided with explicit rules and data markers (With System + Spotlighting). This indicates a failure to robustly distinguish between instructions and untrusted data, enabling cybersecurity threats like data exfiltration and XSS via crafted prompts.
2. **Ease of Jailbreaking:** Multiple independent tests confirm R1 is highly susceptible to jailbreaking techniques, including "Evil Jailbreak," "Deceptive Delight," and "Bad Likert Judge," easily overriding the model's inherent safety mechanisms.
3. **Insecure Output Generation:** Compared to OpenAI's o1, R1 was found to be four times more likely to generate insecure code and 11 times more likely to create harmful outputs during security framework testing.
## Exploitation
- Status: PoC available via research reports demonstrating successful jailbreaking and prompt injection attempts.
- Complexity: Low to Medium (Various simple and multi-turn techniques demonstrated success).
- Attack Vector: Network (via API/interface access to the LLM).
## Impact
- Confidentiality: High (Risk of unauthorized data exfiltration via prompt injection.)
- Integrity: High (Risk of generating security-vulnerable code or executing unintended actions based on malicious input.)
- Availability: Medium (Potential for resource exhaustion or denial of service via specific prompt chaining, although not the primary focus.)
## Remediation
### Patches
- No specific patch versions are listed as the source article was published before a vendor response. Users must await official patch releases from DeepSeek addressing the noted security training deficiencies.
### Workarounds
1. **Strict Input Validation:** Implement rigorous filtering and sanitization on all user-provided input before passing it to the LLM.
2. **Use System Messages and Spotlighting:** When integrating R1 into workflows, utilize the techniques shown to reduce ASR: provide specific rules (system message) and clearly delineate untrusted data (spotlighting markers).
3. **Least Privilege Principle:** Limit the data/system access the LLM workflow has, especially until security posture is improved.
4. **Output Scanning:** Implement security scanning tools over the LLM's generated code or output to catch potentially insecure artifacts.
## Detection
- **Indicators of Compromise (IoCs):** Anomalously long or complex input prompts, repeated conversational shifts toward prohibited topics (jailbreaking attempts), or output containing unexpected code snippets or sensitive data leakage patterns.
- **Detection Methods and Tools:** Utilize commercial or open-source LLM security monitoring tools capable of analyzing prompt structure against known attack patterns (like those defined in the OWASP Top 10 for LLMs framework). Monitor API request logs for high-entropy or unusually structured inputs.
## References
- Vendor Advisories: None available at time of publication.
- Relevant links:
- htfps://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard
- htfps://spikee.ai/#intro
- htfps://www.kelacyber.com/blog/deepseek-r1-security-flaws/
- htfps://unit42.paloaltonetworks.com/jailbreaking-deepseek-three-techniques/
- htfps://cdn.prod.website-files.com/6690a78074d86ca0ad978007/679918c4e37c71ea2179f6fb_Latest%20Red%20Teaming%20Deepseek_Jan2025.pdf
- htfps://protectai.com/blog/protect-ai-analyze-deepseek#title1