Full Report
The author of this post was curious about the various AI-native security scanners and wanted to find a product on the market that can identify vulnerabilities during a code review today. So they tried numerous products, learned how they worked, and wrote this blog post. Surprisingly, AI security auditors are advertised everywhere but can actually be found nowhere.

All of the products tested had a very similar set of offerings: full code, branch, and PR/R scans. ZeroPath has a SOC2 report generator. Some of them have hooks for things like GitHub Actions, bot guidance to developers, responses to PRs, and, naturally, IDE plugins. Finally, they all support auto-fix/remediation guidance as well.

The first step is to ingest all the code and index it appropriately. Once the code is uploaded, the tool can build the context the LLM needs to scan and understand it. Extra context about the types of issues to find can be added to scans as well.

The next part is more of the "secret sauce". Asking an LLM to find all vulnerabilities won't be very helpful, so how does the tool pick the particular code to focus on? It could ask for function-by-function or file-by-file analysis. Some use permissive CodeQL queries, opengrep, or some other AST traversal of the application. Once the tool has a candidate vulnerability, it performs a more detailed analysis to determine whether it is real. The final stage is reporting, which includes filtering out false positives and de-duplicating findings.

According to the author, the tools didn't report as many false positives as traditional SAST tools. Some were better or worse at specific languages, and some were better at particular vulnerability classes. Gecko and Amplify performed very poorly, finding no real bugs. Almanax was very inconsistent: it would sometimes find basic bugs and other times miss them. It was very good at finding deliberately backdoored code, though.
Corgea found about 80% of the purposely vulnerable code that was scanned. It had about a 50% false positive rate, which isn't really that bad. The language made a huge difference in quality for this tool. ZeroPath, according to the author, found 100% of the vulnerabilities in the corpora. Additionally, it identified legitimate bugs in real-world codebases, including curl and sudo. Most of the real-world bugs weren't security issues, but they were bugs nonetheless. This was the best tool of the bunch.

Some takeaways: the biggest benefit is surfacing inconsistencies between developer intent and the actual implementation. The tools were good at finding business logic issues. They may replace pentesters in the long term, or at least supplement them. For things without millions of dollars on the line, they are already a good fit.

I really like the tone of the article and the perspective of seeing the AI as a helper, for instance the point that while the AI does miss bugs, so do humans. The comparisons are realistic, which I appreciate. Good article!
Analysis Summary
# Tool/Technique: AI-Native Security Scanners (AI SAST)
## Overview
AI-native Static Application Security Testing (SAST) tools, also referred to as "AI Security Engineers" or "LLM Security Scanners," are a new generation of vulnerability detection products. Unlike traditional SAST that relies on rigid, predefined rule sets, these tools leverage Large Language Models (LLMs) to ingest, index, and "reason" about source code to identify security vulnerabilities, malicious intent, and complex logic bugs.
## Technical Details
- **Type:** Attack Tool / Defensive Security Framework
- **Platform:** Agnostic (Cloud-based/SaaS integration with GitHub, GitLab, IDEs)
- **Capabilities:** Deep code analysis, business logic flaw detection, auto-remediation/fixing, reachability analysis for SCA, and false positive reduction.
- **First Seen:** Evaluated as a mature product category circa late 2024 - 2025.
## MITRE ATT&CK Mapping
*Note: While these are defensive tools, they mirror the "Reconnaissance" and "Develop Capabilities" phases used by adversaries for automated vulnerability research.*
- **[TA0043 - Reconnaissance]**
  - **[T1592.002 - Gather Victim Host Information: Software]** (SCA Analysis)
- **[TA0001 - Initial Access]**
  - **[T1190 - Exploit Public-Facing Application]** (Used to find the zero-days for this tactic)
- **[TA0008 - Lateral Movement]**
  - **[T1210 - Exploitation of Remote Services]** (Identifying internal lateral movement vectors)
## Functionality
### Core Capabilities
- **Code Ingestion & Indexing:** Direct integration via PR/Branch scans to build a searchable context library of the codebase.
- **Contextual Analysis:** Uses LLMs to understand "developer intent" versus actual implementation.
- **Vulnerability Identification:** Detects traditional flaws (Buffer overflows, SQLi) and non-traditional flaws (Business logic errors, inconsistent state handling).
- **Auto-Remediation:** Generates code patches or remediation guidance specifically tailored to the identified bug.
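
The ingest, candidate-selection, validation, and de-duplication stages described above can be sketched in a few lines. This is a minimal illustration, not how any of the named products work: `RISKY_CALLS`, `Candidate`, `scan`, and `toy_validator` are hypothetical names, the first pass is a simple Python `ast` walk rather than CodeQL or opengrep, and the validator is a stand-in for the LLM call that a real tool would make.

```python
import ast
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical set of call names the cheap first pass treats as interesting.
RISKY_CALLS = {"eval", "exec", "system"}


@dataclass(frozen=True)
class Candidate:
    path: str
    function: str
    call: str
    source: str


def call_name(node: ast.Call) -> str:
    """Bare name of a call: eval(...) -> 'eval', os.system(...) -> 'system'."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def find_candidates(path: str, code: str) -> Iterable[Candidate]:
    """Stage 1: walk the file function by function and flag risky calls."""
    tree = ast.parse(code)
    for fn in ast.walk(tree):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        snippet = ast.get_source_segment(code, fn) or ""
        for node in ast.walk(fn):
            if isinstance(node, ast.Call) and call_name(node) in RISKY_CALLS:
                yield Candidate(path, fn.name, call_name(node), snippet)


def scan(path: str, code: str,
         validate: Callable[[Candidate], bool]) -> List[Candidate]:
    """Stages 2 and 3: de-duplicate candidates, then keep the validated ones."""
    seen = set()
    findings = []
    for cand in find_candidates(path, code):
        key = (cand.path, cand.function, cand.call)
        if key in seen:
            continue  # same function and call already considered
        seen.add(key)
        if validate(cand):
            findings.append(cand)
    return findings


def toy_validator(cand: Candidate) -> bool:
    """Stand-in for the LLM review: accept only functions that read user input."""
    return "input(" in cand.source


SAMPLE = '''
def run_user_expr():
    expr = input("expr: ")
    return eval(expr)

def run_constant():
    return eval("1 + 1")

def run_twice():
    eval(input())
    eval(input())
'''

findings = scan("sample.py", SAMPLE, toy_validator)
print([f.function for f in findings])
```

The split matters because the cheap pass can afford to be permissive: it only narrows down where the expensive, more detailed analysis is spent.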
### Advanced Features
- **Reachability Analysis:** Determines if a vulnerability in a third-party dependency (SCA) is actually reachable and exploitable within the specific context of the application.
- **Non-deterministic Auditing:** Unlike deterministic, rule-based scanners, repeated runs can surface different, "creative" paths to exploitation by simulating a "schizophrenic auditor" mindset.
- **Backdoor Detection:** High efficacy in identifying deliberately planted malicious code or backdoors (noted specifically in Almanax).
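
Reachability analysis, as described in the list above, reduces to a graph search once a call graph exists. The sketch below assumes the call graph has already been extracted and is given as a plain dictionary; the function and module names (`app.handle_upload`, `vendorlib.extract_all`, and so on) are hypothetical, and real tools would also need to reason about dynamic dispatch and data flow, which this does not attempt.

```python
from collections import deque
from typing import Dict, List, Optional


def reachable_path(call_graph: Dict[str, List[str]],
                   entry_points: List[str],
                   target: str) -> Optional[List[str]]:
    """Breadth-first search from the entry points.

    Returns one shortest call path ending at `target`, or None if the
    vulnerable function cannot be reached from any entry point.
    """
    queue = deque((entry, [entry]) for entry in entry_points)
    visited = set(entry_points)
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for callee in call_graph.get(node, ()):
            if callee not in visited:
                visited.add(callee)
                queue.append((callee, path + [callee]))
    return None


# Hypothetical call graph: keys are callers, values are the functions they call.
CALL_GRAPH = {
    "app.handle_upload": ["app.parse_archive"],
    "app.parse_archive": ["vendorlib.extract_all"],
    "app.health_check": [],
    "tools.legacy_import": ["otherlib.unsafe_load"],
}
ENTRY_POINTS = ["app.handle_upload", "app.health_check"]

print(reachable_path(CALL_GRAPH, ENTRY_POINTS, "vendorlib.extract_all"))
print(reachable_path(CALL_GRAPH, ENTRY_POINTS, "otherlib.unsafe_load"))
```

In this toy graph the second dependency is only called from `tools.legacy_import`, which is not an entry point, so a finding against it would be reported as unreachable.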
## Indicators of Compromise
*As these are legitimate security tools, "IOCs" in this context refer to artifacts of their integration:*
- **Network Indicators:**
  - `zeropath[.]io`
  - `corgea[.]com`
  - `almanax[.]ai`
- **Behavioral Indicators:** Automated commits or Pull Request comments originating from bot accounts (e.g., GitHub Actions bots).
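
To use the defanged domains above in a log search, they first need to be "refanged" and then matched against observed hostnames. The following is a small sketch under the assumption that matching the domain itself and any subdomain is the desired behavior; the subdomains in the example (`api.`, `app.`) are hypothetical and not taken from the source.

```python
from typing import Iterable, List

# Defanged indicators exactly as listed in this section.
DEFANGED = ["zeropath[.]io", "corgea[.]com", "almanax[.]ai"]


def refang(indicator: str) -> str:
    """Turn 'example[.]com' back into 'example.com' and normalize case."""
    return indicator.replace("[.]", ".").lower()


def matches_indicator(hostname: str, domains: Iterable[str]) -> bool:
    """True if hostname equals a listed domain or is a subdomain of one."""
    host = hostname.rstrip(".").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


DOMAINS: List[str] = [refang(d) for d in DEFANGED]

# Hypothetical hostnames as they might appear in proxy or DNS logs.
for host in ["api.zeropath.io", "corgea.com", "notzeropath.io", "example.org"]:
    print(host, matches_indicator(host, DOMAINS))
```

Matching on a leading dot rather than a bare suffix avoids flagging unrelated domains that merely end with the same characters.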
## Associated Threat Actors
- **Security Research Teams:** Used for "Scale-out" bug hunting.
- **Red Teams / Penetration Testers:** Used to accelerate the discovery phase of an engagement.
- **Development Teams:** Integrated into CI/CD for DevSecOps.
## Detection Methods
- **Behavioral Detection:** Monitoring for high-frequency API calls to LLM providers containing proprietary source code snippets.
- **Audit Logs:** Reviewing repository access logs for unauthorized AI-scanner integration or large-scale code exfiltration for "indexing" purposes.
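
The "high-frequency API calls" signal above can be approximated with a sliding-window counter. This sketch assumes the input has already been filtered down to requests bound for LLM provider endpoints and is sorted by time; the event shape, the source names, and the window and threshold values are all hypothetical, and it does not inspect request bodies for source code.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple


def flag_bursty_sources(events: Iterable[Tuple[float, str]],
                        window_seconds: float,
                        threshold: int) -> Set[str]:
    """Flag sources with at least `threshold` requests inside any window.

    `events` is an iterable of (timestamp_seconds, source) pairs, sorted
    by timestamp.
    """
    recent = defaultdict(deque)  # source -> timestamps still inside the window
    flagged = set()
    for ts, source in events:
        window = recent[source]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(source)
    return flagged


# Hypothetical, pre-filtered request log: (seconds since start, source host).
EVENTS = [
    (0, "ci-runner"),
    (0, "laptop"),
    (10, "ci-runner"),
    (20, "ci-runner"),
    (500, "laptop"),
]

print(flag_bursty_sources(EVENTS, window_seconds=60, threshold=3))
```

A threshold like this would need tuning against normal CI behavior, since a sanctioned scanner integration will produce the same burst pattern as an unsanctioned one.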
## Mitigation Strategies
- **Data Sovereignty:** Ensure AI scanners utilize "Zero Data Retention" policies to prevent source code from training public models.
- **Human-in-the-Loop:** Validate AI-generated fixes before deployment to ensure the "fix" does not introduce new logic flaws.
## Related Tools/Techniques
- **ZeroPath:** Highest efficacy of the tools tested; also found legitimate bugs in `curl` and `sudo`, most of which were not security issues.
- **Corgea:** Found about 80% of the purposely vulnerable code, with a false positive rate of about 50%.
- **Almanax:** Highly effective at finding backdoors/deliberate malicious code.
- **Traditional SAST/SCA:** Snyk, Checkmarx, CodeQL (often used by AI tools as a first-pass "filter").
- **Fuzzing:** AFL-fuzz (cited as the previous major milestone in automated bug hunting).
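
To make the Corgea figures above concrete, the arithmetic below estimates the triage load they imply. It rests on two assumptions that the source does not spell out: that "false positive rate" means the share of reported findings that turn out to be false, and that the rates hold uniformly across a codebase. The count of 50 seeded vulnerabilities is purely illustrative.

```python
from typing import Dict


def triage_load(seeded: int, detection_pct: int,
                false_positive_pct: int) -> Dict[str, float]:
    """Estimate report volume from a detection rate and a false positive share.

    Assumes false_positive_pct is the percentage of *reported* findings
    that are false, so true positives make up the remaining share.
    """
    true_positives = seeded * detection_pct / 100
    reported = true_positives * 100 / (100 - false_positive_pct)
    return {
        "true_positives": true_positives,
        "missed": seeded - true_positives,
        "reported": reported,
        "false_positives": reported - true_positives,
    }


# Illustrative only: 50 seeded vulnerabilities, ~80% found, ~50% of reports false.
print(triage_load(seeded=50, detection_pct=80, false_positive_pct=50))
```

Under these assumptions a reviewer would read 80 reports to confirm 40 real issues, with 10 seeded issues never reported at all.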