Full Report
Contest platforms in web3 are an alternative to standard security reviews. The auditing firm Zellic bought the contest platform Code4rena (C4) last year and has now written a report on the metrics used to market audit contests. Naturally, the platform wants to look better, but you've got to know when things are snake-oily. Imo, some of this just feels like a competitor bash (especially the screenshots, where it's obvious who they're calling out), but there are some good points.
The true benefit of audit competitions is the number of eyes and the range of skills your code gets. A traditional audit is a low-hanging-fruit audit with known entities. In a contest, the participants are incentivized to find unique and high-impact issues, so the coverage is better, at least in theory.
The first metric is finding count. Many of the reported finding counts include invalid issues or skip de-duplication to inflate the numbers. Teams only care about the high and medium severity bugs.
The next metric is participant numbers. There's a difference between participants and useful participants. A number like "participants who submitted a valid finding" would be much better. It's also hard to know how much time those participants actually spent on the code. However, this final point is true on all platforms.
The third one is "Claims about exclusivity". A general issue with contests is knowing whether good researchers are looking at your code. At Cantina, they have pre-paid folks to work on audits. On Sherlock, they have a Lead Watson who gets an automatic part of the prize pool. Having full-time people on your platform is better than having none at all. There's a concern about whether these folks are actually spending the time on your project. If they weren't, they would probably lose their contract with the contest platform. The concerns are valid (are they on it the entire time, who is managing this, etc.), but some is better than the none of C4.
Comparisons between audit contests and traditional audits are usually somewhat confusing. Severity scales are different, "fake" vulnerabilities are sometimes presented in both cases, and asymmetric comparisons are made on codebases that are either different projects or audited at different points in time. This is a fair callout. It's good to consider the differences between the platforms when deciding where to host competitions and where to participate as a hacker. This article has some good points but also has a very skewed perspective, since the author is A) an auditing firm and B) the owner of C4. So, take the content with a grain of salt.
Analysis Summary
# Best Practices: Selecting and Evaluating Audit Competitions
## Overview
Audit competitions (or "competitive audits") are a hybrid security model combining time-boxed traditional reviews with the crowdsourced "many-eyes" approach of bug bounties. These practices address the need for high-impact vulnerability discovery while navigating the potential for misleading performance metrics and "snake-oil" sales tactics in the Web3 security space.
## Key Recommendations
### Immediate Actions
1. **Demand De-duplicated Data:** When reviewing a platform's track record, require metrics that count only *unique* vulnerabilities, not the total number of submissions (which often includes duplicates of the same bug).
2. **Filter for Quality:** Focus exclusively on High and Medium severity findings. Disregard "Low" or "Informational" counts, as these are often used to inflate volume without increasing security posture.
3. **Verify Participant Activity:** Ask for the number of researchers who submitted at least one *valid* finding, rather than the total number of "registered" users.
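As a rough illustration of the three checks above, the sketch below recomputes a contest's headline numbers from a list of submissions. The `Submission` fields, the `root_cause_id` label (assumed to be assigned by a judge, with duplicates sharing one id), and the sample data are all hypothetical and do not reflect any platform's actual export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    researcher: str
    root_cause_id: str  # hypothetical judge-assigned label; duplicates share one id
    severity: str       # "high", "medium", "low", or "info"
    valid: bool


def summarize(submissions):
    """Compare the raw submission count with the de-duplicated, filtered metrics."""
    valid = [s for s in submissions if s.valid]
    # Count each root cause once, and only if it is High or Medium severity.
    unique_high_medium = {
        s.root_cause_id for s in valid if s.severity in ("high", "medium")
    }
    # A researcher counts only if at least one of their submissions was valid.
    researchers_with_valid_finding = {s.researcher for s in valid}
    return {
        "raw_submissions": len(submissions),
        "unique_high_medium": len(unique_high_medium),
        "researchers_with_valid_finding": len(researchers_with_valid_finding),
    }


# Made-up sample data for illustration only.
SAMPLE = [
    Submission("alice", "reentrancy-withdraw", "high", True),
    Submission("bob", "reentrancy-withdraw", "high", True),   # duplicate of alice's
    Submission("carol", "rounding-fee", "medium", True),
    Submission("dave", "gas-nit", "info", True),
    Submission("erin", "not-a-bug", "high", False),           # judged invalid
]

print(summarize(SAMPLE))
```

With this sample, five raw submissions collapse to two unique High/Medium issues, which is the gap a marketing page can hide.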
### Short-term Improvements (1-3 months)
1. **Normalize Severity Scales:** Before starting a competition, align the platform’s severity definitions (High/Medium/Low) with your organization's internal risk framework to ensure reported issues are actionable.
2. **Cross-Reference Audit Reports:** Request full audit reports from previous competitions on codebases of similar size (measured in SLoC) to verify the quality of descriptions and the validity of the findings.
3. **Evaluate Reviewer Incentives:** Determine if the platform utilizes "Lead" auditors or guaranteed participants (e.g., Sherlock’s Lead Watson or Cantina’s managed researchers) to ensure a baseline level of professional coverage.
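One way to make the severity alignment concrete is an explicit lookup table that fails loudly on labels nobody has mapped yet. This is a minimal sketch: the internal `P1`/`P2`/`P3` scale and the mapping itself are assumptions for illustration, not any platform's or framework's official definitions.

```python
# Hypothetical mapping from a platform's labels to an internal priority scale.
PLATFORM_TO_INTERNAL = {
    "high": "P1",
    "medium": "P2",
    "low": "P3",
}


def normalize_severity(platform_label: str) -> str:
    """Translate a platform severity label into the internal scale."""
    key = platform_label.strip().lower()
    try:
        return PLATFORM_TO_INTERNAL[key]
    except KeyError:
        # An unmapped label is a sign the scales were never aligned; stop here
        # rather than silently guessing a priority.
        raise ValueError(f"unmapped platform severity: {platform_label!r}") from None


print(normalize_severity("High"))
```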
### Long-term Strategy (3+ months)
1. **Hybrid Security Pipeline:** Integrate audit competitions as a secondary layer after a traditional consultative audit. Use traditional audits for "low-hanging fruit" and architectural review, and competitions for "edge-case" and high-complexity bug hunting.
2. **Escrow and Refund Policies:** Negotiate "Conditional Pools" where a portion of the prize pool is returned if no high-severity bugs are found, ensuring the platform is incentivized to attract top-tier talent.
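The conditional-pool idea can be sketched as a simple settlement rule. The function name, the 25% conditional fraction, and the all-or-nothing release rule are assumptions chosen for illustration; real contracts with a platform may structure this differently.

```python
from decimal import Decimal


def settle_pool(total_pool: Decimal, conditional_fraction: Decimal, high_found: bool):
    """Split a prize pool into a guaranteed part and a conditional part.

    The conditional part is paid out only if at least one valid high-severity
    finding exists; otherwise it is refunded to the sponsor.
    """
    if not Decimal("0") <= conditional_fraction <= Decimal("1"):
        raise ValueError("conditional_fraction must be between 0 and 1")
    conditional = total_pool * conditional_fraction
    guaranteed = total_pool - conditional
    if high_found:
        return {"paid_to_researchers": total_pool, "refunded_to_sponsor": Decimal("0")}
    return {"paid_to_researchers": guaranteed, "refunded_to_sponsor": conditional}


# Hypothetical 100,000 pool with a quarter of it held back as conditional.
print(settle_pool(Decimal("100000"), Decimal("0.25"), high_found=False))
```

`Decimal` is used instead of floats so that pool arithmetic stays exact.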
## Implementation Guidance
### For Small Organizations
- **Prioritize Fixed-Cost Pools:** Use competitions to get maximum "eyes on code" without the variable overhead of a long-term bug bounty.
- **Focus on SLoC:** Ensure your codebase is small enough (~3,000 SLoC) to be effectively covered by a standard prize pool.
### For Medium Organizations
- **Review Managed Options:** Opt for platforms that provide a "Lead" or managed researcher to ensure that even if the crowd misses something, a dedicated expert has reviewed the core logic.
- **Audit the Auditors:** Dedicate internal engineering time to "judge" or dispute invalid findings during the competition's mitigation phase.
### For Large Enterprises
- **Comparative Benchmarking:** Do not compare platforms based on raw bug counts. Compare based on "Criticals found per dollar spent" and "Researcher reputation."
- **Staged Rollouts:** Run competitions on specific, high-risk modules rather than the entire monolithic codebase to prevent researcher fatigue.
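A "criticals found per dollar spent" comparison might look like the sketch below. The platform names, costs, and finding counts are invented for illustration, and the metric is scaled per 10,000 USD only to keep the numbers readable. It assumes the counts are already de-duplicated and cover comparable codebases.

```python
def criticals_per_10k_usd(unique_criticals: int, cost_usd: int) -> float:
    """Unique critical findings per 10,000 USD of total contest cost."""
    if cost_usd <= 0:
        raise ValueError("cost_usd must be positive")
    return unique_criticals * 10_000 / cost_usd


# Invented figures: (unique critical findings, total cost in USD).
CONTESTS = {
    "platform_a": (3, 60_000),
    "platform_b": (2, 25_000),
}

ranking = sorted(
    CONTESTS,
    key=lambda name: criticals_per_10k_usd(*CONTESTS[name]),
    reverse=True,
)
print(ranking)
```

Here the platform with fewer raw findings ranks first once cost is taken into account, which is the point of not comparing on bug counts alone.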
## Common Pitfalls to Avoid
- **The "Participation" Trap:** Counting anyone who "joined" the competition Discord or clicked "Register" as a researcher.
- **Inflationary SLoC:** Being impressed by a platform’s claim of finding a bug every 10–20 lines; this usually indicates a lack of de-duplication or a focus on trivial issues.
- **Anonymous Testimonials:** Giving weight to unsourced praise. Always ask for verifiable case studies.
- **Asymmetric Comparisons:** Comparing an older version of your code audited by a firm to a newer, more stable version in a competition (or vice-versa).
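The finding-density pitfall can be turned into a quick sanity check. In this sketch, the 20-line threshold is taken from the upper end of the "every 10–20 lines" figure above. The function names and sample numbers are hypothetical, and a flagged result is only a prompt to ask about de-duplication, not proof of inflation.

```python
def lines_per_finding(sloc: int, reported_findings: int) -> float:
    """Average number of source lines per reported finding."""
    if sloc <= 0 or reported_findings <= 0:
        raise ValueError("sloc and reported_findings must be positive")
    return sloc / reported_findings


def looks_inflated(sloc: int, reported_findings: int, threshold_lines: float = 20.0) -> bool:
    """Flag a claimed density of one finding per `threshold_lines` lines or denser."""
    return lines_per_finding(sloc, reported_findings) <= threshold_lines


# Hypothetical claims for a 3,000 SLoC codebase.
print(looks_inflated(3_000, 200))  # one finding per 15 lines
print(looks_inflated(3_000, 12))   # one finding per 250 lines
```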
## Compliance Alignment
- **NIST SP 800-53:** Aligns with Vulnerability Monitoring and Scanning (RA-5) and Penetration Testing (CA-8).
- **ISO/IEC 27001:** Supports Annex A.12.6.1 (Management of technical vulnerabilities) in the 2013 edition, renumbered as A.8.8 in the 2022 edition.
- **SOC 2:** Provides evidence for Risk Assessment and Monitoring Activities within the Common Criteria.
## Resources
- **Code4rena [c4-defanged]:** Competitive audit platform metrics.
- **Sherlock [sherlock-defanged]:** Smart contract audit marketplace.
- **Cantina [cantina-defanged]:** Managed security research platform.
- **Immunefi [immunefi-defanged]:** Bug bounty platform for comparative research.