Over a dozen programs used by creators of nonconsensual explicit images have evaded detection on the developer platform, WIRED has found.
# Incident Report: Persistence of Deepfake Porn Generation Tools on GitHub
## Executive Summary
This report summarizes GitHub's ongoing failure to enforce its ban on projects used to create nonconsensual explicit deepfake material. Although GitHub introduced rules against such projects in June 2024, numerous deepfake generation programs continue to evade detection, often reappearing in archived form or in new repositories, which points to systemic weaknesses in content moderation for open-source code. The primary impact is the continued availability of tools for creating intimate image abuse material, which harms the people whose likenesses are exploited, including public figures.
## Incident Details
- Discovery Date: Ongoing (the WIRED investigation identified multiple projects; one specific example dates to late November 2024)
- Incident Date: The rules banning this content were introduced in June 2024; evasion has continued since then.
- Affected Organization: GitHub Inc. (as the hosting platform); also affected are individuals whose likenesses are used (e.g., Charli D’Amelio).
- Sector: Technology/Software Development Platform
- Geography: Global (Relates to content hosted on GitHub platforms accessible worldwide)
## Timeline of Events
### Initial Access
- Date/Time: Exposure predates June 2024, when GitHub introduced its rules, and abuse has continued after enforcement attempts.
- Vector: Uploading and hosting source code/programs designed for deepfake generation, specifically for nonconsensual sexual imagery, onto the GitHub platform.
- Details: A creator using the name "DeepWorld23" uploaded a sexually explicit deepfake video featuring an influencer. The underlying program had been disabled in August 2024 but reappeared in November 2024 in an archived format, leaving the harmful code accessible.
### Lateral Movement
- Not directly applicable to typical network lateral movement. The "movement" here is the proliferation of the harmful code across repositories (original, archived, and forked copies) on the platform despite takedown attempts.
### Data Exfiltration/Impact
- No data was exfiltrated in the traditional sense. The impact is the creation of nonconsensual intimate imagery (NCII) with the available code and its subsequent distribution and viewing (e.g., the Charli D’Amelio deepfake, which showed over 8,200 views).
### Detection & Response
- **Detection:** A WIRED investigation found more than a dozen active GitHub projects linked to deepfake pornography that had evaded detection. One specific program had already been disabled in August 2024, after the policy took effect.
- **Response Actions:** GitHub introduced rules in June 2024 banning projects used to synthetically create nonconsensual sexual images. Specific projects (such as the one linked to the D’Amelio deepfake) were disabled in August 2024, but the code remained accessible in archived form in November 2024.
## Attack Methodology
The methodology here relates to *policy evasion* rather than a traditional cyber-attack against GitHub's internal systems.
- **Initial Access:** Utilizing GitHub's platform to host and share development codebases.
- **Persistence:** Storing the malicious code in alternative formats (e.g., archived repositories) after initial takedown efforts, allowing continued access or easy reactivation.
- **Privilege Escalation:** Not applicable.
- **Defense Evasion:** Exploiting blind spots in automated or manual moderation by hosting code that facilitates abuse, potentially using obfuscation or subtle naming conventions that bypass keyword detection.
- **Credential Access:** Not applicable.
- **Discovery:** Not applicable (No internal reconnaissance observed).
- **Lateral Movement:** Not applicable in the network sense; code and content instead moved across repository states and formats.
- **Collection:** The code itself serves as the tool for "collection" (i.e., generating the synthetic media based on source images).
- **Exfiltration:** The resulting deepfake media is distributed publicly online.
- **Impact:** Enabling the widespread creation and dissemination of nonconsensual explicit imagery (Intimate Image Abuse).
## Impact Assessment
- **Financial:** Not quantified in the source; reputational damage and content moderation costs are implied.
- **Data Breach:** No traditional customer data breach reported. The impact involves the unauthorized creation and distribution of explicit likenesses of individuals.
- **Operational:** GitHub's reputation as a secure, policy-compliant code host is damaged due to the persistence of banned material.
- **Reputational:** Negative press regarding GitHub's "incomplete" crackdown and failure to fully remove harmful content.
## Indicators of Compromise
Given this is a policy violation/content hosting issue, IoCs are related to the content/code itself, not a system compromise:
- **Network indicators:** (None provided, as this focuses on platform content)
- **File indicators:** Presence of repositories containing archived codebases linked to deepfake pornography creation.
- **Behavioral indicators:** User comments soliciting information about deepfake creation programs ("What program did you use for creating the deepfake??").
## Response Actions
- **Containment:** Initial disabling of specific prohibited projects (August 2024).
- **Eradication:** The June 2024 policy update banned projects used to synthetically create nonconsensual sexual images, but eradication appears incomplete because code persists in archived form.
- **Recovery:** Not applicable (No system recovery needed, but policy enforcement remediation is ongoing).
## Lessons Learned
- **Policy Loopholes:** Relying solely on initial takedowns is insufficient; harmful code can persist through archival features and be easily redeployed or accessed.
- **Moderation Challenges:** Moderating open-source material, especially code designed for a specific malicious function, presents significant challenges, even with clear red flags.
- **Speed of Abuse:** The speed at which malicious actors can repost or find an alternative channel (like an archived format) requires continuous, proactive monitoring rather than reactive removal.
## Recommendations
- **Enhanced Archival Scanning:** Implement stricter, recurring automated scanning and review processes for all archived repositories where known policy-violating code was previously hosted or mentioned.
- **Proactive Code Analysis:** Improve machine learning models to detect code patterns/libraries associated with deepfake generation, regardless of repository status or minor naming variations.
- **Improved Takedown Verification:** When a project is disabled, verify that all derivative or archived copies referenced within the platform have also been subject to remediation steps.
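
The takedown verification recommendation can be illustrated with a minimal sketch. It assumes a moderation team keeps its own records of which repositories were forked or copied from which; the `Repo` record, its `status` values, and the repository names below are all hypothetical and do not correspond to any GitHub API. Starting from a project that was supposed to be disabled, the sketch walks every direct and indirect copy and reports any that are still accessible, including archived ones.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Repo:
    """Hypothetical moderation record; not a real GitHub API object."""
    name: str
    status: str  # assumed values: "active", "archived", or "disabled"
    derived_from: Optional[str] = None  # repo this one was forked or copied from


def find_unremediated_copies(repos: list[Repo], root: str) -> list[str]:
    """Return the root and any direct or indirect copies that are not disabled."""
    by_name = {repo.name: repo for repo in repos}
    children = defaultdict(list)
    for repo in repos:
        if repo.derived_from is not None:
            children[repo.derived_from].append(repo.name)

    still_accessible = []
    seen = {root}
    queue = deque([root])
    while queue:  # breadth-first walk over the copy graph
        name = queue.popleft()
        repo = by_name.get(name)
        # Archived repositories count as accessible: their code can still be read.
        if repo is not None and repo.status != "disabled":
            still_accessible.append(name)
        for child in children[name]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(still_accessible)


# Illustrative data only; every name here is made up.
repos = [
    Repo("example/tool", "disabled"),
    Repo("mirror-a/tool", "archived", derived_from="example/tool"),
    Repo("mirror-b/tool", "disabled", derived_from="example/tool"),
    Repo("mirror-c/tool", "active", derived_from="mirror-b/tool"),
    Repo("unrelated/project", "active"),
]
flagged = find_unremediated_copies(repos, "example/tool")
print(flagged)  # ['mirror-a/tool', 'mirror-c/tool']
```

Treating anything other than `disabled` as unremediated is a deliberate choice in this sketch: it reflects the report's finding that archived copies kept the code accessible after the original takedown.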