Full Report
An employee at Elon Musk's artificial intelligence company xAI leaked a private key on GitHub that for the past two months could have allowed anyone to query private xAI large language models (LLMs), which appear to have been custom made for working with internal data from Musk's companies, including SpaceX, Tesla and Twitter/X, KrebsOnSecurity has learned.
Analysis Summary
# Incident Report: Exposure of xAI Private LLM API Key on GitHub
## Executive Summary
An employee at xAI inadvertently exposed a private API key for the company's large language models (LLMs) in a public GitHub repository, leaving the key usable by anyone who found it for about two months. The key provided access to unreleased and private Grok models, some of which were fine-tuned on sensitive internal data from SpaceX and Tesla. A security researcher eventually noticed the exposure and it was reported to xAI, leading to the key's revocation.
## Incident Details
- Discovery Date: April 30 (when GitGuardian alerted xAI's security team)
- Incident Date: March 2 (when GitGuardian first alerted the xAI employee) to April 30 (when the key was removed)
- Affected Organization: xAI (with implications for SpaceX, Tesla, and Twitter/X)
- Sector: Artificial Intelligence, Technology
- Geography: Not explicitly disclosed; US-based, with global operations.
## Timeline of Events
### Initial Access
- **Date/Time:** On or before March 2.
- **Vector:** Accidental leak/Developer error.
- **Details:** A technical staff member at xAI committed code containing an API credential for an x.ai application programming interface (API) to a public GitHub repository.
### Lateral Movement
- **Details:** The exposed API key provided access to multiple private xAI LLM instances, including those developed for internal projects. This is potential unauthorized access to internal development resources through a single credential, not movement across internal network infrastructure.
### Data Exfiltration/Impact
- **Details:** The key granted access to at least 60 distinct datasets and several private/unreleased Grok models, including those fine-tuned on proprietary data from SpaceX and Tesla. There is no indication that federal government data was accessible, but the risk of exposing proprietary corporate development data is high.
### Detection & Response
- **How it was discovered:** Philippe Caturegli (Seralys) noticed the exposure, bringing it to the attention of GitGuardian researchers. GitGuardian systems had previously alerted the original developer on March 2.
- **Response actions taken:** GitGuardian alerted xAI's security team on April 30. xAI instructed GitGuardian to report via their HackerOne bug bounty program, and the repository containing the key was removed shortly thereafter.
## Attack Methodology
- **Initial Access:** Inadvertent repository commit by an employee (Human Error).
- **Persistence:** The key remained active and valid for approximately two months (March 2 to April 30).
- **Privilege Escalation:** Not applicable (The key already provided high-level access to specific models/APIs).
- **Defense Evasion:** The exposure was passive (a public code leak); active evasion techniques were not necessary for initial access.
- **Credential Access:** Credential discovery via public code monitoring tools (scanning GitHub).
- **Discovery:** No specific internal reconnaissance detailed, but the key provided access to development models.
- **Lateral Movement:** Access to multiple distinct internal LLM assets.
- **Collection:** Access to the back-end interface of private LLMs, potentially allowing model querying and data inference.
- **Exfiltration:** Not explicitly confirmed, but an attacker could have used malicious queries or prompt injection to extract training data or compromise the supply chain.
- **Impact:** Exposure of proprietary AI development assets and the datasets used to train them.
## Impact Assessment
- **Financial:** Not publicly detailed; likely costs include incident response, securing intellectual property, and potential regulatory scrutiny.
- **Data Breach:** Exposure of proprietary development data used to fine-tune private LLMs (SpaceX, Tesla data). At least 60 distinct datasets were accessible.
- **Operational:** Potential for prompt injection attacks or supply chain compromise if the models were further weaponized. The incident also weakens confidence in the security of xAI's internal AI development process.
- **Reputational:** Negative press regarding security maturity at Elon Musk's companies, especially concerning the handling of sensitive data within AI frameworks.
## Indicators of Compromise
- **Network indicators:** (N/A - The exposure was static code, not active malware communications. Any subsequent querying would use legitimate API endpoints.)
- **File indicators:** The public GitHub repository and commit that contained the API key (since removed).
- **Behavioral indicators:** Unauthorized access/queries against unreleased Grok model endpoints (e.g., `grok-2.5V`, `research-grok-2p5v-1018`, `tweet-rejector`, `grok-spacex-2024-11-04`).
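
The behavioral indicators above could be checked against API request logs with a simple watch list. The following is a minimal sketch, not xAI's actual tooling: the report does not describe any log schema, so the record layout (a dict with a `model` field) and the function name `flag_requests` are assumptions made for illustration. Only the model identifiers come from the report.

```python
from typing import Iterable

# Model identifiers taken from the behavioral indicators in this report.
WATCHED_MODELS = {
    "grok-2.5V",
    "research-grok-2p5v-1018",
    "tweet-rejector",
    "grok-spacex-2024-11-04",
}


def flag_requests(records: Iterable[dict]) -> list:
    """Return the records whose requested model is on the watch list.

    Assumption: each record is a dict with a "model" field. Records
    without that field are ignored rather than treated as errors.
    """
    return [r for r in records if r.get("model") in WATCHED_MODELS]
```

Calling `flag_requests` on a batch of parsed log entries returns only the entries that touched one of the listed models, which narrows the set of requests a reviewer has to inspect by hand.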
## Response Actions
- **Containment measures:** The repository containing the exposed key was removed/taken offline shortly after xAI was notified. The compromised API key was likely revoked immediately following the report.
- **Eradication steps:** Key rotation and potentially forensic review of any activity logged against the compromised API credentials during the two-month window.
- **Recovery actions:** Verification that all other private models and associated credentials are secure and not similarly exposed.
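
A forensic review of the exposure window, as described in the eradication step, might start by filtering request logs down to calls made with the compromised credential between March 2 and April 30. This sketch assumes a hypothetical log format in which each record carries an ISO 8601 `timestamp` with a UTC offset and a `key_id`; neither field name comes from the report. The report gives no year, so the caller supplies the window boundaries.

```python
from datetime import datetime


def in_exposure_window(records, key_id, start, end):
    """Return records made with `key_id` between `start` and `end`, inclusive.

    Assumptions: `records` is an iterable of dicts with "timestamp"
    (ISO 8601 with an explicit offset, e.g. "...T10:00:00+00:00") and
    "key_id" fields; `start` and `end` are timezone-aware datetimes.
    """
    hits = []
    for rec in records:
        ts = datetime.fromisoformat(rec["timestamp"])
        if rec["key_id"] == key_id and start <= ts <= end:
            hits.append(rec)
    return hits
```

Any record this returns was made while the key was publicly visible, so it cannot be assumed to come from a legitimate user without further checks such as source address or request pattern.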
## Lessons Learned
- **Key Takeaways:** The two-month delay between the initial automated alert (March 2) and direct company notification (April 30) highlights a significant failure in the developer's process for addressing security alerts. Strong secrets management policies are not being consistently enforced or adhered to, which allowed a live credential to be published in public code.
- **What could have been done better:** Mandatory, immediate follow-up on security alerts; mandatory use of secrets scanning and pre-commit hooks; and automated credential rotation capabilities.
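
To illustrate the pre-commit scanning mentioned above, here is a minimal sketch of a hook that inspects staged changes before a commit is created. The regular expressions are hypothetical: the report does not describe the format of the leaked key, so the `xai-` prefix and the generic `api_key = "..."` pattern are assumptions, and a production setup would more likely rely on a maintained scanner than on hand-written patterns. The only external command used is `git diff --cached --unified=0`, which prints the staged diff.

```python
import re
import subprocess

# Hypothetical patterns: the report does not describe the key format.
SECRET_PATTERNS = [
    re.compile(r"xai-[A-Za-z0-9]{20,}"),
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]


def find_secrets(text: str) -> list:
    """Return every substring of `text` that matches a secret pattern."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits


def staged_added_lines() -> str:
    """Return the lines added in the staged diff (requires git on PATH)."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    return "\n".join(
        line[1:] for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    )


def main() -> int:
    """Exit status for a pre-commit hook: non-zero blocks the commit."""
    hits = find_secrets(staged_added_lines())
    for hit in hits:
        # Print only a short prefix so the hook output does not leak the secret.
        print(f"possible secret in staged changes: {hit[:8]}...")
    return 1 if hits else 0
```

To use this as a hook, a script at `.git/hooks/pre-commit` would call `sys.exit(main())`. Because local hooks can be skipped by the developer, a check like this complements server-side or continuous scanning and does not replace it.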
## Recommendations
- Enforce secrets scanning at every stage: pre-commit hooks that block a commit containing a credential *before* it is pushed publicly, and continuous repository scanning that alerts the security team or automatically revokes any credential that still gets through.
- Strengthen monitoring and alert handling for automated secret exposure notifications (like those from GitGuardian), so that an alert sent to an individual developer also reaches the security team and is tracked until resolved.
- Review and segment access permissions for all LLM training data, especially proprietary data from subsidiaries like SpaceX and Tesla.
- Establish clear, non-bypassable guidelines for developer access credentials, potentially using short-lived, dynamically generated tokens instead of long-lived API keys.
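
The last recommendation, short-lived tokens in place of long-lived API keys, can be sketched with an HMAC-signed token that carries its own expiry time. This is an illustrative design using only the Python standard library, not a description of xAI's API or of any particular token standard; the function names, the token layout, and the `now` parameter (included so the expiry logic can be exercised deterministically) are all assumptions.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def issue_token(secret: bytes, subject: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Issue a token for `subject` that expires `ttl_seconds` from now."""
    issued = time.time() if now is None else now
    payload = f"{subject}:{int(issued + ttl_seconds)}".encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "." + base64.urlsafe_b64encode(sig).decode())


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the subject if the token is authentic and unexpired, else None."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    try:
        subject, expires = payload.decode().rsplit(":", 1)
        expires_at = int(expires)
    except ValueError:
        return None
    current = time.time() if now is None else now
    return subject if current < expires_at else None
```

With a scheme like this, a token that leaks into a public repository stops working once its lifetime elapses, which bounds the exposure to minutes or hours instead of the two months seen in this incident. The signing secret itself still has to be stored and rotated securely.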