Full Report
Auto-GPT is a command-line application that takes a high-level description of a goal and breaks it up into subtasks. It works by taking the initial text from the user and passing it to an LLM. Based on the LLM's response, a command is executed. The available commands range from browsing websites to writing files to executing Python code.

The authors set out to see whether input arriving from sources other than the user's own text could be a security threat, so they focused on `browse_website` and similar functions. One idea was to use a sponsored search result to return tainted data that would act as malicious input to the system. Data fetched from a website is passed back through the LLM, so anything returned to the user has to be part of the LLM's response. To get around this, they found that data included in a hyperlink was copied directly into the output, and they used further prompt injection to return arbitrary data as well.

Once there, they wanted to convince Auto-GPT to execute arbitrary code. They kept the code as small as possible to make sure the LLM did not rewrite it. Their plan was to use `requests` to fetch a script from the internet and `eval` it. Auto-GPT flagged this as a security issue, so they used some misdirection with `curl` to trick the program into thinking that the use of `eval` was safe in this case.

This was still at the level of steering Auto-GPT, though: the goal was code execution within the Docker container itself, not just control over the LLM. They found multiple commands that made this trivial, with `write_to_file` and `execute_shell` being the easiest. There is a catch: many of these commands require confirmation from the user. The authors found that ANSI escape sequences were rendered in the terminal. This could be used to spoof model statements, which is a pretty awesome bug.

At this point, even with code execution, the attacker is still inside the container. However, the Docker Compose file (`docker-compose.yml`) mounts itself into the container. Because of this, an attacker can write to it in order to escape the container on the next run. There is also an alternative setup in which the Python code is executed within a clean Docker container, which does not have this issue. However, `execute_python_code` has a directory traversal vulnerability that allows Python scripts outside the intended directory to be modified.

Overall, a super interesting post that dives into the future: multi-layer prompt injection to get access to dangerous functionality, then abuse of that functionality to get code execution. Pretty neat!
Analysis Summary
# Vulnerability: Multi-Layer Prompt Injection, Docker Escape, and Path Traversal in Auto-GPT
## CVE Details
- **CVE ID:** CVE-2023-37275 (ANSI Injection), CVE-2023-37274 (Path Traversal), CVE-2023-37273 (Docker Escape)
- **CVSS Score:** 9.8 (Critical) - *Aggregated score for the exploit chain*
- **CWE:** CWE-502 (Insecure Deserialization/Execution), CWE-22 (Path Traversal), CWE-74 (Injection)
## Affected Systems
- **Products:** Auto-GPT (Significant Gravitas)
- **Versions:** v0.4.0, v0.4.1, and v0.4.2
- **Configurations:**
  - Systems running in "Continuous Mode" (no user intervention).
  - Systems using Docker with the default `docker-compose.yml` mounting the project directory into the container.
  - Default installations using commands like `browse_website` or `execute_python_code`.
## Vulnerability Description
Auto-GPT is susceptible to a multi-stage attack starting with **Indirect Prompt Injection**. When Auto-GPT browses an attacker-controlled website, it parses malicious instructions hidden in the HTML/text.
1. **User Approval Bypass:** In non-continuous mode, attackers use **ANSI escape sequences** to manipulate the terminal output, spoofing "safe" system messages or hiding malicious commands to trick the user into granting execution permission.
2. **Path Traversal:** The `execute_python_code` command (v0.4.1/v0.4.2) fails to properly sanitize the `basename` argument, allowing the LLM to write Python scripts outside the intended workspace via `../` sequences.
3. **Docker Escape:** In self-built Docker environments, the `docker-compose.yml` file and the Auto-GPT source code are mounted inside the container with write permissions. An attacker can overwrite the `docker-compose.yml` or existing Python files to execute commands on the host system upon the next container restart.
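
The path traversal in item 2 comes down to a missing containment check on the script path. The following is a minimal sketch of such a check, not Auto-GPT's actual code or patch: the function name `resolve_in_workspace` is hypothetical, and it assumes the script file is named by appending `.py` to `basename` inside a workspace directory.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, basename: str) -> Path:
    """Return the script path for `basename`, or raise if it leaves `workspace`.

    Hypothetical helper for illustration; the `<basename>.py` naming is an
    assumption, not taken from the Auto-GPT source.
    """
    root = workspace.resolve()
    # Resolve ".." components (and symlinks) before comparing with the root.
    candidate = (root / f"{basename}.py").resolve()
    # Path.is_relative_to requires Python 3.9 or later.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {basename!r}")
    return candidate
```

With this check, a `basename` such as `../../autogpt/main` raises `ValueError` instead of yielding a path outside the workspace. An absolute `basename` is rejected as well, because joining an absolute path replaces the workspace prefix.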
## Exploitation
- **Status:** PoC available (demonstrated by Positive Security).
- **Complexity:** Medium (Requires crafting specific prompt injections to guide the LLM's reasoning).
- **Attack Vector:** Network (Remote via malicious website content).
## Impact
- **Confidentiality:** High (Full access to environment variables, API keys, and host files).
- **Integrity:** High (Ability to modify host system files and Docker configurations).
- **Availability:** High (Host system takeover or process termination).
## Remediation
### Patches
- **Update to Auto-GPT v0.4.3 or later.**
  - Fixes the ANSI escape sequence injection.
  - Resolves the path traversal in `execute_python_code`.
  - Updates Docker configurations to prevent container-to-host write access.
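
As a rough illustration of the kind of output filtering that addresses ANSI injection (this is a sketch, not the project's actual patch), untrusted text can be stripped of escape sequences and other control characters before it is printed. The names `CSI_RE`, `CONTROL_RE`, and `sanitize_for_terminal` are hypothetical.

```python
import re

# CSI sequences: ESC [ + parameter bytes + intermediate bytes + final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Remaining C0 control characters and DEL, keeping tab (\x09) and newline (\x0a).
# This also removes carriage returns, which can be used to overwrite a line.
CONTROL_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize_for_terminal(text: str) -> str:
    """Remove terminal control sequences from untrusted text before printing."""
    text = CSI_RE.sub("", text)
    return CONTROL_RE.sub("", text)
```

Applied to `"\x1b[31mred\x1b[0m"`, this returns the plain string `red`. Any stray escape byte that does not form a complete CSI sequence is removed by the second pass.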
### Workarounds
- Disable "Continuous Mode" to ensure a human reviews every command.
- Avoid using `browse_website` on untrusted or non-whitelisted domains.
- Run Auto-GPT in a highly restricted, throwaway VM rather than a persistent local environment.
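
The domain restriction above could be enforced with a small allow-list check in front of any browsing call. This is a minimal sketch under stated assumptions: `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, the host names are placeholders, and Auto-GPT is not claimed to expose such a hook.

```python
from urllib.parse import urlsplit

# Placeholder allow-list; replace with the hosts you actually trust.
ALLOWED_HOSTS = frozenset({"docs.python.org", "en.wikipedia.org"})


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allow-list."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # `hostname` excludes userinfo and port, so "trusted@evil.example" is
    # evaluated as "evil.example".
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

Exact host matching is deliberate here: suffix or substring matching would accept look-alike hosts such as `en.wikipedia.org.evil.example`.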
## Detection
- **Indicators of Compromise:**
  - Presence of `\x1b[` (ANSI) sequences in terminal logs or stored command history.
  - Unexpected modifications to `docker-compose.yml` or files outside the `auto_gpt_workspace`.
  - Outbound connections to unknown IPs from the `execute_python_code` function.
- **Detection methods:** Monitor system calls for unusual file-write operations originating from the Docker container process.
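
Two of the indicators above lend themselves to simple scripted checks. The sketch below is illustrative only: the function names are hypothetical, the log is assumed to be available as plain text, and the known-good hash is assumed to have been recorded from a trusted copy of `docker-compose.yml`.

```python
import hashlib
import re
from pathlib import Path

# Matches a raw ESC [ as well as the escaped spellings \x1b[ and \u001b[
# that may appear in JSON or logged model output.
ESC_RE = re.compile(r"\x1b\[|\\x1b\[|\\u001b\[", re.IGNORECASE)


def lines_with_ansi(log_text: str) -> list[int]:
    """Return 1-based numbers of log lines containing ANSI escape introducers."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ESC_RE.search(line)
    ]


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def file_changed(path: Path, known_good_sha256: str) -> bool:
    """Return True if the file no longer matches the recorded digest."""
    return sha256_of(path) != known_good_sha256
```

A typical use would be to record `sha256_of(Path("docker-compose.yml"))` before starting Auto-GPT and call `file_changed` with that digest after each session.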
## References
- Positive Security Advisory: hxxps[://]positive[.]security/blog/hacking-autogpt
- GitHub Security Advisories: hxxps[://]github[.]com/Significant-Gravitas/Auto-GPT/security/advisories
- Official Repository: hxxps[://]github[.]com/Significant-Gravitas/Auto-GPT