Full Report
Every now and then you run into a new file format and find that you have no tool to parse it. Or you are looking for an easy-to-use way for your mom to access the photos you sent her in a .tar archive. This is where file conversion services come in: a quick Google search for “online file converter” yields multiple results. One thing to keep in mind when converting files is that different file formats may support different features.
Analysis Summary
# Tool/Technique: Abuse of File Conversion Services via Symlinks
## Overview
This technique exploits differences in how archive file formats (e.g., ZIP, TAR, RAR) handle symbolic links (symlinks). An attacker uploads an archive containing symlinks to an online file conversion service; when the service converts it, the links are resolved on the server and the content of the linked files is written into the destination archive. The conversion service thus acts as an intermediary that gives the attacker arbitrary file read on the server hosting it.
## Technical Details
- Type: Technique
- Platform: Linux/Unix-like systems (where symlinks are commonly used and manageable via archive options, e.g., Linux hosts running the conversion service).
- Capabilities: Arbitrary File Read. Denial of Service is only a theoretical possibility, since it would require a write-back path that overwrites files.
- First Seen: The article was published on 01 October 2015, so the technique was publicly documented by that date.
## MITRE ATT&CK Mapping
- T1190 - Exploit Public-Facing Application
- T1083 - File and Directory Discovery
- T1005 - Data from Local System
## Functionality
### Core Capabilities
- **Archive Preparation:** Creating an archive (like a ZIP file) containing symbolic links that point to sensitive local files (e.g., `/etc/passwd`, files in `/proc`).
- **Symlink Preservation:** Using specific flags (e.g., `zip --symlinks`) during archive creation to ensure only the link reference, and not the target file content, is physically stored in the source archive.
- **Conversion Abuse:** Uploading the prepared archive to an online file conversion service that converts it to a format (like RAR) that *does not* support symlinks, forcing the conversion server to resolve the links and include the target file's content in the resulting archive.
- **Data Exfiltration:** Downloading the newly converted archive, which now contains the contents of the formerly linked files, achieving arbitrary file read.
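The point of the Symlink Preservation step, that the archive stores only the link reference, can be checked in a lab with a short Python sketch. It assumes the common Unix convention in which a ZIP entry's mode bits live in the upper 16 bits of `external_attr`; the helper names `make_symlink_zip` and `describe_entries` are hypothetical and only serve as a test fixture.

```python
import io
import stat
import zipfile


def make_symlink_zip(link_name: str, target: str) -> bytes:
    """Build an in-memory ZIP whose single entry is a symlink to `target`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo(link_name)
        info.create_system = 3  # 3 = Unix, so readers interpret the mode bits
        # The upper 16 bits of external_attr carry the Unix st_mode.
        info.external_attr = (stat.S_IFLNK | 0o777) << 16
        # The entry's data is the link target path, not the target's content.
        zf.writestr(info, target)
    return buf.getvalue()


def describe_entries(data: bytes) -> list[tuple[str, bool, bytes]]:
    """Return (name, is_symlink, stored_bytes) for every entry in the ZIP."""
    result = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            mode = info.external_attr >> 16
            result.append((info.filename, stat.S_ISLNK(mode), zf.read(info)))
    return result
```

Calling `describe_entries(make_symlink_zip("linktopasswd", "/etc/passwd"))` shows that the stored bytes are just the path string; the file content only appears later if a converter dereferences the link.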
### Advanced Features
- **Directory Symlinking:** Including symlinks that point to directories (e.g., `/tmp`) so that, on conversion, the contents of the whole directory tree may be pulled into the output. This widens the scope of information disclosure; a DoS would only arise in a theoretical write-back scenario in which critical files are overwritten.
## Indicators of Compromise
- File Hashes: N/A (Relies on user-crafted files)
- File Names: User-created artifacts like `/tmp/links`, `linktopasswd`, `archive.zip`.
- Registry Keys: N/A
- Network Indicators: N/A (The attack vector primarily revolves around file uploading to third-party conversion endpoints.)
- Behavioral Indicators: An archive uploaded without any sensitive system files comes back from the conversion service containing internal system files from the server.
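The behavioral indicator can be approximated by comparing the uploaded archive with the converted one. The sketch below is a simplification that assumes both sides are ZIP files readable with Python's `zipfile` (a real converter's output format, such as RAR, would need its own reader); `dereferenced_links` is a hypothetical helper name.

```python
import io
import stat
import zipfile


def _is_symlink(info: zipfile.ZipInfo) -> bool:
    """True if the entry's Unix mode bits mark it as a symbolic link."""
    return stat.S_ISLNK(info.external_attr >> 16)


def _symlink_names(data: bytes) -> set[str]:
    """Names of all symlink entries in a ZIP, without trailing slashes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {i.filename.rstrip("/") for i in zf.infolist() if _is_symlink(i)}


def dereferenced_links(uploaded: bytes, converted: bytes) -> list[str]:
    """List output entries that replaced an uploaded symlink with real data.

    An entry counts if it has the same name as an uploaded symlink, or sits
    underneath one (the directory-symlink case), and is no longer a link.
    """
    links = _symlink_names(uploaded)
    hits = []
    with zipfile.ZipFile(io.BytesIO(converted)) as zf:
        for info in zf.infolist():
            if _is_symlink(info):
                continue
            name = info.filename.rstrip("/")
            if any(name == l or name.startswith(l + "/") for l in links):
                hits.append(info.filename)
    return sorted(hits)
```

A non-empty result means the conversion turned at least one link into file content and deserves a closer look.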
## Associated Threat Actors
- This technique is generally classified as an application abuse flaw rather than actor-specific malware, though it could be used by any threat actor with web application testing or exploitation knowledge.
## Detection Methods
- **Signature-based detection:** Signature deployment on the conversion service itself to identify archives containing embedded symlink references pointing to sensitive paths (`/etc/passwd`, `/proc/*`).
- **Behavioral detection:** Monitoring the file extraction/conversion process on the server to determine if linked files are being dereferenced and added to the output archive instead of preserving the link structure (or failing the conversion safely).
- **YARA rules:** Difficult to apply directly to the web service interaction, but could be used to analyze uploaded ZIP files for suspicious symlink patterns if the service performs pre-analysis.
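As a rough illustration of the pre-analysis idea, the following Python sketch lists symlink entries in an uploaded ZIP and classifies their targets. The deny-list `SENSITIVE_PREFIXES` and the function names are assumptions made for this example, not part of any existing product, and the check is deliberately conservative: any relative target that starts by climbing out with `..` is flagged even if it would stay inside the archive root.

```python
import io
import posixpath
import stat
import zipfile

# Hypothetical deny-list; adjust to the paths that matter on the actual host.
SENSITIVE_PREFIXES = ("/etc/", "/proc/")


def classify_target(target: str) -> str:
    """Classify a link target as 'sensitive', 'absolute', 'escapes' or 'relative'."""
    norm = posixpath.normpath(target)
    if posixpath.isabs(norm):
        # normpath keeps a leading '//', so collapse it before matching.
        norm = "/" + norm.lstrip("/")
        for prefix in SENSITIVE_PREFIXES:
            if norm == prefix.rstrip("/") or norm.startswith(prefix):
                return "sensitive"
        return "absolute"
    if norm == ".." or norm.startswith("../"):
        return "escapes"
    return "relative"


def find_symlink_entries(data: bytes) -> list[tuple[str, str, str]]:
    """Return (entry_name, link_target, classification) for each symlink entry."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if not stat.S_ISLNK(info.external_attr >> 16):
                continue
            target = zf.read(info).decode("utf-8", errors="replace")
            findings.append((info.filename, target, classify_target(target)))
    return findings
```

A service could reject the upload whenever `find_symlink_entries` returns anything other than `relative` results, or simply whenever it returns anything at all.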
## Mitigation Strategies
- **Input Validation/Sanitization:** Thoroughly inspect archive contents *before* extraction or conversion, specifically checking for symbolic links.
- **Safe Extraction Environment:** If archives must be processed server-side, ensure the extraction utility is configured to ignore or safely discard symbolic links.
- **Containerization Isolation:** The article notes the provider mitigated this by running conversions in newly spawned Docker containers, limiting the scope of file system interaction available to the conversion process. Ensure file system access is strictly jailed (chroot/containerization) and does not permit traversing above the intended working directory.
- **Format Conversion Rigor:** Explicitly configure the conversion tool to handle link formats according to secure standards for the *target* format, or reject files containing links if the target format does not support them securely.
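The "reject files containing links" option can be sketched for TAR uploads with Python's standard `tarfile` module. This is a minimal example under the assumption that the service only needs regular files and directories; `LinkInArchiveError` and `reject_links` are hypothetical names, and the check complements, rather than replaces, running the conversion in an isolated container.

```python
import io
import tarfile


class LinkInArchiveError(ValueError):
    """Raised when an archive contains a link or another unsupported entry."""


def reject_links(data: bytes) -> list[str]:
    """Return member names if the TAR holds only files and directories.

    Raises LinkInArchiveError on the first symlink, hard link, device node
    or other special entry, before anything is extracted or converted.
    """
    names = []
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tf:
        for member in tf:
            if member.issym() or member.islnk():
                raise LinkInArchiveError(
                    f"link entry rejected: {member.name} -> {member.linkname}"
                )
            if not (member.isfile() or member.isdir()):
                raise LinkInArchiveError(f"unsupported entry type: {member.name}")
            names.append(member.name)
    return names
```

Failing the whole upload is the simpler design choice here: silently dropping link entries would also be safe, but it makes the converted output differ from the input in a way the user cannot see.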
## Related Tools/Techniques
- Archive-based exploit delivery mechanisms.
- Exploiting application parsing logic (e.g., vulnerabilities in custom file parsers).
- Techniques leveraging format feature inconsistencies between source and target files.