Full Report
Javier posted a simple shell script to our internal chat a few days ago. Its goal was to pull all the IP ranges for a country from https://ipinfo.io/ in preparation for a footprint (let’s use PL as an example). Since this involved pulling multiple web pages, I was interested to know what the most efficient way to do this in the shell would be. Truthfully, the actual problem of pulling data from the site or gathering BGP routes didn’t interest me; I wanted to look at how to do mass HTTP enum most efficiently with curl.
Analysis Summary
# Tool/Technique: curl / Shell Scripting for Mass HTTP Enumeration
## Overview
This summary covers shell techniques for performing mass HTTP enumeration efficiently with `curl` and utilities such as `xargs`. The context is a script that pulls the IP ranges for a country from `ipinfo.io`; the article optimizes it by comparing sequential, parallel, pipelined, and hybrid approaches.
## Technical Details
- Type: Tool / Technique
- Platform: Unix-like systems (Shell environments like Bash)
- Capabilities: Executing multiple HTTP requests; data retrieval; chaining commands via pipes (`|`).
- First Seen: Not specified (The article describes experimentation with existing tools in 2018).
## MITRE ATT&CK Mapping
The techniques described relate primarily to reconnaissance and gathering information needed for subsequent actions, utilizing common system utilities.
- **TA0043 - Reconnaissance**
  - **T1598 - Phishing for Information** (only loosely related: both aim to gather target data, but no phishing is involved here)
    - T1598.003 - Spearphishing Link (not applicable to this activity; listed only to indicate the information-gathering intent)
  - **T1595 - Active Scanning** (iterating through a large set of URLs is a form of active probing)
    - T1595.003 - Wordlist Scanning (closest sub-technique; T1595.002 is Vulnerability Scanning, which does not apply)
## Functionality
### Core Capabilities
The primary function analyzed is the efficient execution of numerous HTTP GET requests to retrieve data from a remote source (`ipinfo.io`).
1. **Sequential Fetching (Baseline):** Generating page numbers with `seq` and piping them to `xargs -I% curl` fetches the URLs one after the other. Each request starts a new `curl` process and waits for the previous one to finish, so throughput is limited by network latency.
2. **Parallel Processing:** Utilizing `xargs -P <N>` to execute HTTP requests simultaneously across multiple processes, significantly reducing execution time by overlapping network I/O wait states.
3. **HTTP Pipelining:** Using `curl --config` to hand many URLs to a single `curl` process, which reuses one TCP connection for multiple requests (referred to here as pipelining). This removes most of the per-request process startup and TCP handshake overhead and gave the fastest results in the tests observed.
4. **Data Filtering:** Using `grep` (specifically with regex like `AS[0-9]{1,9}` and `[0-9\.]{7,15}\/[0-9]{1,2}`) to extract specific data points (AS numbers, CIDR ranges) from the downloaded content.
5. **Data Ordering:** Using `sort -u` to ensure uniqueness and ordering of collected identifiers.
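
The capabilities above can be sketched in Bash roughly as follows. This is a minimal illustration under stated assumptions, not the article's script: `BASE_URL`, the numeric page paths, and all function names are hypothetical, and the default address is the local test server listed under Network Indicators. The extraction patterns are the ones quoted above, written without the backslash escapes that `grep -E` does not need.

```bash
#!/usr/bin/env bash
# Sketch only: BASE_URL and the /<n> page layout are placeholders,
# not values taken from the article.
BASE_URL="${BASE_URL:-http://127.0.0.1:8080}"

# 1. Sequential baseline: one curl process per URL, one after another.
fetch_sequential() {      # usage: fetch_sequential <first> <last>
  seq "$1" "$2" | xargs -I% curl -s "$BASE_URL/%"
}

# 2. Parallel: up to <N> curl processes at once via xargs -P.
fetch_parallel() {        # usage: fetch_parallel <first> <last> <N>
  seq "$1" "$2" | xargs -P "$3" -I% curl -s "$BASE_URL/%"
}

# 3. Single curl process: emit a curl config with one "url" line per page...
make_config() {           # usage: make_config <first> <last>
  seq "$1" "$2" | while read -r n; do
    printf 'url = "%s/%s"\n' "$BASE_URL" "$n"
  done
}

# ...and feed it to curl on stdin, so connections can be reused.
fetch_single_process() {  # usage: fetch_single_process <first> <last>
  make_config "$1" "$2" | curl -s --config -
}

# 4 + 5. Extract AS numbers or CIDR ranges from fetched pages, sorted and unique.
extract_asns()  { grep -oE 'AS[0-9]{1,9}' | sort -u; }
extract_cidrs() { grep -oE '[0-9.]{7,15}/[0-9]{1,2}' | sort -u; }
```

For example, `fetch_parallel 1 50 10 | extract_asns` would fetch fifty pages ten at a time and print the distinct AS numbers found, while `fetch_single_process 1 50 | extract_cidrs` would do the same job with one `curl` process.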
### Advanced Features
- **Hybrid Approach:** Combining concurrency (`xargs -P`) for the initial, smaller set of requests with pipelining for the subsequent large volume of requests (a single `curl` invocation fed many URLs, as with `--config`; the `-L` flag by itself only makes `curl` follow redirects). The aim is higher throughput without spawning one process per request.
- **Efficiency Measurement:** Using system utilities like `time`, `tshark` (for network statistics), and analyzing frame counts and data volume to quantify procedural improvements.
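
One way the hybrid approach and its measurement might look in Bash is sketched below. The URL layout (`/list/<page>` for listing pages, `/<ASN>` for per-AS pages), `BASE_URL`, and the function names are assumptions made for illustration; the article's actual script may differ.

```bash
#!/usr/bin/env bash
# Sketch only: BASE_URL and the URL paths below are hypothetical.
BASE_URL="${BASE_URL:-http://127.0.0.1:8080}"

# Turn AS numbers on stdin into a curl config, one "url" line per AS.
asns_to_config() {
  while read -r asn; do
    printf 'url = "%s/%s"\n' "$BASE_URL" "$asn"
  done
}

# Stage 1: a small set of listing pages, fetched concurrently with xargs -P.
# Stage 2: the larger set of per-AS pages, fetched by one curl process.
hybrid_fetch() {          # usage: hybrid_fetch <pages> <parallelism>
  seq 1 "$1" \
    | xargs -P "$2" -I% curl -s "$BASE_URL/list/%" \
    | grep -oE 'AS[0-9]{1,9}' | sort -u \
    | asns_to_config \
    | curl -s --config - \
    | grep -oE '[0-9.]{7,15}/[0-9]{1,2}' | sort -u
}

# Measurement (not run here):
#   time hybrid_fetch 5 10 > ranges.txt
# Frame and byte totals can be captured in a second terminal, for example
# (Linux loopback interface assumed):
#   tshark -i lo -f 'tcp port 8080' -q -z io,stat,0
```

Running each strategy under `time` against the same local server, with the same capture filter, keeps the comparison between approaches like for like.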
## Indicators of Compromise
Since the article focuses on performance tuning of legitimate system tools (`curl`, `xargs`, `grep`), there are no inherent malware IoCs. The indicators documented are related to the experimental setup:
- File Hashes: N/A
- File Names: N/A
- Registry Keys: N/A
- Network Indicators: None for detection purposes; the experiments used a local test server at `http://127.0.0.1:8080/`
- Behavioral Indicators: High volume of concurrent or pipelined outbound HTTP connections initiated by a shell process.
## Associated Threat Actors
N/A (This analysis describes performance tuning of standard system utilities, not a specific threat actor's custom tool.)
## Detection Methods
Detection focuses on identifying overly aggressive or unusual usage patterns of standard utilities, often associated with automated reconnaissance or data collection.
- Signature-based detection: Rules flagging `curl` invoked through `xargs -P` with a high degree of parallelism, or a single `curl` process issuing an unusually large number of requests in one run.
- Behavioral detection: Monitoring for a single parent process (the shell or `xargs`) spawning many `curl` instances in a short time, which is atypical for normal user activity.
- YARA rules: N/A
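
The behavioral check above could be prototyped with a small filter over process listings. This is only a sketch: the function name, the threshold, and the "PPID COMMAND" input format are assumptions, and a production detection would normally rely on process-creation telemetry rather than polling `ps`.

```bash
#!/usr/bin/env bash
# Sketch only: threshold and input format are illustrative assumptions.

# Read "PPID COMMAND" pairs on stdin and print every parent PID that has
# at least <threshold> children named curl, followed by the child count.
flag_curl_bursts() {      # usage: flag_curl_bursts <threshold>
  awk -v t="$1" '
    $2 == "curl" { n[$1]++ }
    END { for (p in n) if (n[p] >= t) print p, n[p] }
  ' | sort -n
}

# Live usage on Linux (not run here):
#   ps -A -o ppid= -o comm= | flag_curl_bursts 10
```

With `xargs -P`, the flagged parent is typically the `xargs` process rather than the interactive shell, so the parent's own command line is worth recording as well.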
## Mitigation Strategies
The mitigation strategies focus on process control and network monitoring rather than blocking the specific tools themselves.
- **Limiting Concurrency:** Restricting how many child processes non-privileged users can spawn, which caps the parallelism that `xargs -P` can actually achieve.
- **Rate Limiting and Throttling:** Implementing network controls to limit the request rate from internal or external sources targeting specific services (like `ipinfo.io`).
- **Network Monitoring:** Establishing baselines for request frequency and connection counts to detect anomalous mass enumeration activity.
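
As a rough illustration of the concurrency limit, Bash's `ulimit -u` can cap the number of processes available to a session. The helper name and the limit value below are hypothetical; on a managed system the same cap would more likely be set centrally (for example through PAM limits) than per command.

```bash
#!/usr/bin/env bash
# Sketch only: helper name and limit values are illustrative.

# Run a command in a subshell whose per-user process limit is <max>.
# The limit applies to the subshell and its children, not the calling shell.
with_proc_limit() {       # usage: with_proc_limit <max> <command> [args...]
  local max="$1"; shift
  ( ulimit -u "$max" && "$@" )
}

# Example (not run here): with a low cap, a wide xargs -P fan-out starts
# failing to fork instead of opening hundreds of connections.
#   with_proc_limit 64 sh -c 'seq 1 500 | xargs -P 200 -I% curl -s "http://127.0.0.1:8080/%"'
```

On Linux this limit counts all processes owned by the user, not only those started inside the subshell, and it is generally not enforced for root, so the value needs to be chosen with the user's normal workload in mind.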
## Related Tools/Techniques
- **`parallel`:** Mentioned as an alternative to `xargs -P`, although it performed slower in the specific tests cited.
- **`wget`:** Another common tool for recursive or mass downloads, often compared against `curl`.
- **Shell Scripting:** The foundational technique for automation.