Full Report
Howdy! I’m finally back with another malware deep dive report. This time we are digging into GCleaner, a Pay-Per-Install (PPI) loader first discovered in early 2019. It has been used to deploy other malicious families like Smokeloader, Amadey, Redline and Raccoon.

We will be working on this sample (SHA256: 020d370b51711b0814901d7cc32d8251affcc3506b9b4c15db659f3dbb6a2e6b).

## Initial Triage

Let’s start by running the sample in the Triage sandbox to get an overview of what it does.

We can see from the process tree that it drops and runs another binary out of the "%APPDATA%" folder with a seemingly random name, then kills itself using "taskkill" and deletes the sample binary from disk.

The network tab shows communications to different IP addresses, which Triage’s malware config tab lists as C2 servers. Each C2 has a different URL path; we will dig deeper to find out what each of them is responsible for.

Right when we open the sample in IDA we don’t have much to look at. There are some interesting strings and API imports, but they are not very helpful to start with. We can see a repeated pattern across the code where some values are pushed onto the stack and then XORed with 0x2E, so we first need to decrypt these values.

## String Decryption

Automating the decryption of stack strings in this sample can be a bit tricky. Luckily, I noticed a specific instruction that occurs after loading the encrypted strings onto the stack (cmp eax, [reg+4]). So we can find all occurrences of this instruction, then walk back to find the mov instructions and get the encrypted values. Let’s apply this in an IDA Python script.

```python
import struct

import idc
import ida_search
import ida_ua

# Lowest address used in the program
addr = idc.get_inf_attr(idc.INF_MIN_EA)

while True:
    # Search for "cmp eax, [reg+4]"
    addr = ida_search.find_binary(addr, idc.BADADDR, "3B ?? 04 00 00 00", 16,
                                  ida_search.SEARCH_NEXT | ida_search.SEARCH_DOWN)
    if addr == idc.BADADDR:
        break

    enc_bytes = b''
    ea = addr  # start walking back from the matched instruction

    # Search for possible stack strings in the previous 12 instructions
    for i in range(12):
        ea = idc.prev_head(ea)
        if (idc.print_insn_mnem(ea) == "mov"
                and idc.get_operand_type(ea, 0) == idc.o_displ
                and idc.get_operand_type(ea, 1) == idc.o_imm):
            # Get the value of the second operand
            operand_value = idc.get_operand_value(ea, 1)
```

The returned operand value is an integer but we need to store it as a byte array, so we first need to figure out the size of that operand to store it correctly.

```python
            # Get the size of the second operand
            insn = ida_ua.insn_t()
            ida_ua.decode_insn(insn, ea)
            operand_size = ida_ua.get_dtype_size(insn.Op2.dtype)

            # Specify the correct data type
            if operand_size == 4:
                operand_bytes = struct.pack(" [...] {dec_str}")

    # Set a comment with the decrypted string
    if dec_str and comment_addr != idc.BADADDR:
        set_comment(comment_addr, dec_str)
```

Here is the list of decrypted strings:

```
45.12.253.56
45.12.253.72
45.12.253.98
45.12.253.75/dll.php
mixinte
mixtwo
B
USERPROFILE
CCleaner
VLC media player
Acrobat Reader DC
Russian
admin
Shah
testBench
taskmgr
Taskmgr
wireshark
Process Hacker
Wireshark
C:\Program Files
C:\ProgramData
C:\Temp
C:\Program Files
C:\ProgramData
C:\Temp
/advertisting/plus.php?s=
&str=mixtwo
&substr=
/default/stuk.php
/default/puk.php
NOSUB
chk
/chk
test
```

We can now see the C2 IPs, URL paths and some other interesting strings. Let’s keep going.

## Anti Checks (or is it..?)

GCleaner is filled with host checks, but weirdly enough it doesn’t do anything with them. Maybe they were test features, or copy-pasted code? Not really sure, but let’s quickly go through them.

### Checking username

Get the current username using "GetUserNameA()" and compare it to hardcoded names ("admin", "Shah", "testBench").

### Checking foreground window

Get the title of the foreground window using "GetWindowTextA()" and compare it to hardcoded strings.

### Checking desktop files

Search for Desktop files with specific strings in their name ("CCleaner", "VLC media player", "Acrobat Reader DC").

### Checking locale and keyboard layout

Check if the computer locale is Russian and compare the keyboard layout against specific values (CIS countries).

## Dropped Binary

Looking back at the process tree, we need to figure out where that child binary with the random name comes from.

"%APPDATA%\{846ee340-7039-11de-9d20-806e6f6e6963}\34LMAylZs6FixF.exe"

The sample reads the "%APPDATA%" path using "getenv()", then creates a directory named after the GUID of the current hardware profile. If retrieving the hardware profile fails, it falls back to generating a random folder name. Other possible locations for creating the directory are "C:\Program Files", "C:\Temp" and "C:\ProgramData" (fallback locations).

Next it generates a random file name, appends the ".exe" extension to it, drops the file into the newly created directory and runs it from there. The binary file is hardcoded into the parent sample.

All that child binary does is…well…sleep for 10 seconds, that’s it :|

## C2 Communications

The actors behind GCleaner have been known to use the BraZZZers fast flux service to hide their infrastructure. It works more like a proxy system between the victims and the real C2 server.

Before reaching out to the C2 servers, GCleaner adds hardcoded HTTP headers (which could be used for a network sig) and a custom user-agent to each C2 request.

Now let’s figure out what each C2 request is responsible for.

### First C2

- IP: 45[.]12.253.56
- UA: OK

This C2 is likely responsible for bot registration. The sample will only continue execution if the server response is "0" or "1"; otherwise it goes to sleep and tries again.

The "str" and "substr" parameters in the C2 request above possibly refer to the campaign ID. GCleaner has been known to use similar values in the past like "usone", "ustwo", "euthree", "cafive", "mixshop", …

### Second C2

- IP: 45[.]12.253.72
- UA: OK

The first request to this C2 is responsible for getting an AES key. The key length must be between 10 and 100 bytes, otherwise execution is aborted.

The second request is responsible for getting an AES encrypted PE file (notice the filename in the response headers!). That PE file is decrypted using the key from the previous request.

The decryption routine is pretty trivial: the sample first calculates the SHA256 hash of the server key, then derives the session key used for decryption (AES_128).

After that it loads the decrypted PE file into memory (without touching disk) to get the address of an export function called "GetLicInfo", which is used in the next stage.

### Downloaded DLL

Before going further we first need to take a look at the downloaded PE file. To be able to analyze it we can either use the debugger to dump the decrypted file or get the encrypted response from the PCAP and decrypt it manually.

We can easily implement the decryption code in Python as follows:

```python
import hashlib
from Crypto.Cipher import AES

# Encrypted response body saved from the PCAP
with open("puk.php.bin", "rb") as f:
    enc = f.read()

key = "kvQoRqtcCyMtHmQyQXOUu".encode("utf-16le")  # Important to encode!!
sha256_hash = hashlib.sha256(key)
aes_key = sha256_hash.digest()[:16]  # AES-128 key: first 16 bytes of the hash

cipher = AES.new(aes_key, mode=AES.MODE_CBC, iv=b"\x00" * 16)
dec = cipher.decrypt(enc)

with open("out.bin", "wb") as f:
    f.write(dec)
```

Now let’s see what this export function "GetLicInfo" does.

Basically it sends an HTTP request to the supplied C2 server, then checks the response length. If the length is greater than 2048 bytes, it creates a new directory with a random name under the "%APPDATA%" or "%TEMP%" folder, then generates a random filename and appends the ".exe" extension to it.
Finally it writes the server response to a file on disk with the generated random filename and executes that file.

### Third C2

- IP: 45[.]12.253.75
- UA: B

This C2 is responsible for downloading further payloads. Notice that the user-agent used here is the one from the decrypted strings list, unlike the previous 2 C2s.

The address is supplied to the external function "GetLicInfo", which downloads and executes the payload as we stated above.

GCleaner tries to get a payload from the server for 10 iterations, with a sleep period of 2 seconds between tries. If no further payload is received from the server, the sample kills its process and deletes the parent file from disk.

### Fourth C2

- IP: 45[.]12.253.98

This C2 wasn’t used in the sample we are looking at.

## Config Extraction

We can reuse the IDA Python script from the string decryption step to build a standalone config extractor, as most of the interesting stuff is in the decrypted strings list.

The code can be found here. (this script is not optimized for production, it’s just for research purposes)

## Hunting

### Urlscan

The URL path of the first C2 request can be a good candidate to hunt for more C2s on urlscan. I looked at more samples and found these two URL patterns:

- s=NOSUB&str=...&substr=...
- sub=NOSUB&stream=...&substream=...

So we can use the "page.url" field to search for the first part of these patterns.

### Yara

We saw that many strings were encrypted, but we can use some of the hardcoded ones to create a simple yara rule for hunting more samples.

```
rule GCleaner {
    meta:
        description = "Detects GCleaner payload"
        author = "Abdallah Elshinbary (@_n1ghtw0lf)"
        hash1 = "020d370b51711b0814901d7cc32d8251affcc3506b9b4c15db659f3dbb6a2e6b"
        hash2 = "73ed1926e850a9a076a8078932e76e1ac5f109581996dd007f00681ae4024baa"

    strings:
        // Kill self
        $s1 = "\" & exit" ascii fullword
        $s2 = "\" /f & erase " ascii fullword
        $s3 = "/c taskkill /im \"" ascii fullword

        // Anti checks
        $s4 = " Far " ascii fullword
        $s5 = "roxifier" ascii fullword
        $s6 = "HTTP Analyzer" ascii fullword
        $s7 = "Wireshark" ascii fullword
        $s8 = "NetworkMiner" ascii fullword

        // HTTP headers
        $s9 = "Accept-Language: ru-RU,ru;q=0.9,en;q=0.8" ascii fullword
        $s10 = "Accept-Charset: iso-8859-1, utf-8, utf-16, *;q=0.1" ascii fullword
        $s11 = "Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0" ascii fullword
        $s12 = "Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1" ascii fullword

    condition:
        uint16(0) == 0x5a4d and 10 of them
}
```

## References

- https://medium.com/csis-techblog/gcleaner-garbage-provider-since-2019-2708e7c87a8a
- https://medium.com/csis-techblog/inside-view-of-brazzzersff-infrastructure-89b9188fd145

Analysis Summary
# Tool/Technique: GCleaner Payload Loader
## Overview
GCleaner is a Pay-Per-Install (PPI) loader first observed in early 2019. It contacts a series of Command and Control (C2) servers to download and execute follow-on payloads, and has been used to deploy malware families such as Smokeloader, Amadey, Redline, and Raccoon.
## Technical Details
- Type: Malware family (Loader)
- Platform: Windows
- Capabilities: Payload delivery, C2 communication, string decryption, process manipulation, dynamic file loading.
- First Seen: Early 2019
## MITRE ATT&CK Mapping
- TA0002 - Execution
  - T1059.003 - Command and Scripting Interpreter: Windows Command Shell
- TA0005 - Defense Evasion
  - T1027 - Obfuscated Files or Information
  - T1070.004 - Indicator Removal: File Deletion
- TA0011 - Command and Control
  - T1071.001 - Application Layer Protocol: Web Protocols
## Functionality
### Core Capabilities
* **Initial Execution & Cleanup:** Drops and executes a secondary, seemingly inert binary from the `%APPDATA%` path (or fallback locations like `C:\Program Files`, `C:\Temp`, `C:\ProgramData`), then self-deletes using `taskkill` and file erasure.
* **String Decryption:** Builds critical strings on the stack and decrypts them with a single-byte XOR (key `0x2E`). In this sample, each encrypted string is followed by a `cmp eax, [reg+4]` instruction, which can serve as an anchor for locating and decrypting the strings with an IDA Python script.
* **Host Environment Checks (Dormant/Test features):** Checks the username (`admin`, `Shah`, `testBench`), the foreground window title, desktop file names (`CCleaner`, `VLC media player`, `Acrobat Reader DC`), and the system locale and keyboard layout (Russian/CIS). The sample does not appear to act on the results of these checks.
* **Payload Dropping Mechanism:** Creates a directory named after the current hardware profile GUID (or a random name if the GUID cannot be retrieved) under `%APPDATA%` or a fallback location, then writes the embedded child binary there under a random file name with an `.exe` extension and runs it.
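
The string decryption step can be illustrated outside of IDA with a minimal sketch. It assumes the immediates have already been collected as (value, size) pairs in memory order and that each one is stored little-endian; `decode_stack_string` and `encode_for_demo` are hypothetical helper names, and only the single-byte key `0x2E` comes from the report.

```python
import struct

XOR_KEY = 0x2E  # single-byte XOR key reported for this sample

# struct format for each immediate size in bytes (little-endian, as on x86)
_PACK_FORMATS = {1: "<B", 2: "<H", 4: "<I"}


def decode_stack_string(immediates, key=XOR_KEY):
    """Decode a stack string from (value, size) pairs given in memory order."""
    raw = b"".join(struct.pack(_PACK_FORMATS[size], value)
                   for value, size in immediates)
    decoded = bytes(b ^ key for b in raw)
    # Keep everything before the first NUL terminator
    return decoded.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def encode_for_demo(text, key=XOR_KEY):
    """Build demo input: NUL-pad to a multiple of 4 bytes, XOR, split into dwords."""
    data = text.encode("ascii") + b"\x00"
    data += b"\x00" * (-len(data) % 4)
    data = bytes(b ^ key for b in data)
    return [(struct.unpack_from("<I", data, i)[0], 4)
            for i in range(0, len(data), 4)]


if __name__ == "__main__":
    # Round-trip one of the decrypted strings listed in the report
    print(decode_stack_string(encode_for_demo("/default/puk.php")))
```

In a real extractor the pairs would come from the `mov` instructions found by walking back from the anchor instruction, so their order may need to be reversed or sorted by stack offset first.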
### Advanced Features
* **Multi-Stage C2 Communication:** Employs a structured sequence of C2 interactions:
    1. **C2 IP 1 (Registration/Check-in):** Contacts the first server (e.g., 45[.]12[.]253[.]56) with what are likely campaign identifiers in the `str` and `substr` parameters. Execution continues only if the response is "0" or "1"; otherwise the sample sleeps and retries.
    2. **C2 IP 2 (Key and Payload Download):** Retrieves an AES key (10-100 bytes long), then an AES-encrypted PE file. The AES-128 session key is derived from the SHA256 hash of the server key, and the decrypted PE is loaded directly into memory without touching disk.
    3. **In-Memory Execution:** Resolves and calls an export function (`GetLicInfo`) from the decrypted PE file.
    4. **C2 IP 3 (Final Payload Download):** The loader passes the third C2 address (e.g., 45[.]12[.]253[.]75/dll.php) to `GetLicInfo`, which sends an HTTP request using the User-Agent `B`. If the response is larger than 2048 bytes, it is written under a random `.exe` name to a new random directory in `%APPDATA%` or `%TEMP%` and executed. The loader tries up to 10 times, sleeping 2 seconds between attempts.
* **Infrastructure Concealment:** The operators have been known to hide their infrastructure behind the BraZZZers fast flux service, which acts as a proxy layer between victims and the real C2 server.
* **Custom HTTP Signaling:** Uses hardcoded HTTP headers and a custom User-Agent for C2 communications, which can be used for network signatures.
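
The gating logic in steps 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the sample's actual code: the function names are hypothetical, the length check here counts characters of the key string (the report only says 10 to 100 bytes), and the UTF-16LE encoding and 16-byte truncation are taken from the decryption script in the full report.

```python
import hashlib

MIN_KEY_LEN, MAX_KEY_LEN = 10, 100  # accepted key length range per the report


def checkin_allows_execution(body: bytes) -> bool:
    """First C2: the sample proceeds only on a response of "0" or "1"."""
    return body in (b"0", b"1")


def derive_session_key(server_key: str) -> bytes:
    """Second C2: SHA256 over the UTF-16LE key, truncated to 16 bytes (AES-128)."""
    if not MIN_KEY_LEN <= len(server_key) <= MAX_KEY_LEN:
        raise ValueError("server key length outside the accepted range")
    return hashlib.sha256(server_key.encode("utf-16le")).digest()[:16]


if __name__ == "__main__":
    print(checkin_allows_execution(b"1"))
    print(derive_session_key("kvQoRqtcCyMtHmQyQXOUu").hex())
```

The derived key is what the report's script passes to AES in CBC mode with an all-zero IV when decrypting the downloaded PE file.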
## Indicators of Compromise
- File Hashes:
  - SHA256: `020d370b51711b0814901d7cc32d8251affcc3506b9b4c15db659f3dbb6a2e6b` (Sample analyzed)
- File Names:
  - Drops child binaries with random names and a `.exe` extension, often in directories derived from hardware GUIDs in `%APPDATA%` (e.g., `%APPDATA%\{846ee340-7039-11de-9d20-806e6f6e6963}\34LMAylZs6FixF.exe`).
- Registry Keys: None reported.
- Network Indicators:
  - C2 IP: `45[.]12[.]253[.]56` (Registration)
  - C2 IP: `45[.]12[.]253[.]72` (AES Key/PE Download)
  - C2 IP: `45[.]12[.]253[.]75` (Final Payload Download)
  - C2 IP: `45[.]12[.]253[.]98` (Unused in this sample)
  - URL Paths/Parameters: `/advertisting/plus.php?s=`, `/default/stuk.php`, `/default/puk.php`, `/dll.php`, parameters like `s=NOSUB&str=...&substr=...` or `sub=NOSUB&stream=...&substream=...`
  - User-Agents: `OK` (first and second C2), `B` (third C2)
- Behavioral Indicators:
  - Process termination via `taskkill` followed by self-deletion.
  - Reading `%APPDATA%` via `getenv()`.
  - In-memory decryption and execution of a PE file retrieved over HTTP.
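
The two check-in URL patterns can be matched in proxy or sandbox logs with a small helper. This is a rough sketch: `looks_like_gcleaner_checkin` is a hypothetical name, and matching on parameter names alone is an assumption that will also hit unrelated URLs using the same names, so results need manual review.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter-name sets seen in the two check-in URL patterns from the report
CHECKIN_PARAM_SETS = (
    {"s", "str", "substr"},
    {"sub", "stream", "substream"},
)


def looks_like_gcleaner_checkin(url: str) -> bool:
    """Return True if the query string carries one full set of check-in parameters."""
    # keep_blank_values so that empty parameters such as "substr=" still count
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    names = set(query)
    return any(required <= names for required in CHECKIN_PARAM_SETS)


if __name__ == "__main__":
    # Example URL assembled from the decrypted strings; the host is a placeholder
    url = "http://example.com/advertisting/plus.php?s=NOSUB&str=mixtwo&substr=mixinte"
    print(looks_like_gcleaner_checkin(url))
```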
## Associated Threat Actors
The report does not attribute GCleaner to a specific threat actor. As a Pay-Per-Install loader it has distributed Smokeloader, Amadey, Redline, and Raccoon, which suggests it serves multiple unrelated operations rather than a single group.
## Detection Methods
- **Signature-based Detection:** Match on hardcoded, unencrypted strings such as the self-deletion command fragments (`/c taskkill /im "`, `" /f & erase `, `" & exit`) or the custom HTTP headers (e.g., `Accept-Language: ru-RU,ru;q=0.9,en;q=0.8`).
- **Behavioral Detection:** Monitoring for processes that terminate themselves immediately after spawning a child process in `%APPDATA%`. Detecting network activity matching the multi-stage command structure.
- **YARA Rules:** The report provides a sample rule that requires an MZ header and at least 10 of its 12 hardcoded strings.
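
As one way to turn the hardcoded HTTP headers into a network-side check, the sketch below counts how many of them appear in a captured request. The header values are copied from the YARA rule in the report; the function name and the idea of scoring by match count are assumptions for illustration, and a threshold would need tuning against real traffic.

```python
# Hardcoded request headers taken from the report's YARA rule
GCLEANER_HEADERS = {
    "accept-language": "ru-RU,ru;q=0.9,en;q=0.8",
    "accept-charset": "iso-8859-1, utf-8, utf-16, *;q=0.1",
    "accept-encoding": "deflate, gzip, x-gzip, identity, *;q=0",
    "accept": ("text/html, application/xml;q=0.9, application/xhtml+xml, "
               "image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1"),
}


def count_matching_headers(request_headers: dict) -> int:
    """Count how many hardcoded GCleaner headers appear verbatim in a request."""
    # Header names are case-insensitive; values are compared exactly
    normalized = {name.lower(): value.strip()
                  for name, value in request_headers.items()}
    return sum(1 for name, expected in GCLEANER_HEADERS.items()
               if normalized.get(name) == expected)


if __name__ == "__main__":
    sample_request = {
        "User-Agent": "OK",
        "Accept-Language": "ru-RU,ru;q=0.9,en;q=0.8",
        "Accept-Encoding": "deflate, gzip, x-gzip, identity, *;q=0",
    }
    print(count_matching_headers(sample_request))
```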
## Mitigation Strategies
- Implement high-fidelity network-based detection rules targeting the identified C2 infrastructure and the custom HTTP headers and User-Agents (`OK`, `B`).
- Implement application control to restrict execution from temporary or application data directories (`%APPDATA%`, `%TEMP%`).
- Use endpoint detection that focuses on in-memory loading of executable content retrieved over the network.
## Related Tools/Techniques
* **Smokeloader, Amadey, Redline, Raccoon:** Subsequent malware families deployed by GCleaner.
* **BraZZZers fast flux service:** Infrastructure technique used by operators to hide C2 locations.