Deep Analysis of GCleaner

Full Report

Howdy! I’m finally back with another malware deep dive report. This time we are digging into GCleaner. GCleaner is a Pay-Per-Install (PPI) loader first discovered in early 2019, it has been used to deploy other malicious families like Smokeloader, Amadey, Redline and Raccoon. We will be working on this sample: (SHA256: 020d370b51711b0814901d7cc32d8251affcc3506b9b4c15db659f3dbb6a2e6b) Initial Triage Let’s start by running the sample in Triage sandbox to get an overview of what it does. We can see from the process tree that it drops and runs another binary out of "%APPDATA%" folder with a seemingly random name then it kills itself using "taskkill" and deletes the sample binary from disk. The network tab shows communications to different IP addresses which are considered as C2 servers in Triage’s malware config tab. Each C2 has a different URL path, we will dig deeper to find out what each of them is responsible for. Right when we open the sample in IDA we don’t have much to look at, there are some interesting strings and API imports but not very helpful to start with. We can see a repeated pattern across the code where some values are pushed into the stack then xored with 0x2E, so we first need to decrypt these values. String Decryption Automating the decryption for stack strings in this sample can be a bit tricky, luckily I noticed a specific instruction that occurs after loading the encrypted strings into stack (cmp eax, [reg+4]). So we can find all occurrences of this instruction then walk back to find the mov instructions and get the encrypted values. Let’s apply this to an IDA python script. # Lowest address used in the program addr = idc.get_inf_attr(INF_MIN_EA) while True: # Search for "cmp eax, [reg+4]" addr = ida_search.find_binary(addr, idc.BADADDR, "3B ?? 04 00 00 00", 16, ida_search.SEARCH_NEXT | ida_search.SEARCH_DOWN) if addr == idc.BADADDR: break enc_bytes = b'' # Search for possible stack strings in the previous 12 instructions for i in range(12): ea = idc.prev_head(ea) if (idc.print_insn_mnem(ea) == "mov" and idc.get_operand_type(ea, 0) == idc.o_displ and idc.get_operand_type(ea, 1) == idc.o_imm): # Get the value of the second operand operand_value = idc.get_operand_value(ea, 1) The returned operand value is an integer but we need to store it as a byte array, so we first need to figure out the size of that operand to store it correctly. # Get the size of the second operand insn = ida_ua.insn_t() ida_ua.decode_insn(insn, ea) operand_size = ida_ua.get_dtype_size(insn.Op2.dtype) # Specify the correct data type if operand_size == 4: operand_bytes = struct.pack(" {dec_str}") # Set a comment with the decrypted string if dec_str and comment_addr != idc.BADADDR: set_comment(comment_addr, dec_str) Here is the list of decrypted strings: Expand to see more 45.12.253.56 45.12.253.72 45.12.253.98 45.12.253.75/dll.php mixinte mixtwo B USERPROFILE CCleaner VLC media player Acrobat Reader DC Russian admin Shah testBench taskmgr Taskmgr wireshark Process Hacker Wireshark C:\Program Files C:\ProgramData C:\Temp C:\Program Files C:\ProgramData C:\Temp /advertisting/plus.php?s= &str=mixtwo &substr= /default/stuk.php /default/puk.php NOSUB chk /chk test We can now see the C2 IPs, URL paths and some other interesting strings. Let’s keep going. Anti Checks (or is it..?) GCleaner is filled with host checks but weirdly enough it doesn’t do anything them, maybe they were like test features? copy-paste code? not really sure but let’s quickly go though them. Checking username Get the current username using "GetUserNameA()" and compare it to hardcoded names ("admin", "Shah", "testBench"). Checking foreground window Get the title of the foreground window using "GetWindowTextA()" and compare it to hardcoded strings. Checking desktop files Search for Desktop files with specific strings in their name ("CCleaner", "VLC media player", "Acrobat Reader DC"). Checking locale and keyboard layout Check if the computer locale is Russian and compare the keyboard layout against specific values (CIS countries). Dropped Binary Looking back at the process tree we need to figure out where does that child binary with random name comes from. "%APPDATA%\{846ee340-7039-11de-9d20-806e6f6e6963}\34LMAylZs6FixF.exe" We can see below that the sample reads the "%APPDATA%" path using "getenv()" then creates a random directory using the GUID of the current hardware profile, if retrieving the hardware profile failed it will fall back to generating a random folder name. Other possible locations for creating the random directory are "C:\Program Files", "C:\Temp", "C:\ProgramData" (fallback locations). Next it generates a random file name, appends ".exe" extension to it then drops it to the newly created directory and runs it from there. The binary file is hardcoded into the parent sample. All that binary child does is…well…sleep for 10 seconds, that’s it :| C2 Communications The actors behind GCleaner have been known to use BraZZZers fast flux service to hide their infrastructure, it works more like a proxy system between the victims and the real C2 server. Before reaching out to the C2 servers, GCleaner adds hardcoded HTTP headers (could be used for a network sig) an a custom user-agent to each C2 request. Now to figure out what each C2 request is responsible for. First C2 IP: 45[.]12.253.56 UA: OK PCAP: This C2 is likely responsible for bot registration. The sample will only continue execution if the server response is "0" or "1", otherwise it goes to sleep and tries again. The "str" and "substr" parameters in the C2 request above are possibly referring to the campaign ID, GCleaner has been known to use similar values in the past like "usone", "ustwo", "euthree", "cafive", "mixshop", … Second C2 IP: 45[.]12.253.72 UA: OK PCAP: The first request to this C2 is responsible for getting an AES key. The key length must be between 10 and 100 bytes, otherwise it breaks the execution. The second request is responsible for getting an AES encrypted PE file (notice the filename in the response headers!), That PE file is decrypted using the key from the previous request. The decryption routine is pretty trivial, the sample first calculates the SHA256 hash of the server key then derives the session key used for decryption (AES_128). After that it loads the decrypted PE file into memory (without touching disk) to get the address of an export function called "GetLicInfo" which is used in the next stage. Downloaded DLL Before going further we first need to take a look at the downloaded PE file. To be able to analyze it we can either use the debugger to dump the decrypted file or get the encrypted response from the PCAP and decrypt it manually. We can easily implement the decryption code in Python as follow: import hashlib from Crypto.Cipher import AES enc = open("puk.php.bin", "rb").read() key = "kvQoRqtcCyMtHmQyQXOUu".encode("utf-16le") # Important to encode!! sha256_hash = hashlib.sha256(key) aes_key = sha256_hash.digest()[:16] cipher = AES.new(aes_key, mode=AES.MODE_CBC, IV=b"\x00"*16) dec = cipher.decrypt(enc) open("out.bin", "wb").write(dec) Now let’s see what this export function "GetLicInfo" does. Basically it sends an http request to the supplied C2 server then checks the response length, if the length is greater than 2048 bytes it creates a a new directory with a random name under "%APPDATA%" or "%TEMP%" folder then generates a random filename and appends ".exe" extension to it. Finally it writes the server response to a disk file with the generated random filename and executes that file. Third C2 IP: 45[.]12.253.75 UA: B PCAP: This C2 is responsible for downloading further payloads, notice the user-agent used here is the one from the decrypted strings list unlike the previous 2 C2s. The address is supplied to the external function "GetLicInfo" which downloads and executes the payload as we stated above. GCleaner tries to get a payload from the server for 10 iterations with a sleep period of 2 seconds between every try. If no further payload is received from the server the samples kills its process and deletes the parent file from disk. Forth C2 IP: 45[.]12.253.98 This C2 wasn’t used in the sample we are looking at. Config Extraction We can use the IDA python script we used for string decryption to build a standalone config extractor as most of the interesting stuff are in the decrypted strings list. Here’s the output of the code after extracting the useful information: The code can be found here. (this script is not optimized for production, it’s just for research purposes) Hunting Urlscan The URL path of the first C2 request can be a good candidate to hunt for more C2s on urlscan. I looked at more samples and found these two URL patterns: s=NOSUB&str=...&substr=... sub=NOSUB&stream=...&substream=... So we can use the "page.url" field to search for the first part of these patterns. Yara We saw that many strings were encrypted but we can use some of the hardcoded ones to create a simple yara rule for hunting more samples. rule GCleaner { meta: description = "Detects GCleaner payload" author = "Abdallah Elshinbary (@_n1ghtw0lf)" hash1 = "020d370b51711b0814901d7cc32d8251affcc3506b9b4c15db659f3dbb6a2e6b" hash2 = "73ed1926e850a9a076a8078932e76e1ac5f109581996dd007f00681ae4024baa" strings: // Kill self $s1 = "\" & exit" ascii fullword $s2 = "\" /f & erase " ascii fullword $s3 = "/c taskkill /im \"" ascii fullword // Anti checks $s4 = " Far " ascii fullword $s5 = "roxifier" ascii fullword $s6 = "HTTP Analyzer" ascii fullword $s7 = "Wireshark" ascii fullword $s8 = "NetworkMiner" ascii fullword // HTTP headers $s9 = "Accept-Language: ru-RU,ru;q=0.9,en;q=0.8" ascii fullword $s10 = "Accept-Charset: iso-8859-1, utf-8, utf-16, *;q=0.1" ascii fullword $s11 = "Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0" ascii fullword $s12 = "Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1" ascii fullword condition: uint16(0) == 0x5a4d and 10 of them } References https://medium.com/csis-techblog/gcleaner-garbage-provider-since-2019-2708e7c87a8a https://medium.com/csis-techblog/inside-view-of-brazzzersff-infrastructure-89b9188fd145

Analysis Summary