There are numerous ways of concealing sensitive data and code within malicious files and programs. However, attackers use one particular obfuscation technique very frequently because it is simple to implement and offers protection that’s usually sufficient. This approach works like this:
The attacker picks a 1-byte value to act as the key. The possible key values range from 0 to 255 (in decimal).
The attacker’s code iterates through every byte of the data that needs to be encoded, XOR’ing each byte with the selected key.
To deobfuscate the protected string, the attacker’s code repeats step #2, this time XOR’ing each byte in the encoded string with the key value.
For example, consider themalicious Microsoft Word document World Uyghur Congress Invitation.doc, which was submitted to victims as an email attachment in a targeted attack. (To understand how this exploit works, see my earlier postsHow Malicious Code Can Run in Microsoft Office Documents andHow to Extract Flash Objects From Malicious MS Office Documents.)
To locate the executable file within the Word document, you can use Frank Boldewin’s OfficeMalScanner tool. The scan option directs the tool to look for the embedded malicious Office and Windows executable files. The brute option tells the tool to look for these artifacts even if they were obfuscated using several common methods, including the XOR technique described above.
In this example, OfficeMalScanner automatically locates and extracts the embedded Windows executable, saving it as the WUC Invitation Letter Guests__PEFILE__OFFSET=0xfc10__XOR-KEY=0x70.bin. (The tool automatically determined that the attacker used XOR key 0x70 to conceal this file.) According to PEiD (see screenshot below), the extracted file is a Win32 program that is not packed and that was probably compiled using Microsoft Visual C++.
The deobfuscated and extracted Windows executable file can be analyzed using any means, including your favorite disassembler and debugger, as well as using behavioral analysis techniques.
It’s quite possible that the extracted malicious executable also contains obfuscated data. Given that everyone, including malware authors, takes shortcuts once in a while, it’s possible that this data is protected using the simple XOR algorithm we discussed earlier. Didier Steven’s XORSeach tool can scan any file, looking for strings encoded using simple techniques, including this XOR method.
You need to know the clear-text version of the string you’d like XORSearch to locate. One good value to look for is http, because attackers often wish to conceal URLs within malicious code. Another good string, as suggested by Marfi, might be This program, because that might identify an embedded and XOR-encoded Windows executable, which typically has the string This program cannot be run in DOS mode in the DOS portion of the PE header.
As you can see below, XORSearch locates the string HTTP/1.1 apparently it was encoded using the key 1B. (Sometimes you get a false positive, as seems to be the case with the key 3B.)
When invoking XORSearch with the -s parameter, you direct the tool to attempt decoding all strings within the file using the discovered key. In our example, this results in the creation of the WUC Invitation Letter Guests__PEFILE__OFFSET=0xfc10__XOR-KEY=0x70.bin.XOR.1B file. If you look at this file using a hex editor, you can locate several decoded strings that you might use as the basis for custom signatures and further code-level analysis.
XOR and related methods are often used by attackers to obfuscate code and data. The tools above help you locate, decode and extract these concealed artifacts. If you have recommendations for other tools that can help with such tasks, please let us know by emailor leave a comment below.
— Lenny Zeltser
Lenny Zeltser focuses on safeguarding customers’ IT operations at NCR Corp. He also teaches how to analyze malware at SANS Institute. Lenny is active on Twitter and writes a security blog.
(c) SANS Internet Storm Center. http://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.