Bear’s election campaign – (3/4)

Now the fun begins. Using static analysis of the code we have extracted the two overtly malicious files: the one stored as “%APPDATA%\Skype\hqwhbr.lck” and the ADS PNG residing in the same directory, “%APPDATA%\Skype\hqwhbr.lck:schemas”.

The corresponding hashes I have for the files I carved out of the LNK are 57c627d68e156676d08bfc0829b94331 and cbf96820dc74a50a91b2b8b94376682a, which match the ones in the Volexity blog, so that is a good sign. As I noted before, if you decide to jump in via dynamic analysis to get the two files, you will not see the :schemas stream unless you list the directory in the command prompt using dir with the /R flag. Alternate Data Streams were created for compatibility with other OSes, but the only times I have come across them are in malware or forensics training.

Now that I have the files to look at, I usually start off by running FireEye’s excellent FLOSS tool, which extracts the basic strings but also decodes strings that were obfuscated with simple routines. Running the tool with the -i flag also writes out a Python script that you can run in IDA to show the decoded strings. There is also support for radare2 if you are using that. Looking at the output from FLOSS we see some interesting strings which lead us to believe that the backdoor is using HTTP/HTTPS for command and control or exfil. Not a big shocker; I’m not aware of much malware these days that doesn’t fly right past your egress firewall rules by using ports 80/443.

For the sake of brevity, I will just paste the strings I found of interest. Among the strings decoded by the function located at 0x402937 are browser strings, and some interesting ones which could potentially be related to the TEA algorithm which we suspect the PNG is encrypted with. We also see what I was hoping to find, “:schemas”, referenced by this function, which will hopefully save us some time in locating the decoding routine for the PNG. Of note, we also see some registry-related paths and a reference to IAStorIcon, which Volexity mentioned is the value name under which it achieves persistence in the registry. Three cheers to the girls and guys on the FLARE team for creating this tool and open sourcing it to the community.


λ C:\Tools\floss-1.3.0\floss.exe C:\Users\thomas\Desktop\hqwhbr.lck -g -i floss.py

FLOSS decoded 48 strings

Decoding function at 0x402937 (decoded 48 strings)

Mozilla\Firefox\Profile
user_pref("network.proxy.http_port",
Use HTTP=
HTTP server=
LoadLibraryA
VirtualAlloc
VirtualFree
GetProcessHeap
HeapAlloc
SUVW1
_^][
default
prefs.js
Opera\Opera
operaprefs.ini
VWS1
u#j@h
\IAStorIcon
SOFTWARE\\Microsoft
note_window_management_procedure
accepted-
1.1.1.1
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
:schemas

SOFTWARE\Microsoft\Windows\CurrentVersion\Run
rundll32 "%s", #2

FLOSS extracted 4 stackstrings
VAadvapi32.dll
winhttp.dll
shell32.dll
AAA:
Wrote IDAPython script file to C:\Users\thomas\floss.py

Loading the .lck DLL into IDA and then running the floss.py script against it works, with just a few stack string errors which we won’t worry about. We can see from the FLOSS output that the tool has identified 0x402937 as the location of the decoding routine for the strings. Pulling up the graph for the function, we can see what is going on with the XOR loop.
[Image: strings_decode]

The corresponding C pseudocode for this XOR routine from the Hex-Rays decompiler:

char __stdcall sub_402937(int a1, int a2)
{
  _BYTE *v2; // esi@1
  int v3; // edi@1
  int v4; // ecx@1
  char result; // al@2

  v2 = (_BYTE *)(a2 - 1);
  v3 = a2 - 1;
  v4 = a2 - a1;
  do
  {
    result = *(_BYTE *)(v3 - 2) ^ *(_BYTE *)(v3 - 1) ^ *v2;
    *v2-- = result;
    --v3;
    --v4;
  }
  while ( v4 );
  return result;
}

Alrighty, so we have determined how FLOSS got the decoded strings, but we still need to apply the decoding routine to the whole encoded block rather than guessing at functionality from a few decoded strings. Who knows what else is in there if we don’t decode it? Now that we have figured out the XOR decoding routine, we can look at the xrefs to see what calls this function and what arguments are passed to it. From the pseudocode we can tell that we are looking for two arguments (int a1, int a2), which are most likely the start and end of the section to decode.
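To make the pseudocode easier to reason about, here is a rough Python port of it, treating a1 and a2 as start and end indexes into a buffer. The function names are made up for illustration, and xor_encode is a hypothetical inverse written only to check the port with a round trip; it is not taken from the sample.

```python
def xor_decode(buf, start, end):
    """Decode buf[start:end] in place, last byte first, like sub_402937.

    start and end play the roles of a1 and a2 in the pseudocode.
    """
    if start < 2:
        raise ValueError("the routine reads two bytes before start")
    for i in range(end - 1, start - 1, -1):
        # Both neighbors are still encoded here because we walk backward.
        buf[i] ^= buf[i - 2] ^ buf[i - 1]
    return buf

def xor_encode(buf, start, end):
    """Hypothetical inverse: encode buf[start:end] in place, first byte first."""
    if start < 2:
        raise ValueError("the routine reads two bytes before start")
    for i in range(start, end):
        # Both neighbors are already encoded (or are the two seed bytes).
        buf[i] ^= buf[i - 2] ^ buf[i - 1]
    return buf

# Round trip on made-up data: two seed bytes followed by a payload.
sample = bytearray(b"\x11\x22" + b"hello :schemas")
original = bytes(sample)
xor_encode(sample, 2, len(sample))
xor_decode(sample, 2, len(sample))
assert bytes(sample) == original
```

One thing the port makes visible: the byte at start is decoded using the two bytes that sit just before start. Those two bytes are never modified, but they are needed, so a carve that begins exactly at start cannot recover its own first two bytes.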

Right-clicking on our function name and clicking on “xrefs to XOR_routine”, we can see that two other functions call it. Looking at the cldsys_2 function, we can see two arguments being pushed onto the stack to be passed to the routine.

[Image: arguments]

As we predicted, two memory addresses are passed to the XOR routine, and the do loop counts down, with the total loop count being the difference between the two addresses. So we can expect the loop to run 0x401DA3 - 0x40146A = 0x939 times, which is the length of the encoded section. Next I need to carve out that section so I can decode it with a custom Python script. Something that stumped me for a minute, and that a coworker helped me with, was finding that location in a hex editor. Remember the .lck file is a DLL, so IDA shows it at the virtual addresses it would be loaded at, whereas a hex editor shows raw file offsets. That said, the distance between the start and end is the same in both views; only the base differs. As a quick and easy workaround I did the following. (Note: there has to be an easy way to do this inside IDA, but I don’t know how, and the hex editor native to IDA is horrible…) Inside IDA, go to the first memory location (0x40146A) and take a look at all the encoded glory.

[Image: encoded]

Now I am going to copy the first row, which should be enough to give me a unique byte string to search for inside my hex editor of choice (my personal favorite is 010).

Inside IDA, with the file loaded as a DLL, at 0x40146A…
[Image: encoded]

After searching for the corresponding bytes inside 010, we find the same location at file offset 0x86A.
[Image: encoded2]

So now we know that the difference between the address in IDA and the offset in the file is 0x400C00 (0x40146A - 0x86A), but we just want to carve the span between the start and stop, whose length doesn’t vary between the programs. Inside 010 we do a “select range” (start: 0x86A, size: 0x939) and save it to a new file named encoded.bin.
[Image: b4]
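One possible way to skip the search in the hex editor is to read the section table out of the PE headers and convert the address directly. The sketch below is only that, a sketch: va_to_file_offset and carve are hypothetical helper names, and it assumes IDA loaded the DLL at the image base recorded in its header. A delta of 0x400C00 would be consistent with an image base of 0x400000 and a first section at RVA 0x1000 stored at file offset 0x400, but that is a guess, not something checked against the sample.

```python
import struct

def va_to_file_offset(pe, va):
    """Map a virtual address to a raw file offset using the PE section table."""
    if pe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    e_lfanew, = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = e_lfanew + 4                      # COFF file header (20 bytes)
    num_sections, = struct.unpack_from("<H", pe, coff + 2)
    opt_size, = struct.unpack_from("<H", pe, coff + 16)
    opt = coff + 20                          # optional header
    magic, = struct.unpack_from("<H", pe, opt)
    if magic == 0x10B:                       # PE32
        image_base, = struct.unpack_from("<I", pe, opt + 28)
    elif magic == 0x20B:                     # PE32+
        image_base, = struct.unpack_from("<Q", pe, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    rva = va - image_base
    table = opt + opt_size                   # section table, 40 bytes per entry
    for i in range(num_sections):
        entry = table + 40 * i
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", pe, entry + 8)
        if vaddr <= rva < vaddr + raw_size:
            return rva - vaddr + raw_ptr
    raise ValueError("address is not backed by file data")

def carve(pe, start_va, end_va):
    """Return the file bytes that back the range [start_va, end_va)."""
    offset = va_to_file_offset(pe, start_va)
    return pe[offset:offset + (end_va - start_va)]

# Hypothetical usage with the addresses from the text:
#   with open("hqwhbr.lck", "rb") as f:
#       data = f.read()
#   with open("encoded.bin", "wb") as f:
#       f.write(carve(data, 0x40146A, 0x401DA3))
```

Since the routine reads the two bytes before the start address, carving from two bytes earlier (start_va - 2) would also capture the bytes needed to decode the very beginning of the block.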

Now it’s time to make our Python utility to decode the encoded.bin file. We are at a bit of an advantage since we already know what the C pseudocode looks like for the routine. According to the pseudocode, each byte, working backward from the end of the buffer, is XORed first with the byte two positions to its left and then with the byte one position to its left. I came up with the following Python script, which decodes the encoded.bin file.

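#!/usr/bin/env python3
# Decode a carved block that was encoded with the XOR routine at 0x402937.
import sys

def main(encrypted):
    with open(encrypted, 'rb') as f:
        b = bytearray(f.read())
    r = len(b) - 1
    print(r)

    # Walk backward so b[r-1] and b[r-2] are still encoded when they are used.
    # The first two bytes are left alone: the bytes to their left lie outside
    # the carved file.
    while r >= 2:
        b[r] ^= b[r-2]
        b[r] ^= b[r-1]
        r -= 1
    with open('output.bin', 'wb') as f:
        f.write(b)
    print("Check your directory for the output.bin file")

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Enter the encrypted filename as your argument.")
        sys.exit(1)
    main(sys.argv[1])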

Great success!

[Image: after]
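A quick sanity check on output.bin is to pull the printable strings back out of it and look for ones FLOSS reported earlier. The helper below is a minimal sketch with a made-up name; whether a particular string such as :schemas lives in this block or in the one decoded by the other caller is not something the output above tells us, so treat a miss as a prompt to look further rather than as a failure.

```python
import re

def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Hypothetical usage against the decoded file:
#   with open("output.bin", "rb") as f:
#       found = ascii_strings(f.read())
#   for wanted in (":schemas", "winhttp.dll", "prefs.js"):
#       print(wanted, any(wanted in s for s in found))
```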

I think this article went on long enough so I will call it a day.
