Or: how to avoid the Assembly Line Syndrome
Recently, I’ve heard several security experts talk about the effectiveness of automated web application scanners. Specifically, they claim that automated scanners are only good for finding:
- "Low Hanging Fruit" vulnerabilities
- "Technical vulnerabilities"
They all say that automated scanners cannot handle "logical vulnerabilities". I thought this would be a good time and place to explain the difference between these types of vulnerabilities, and to explain why I think every healthy security review of an application should include both automated and manual assessments.
Let’s start by explaining what "Technical Vulnerabilities" are. For that, I will quote Jeremiah Grossman's article "Technology alone cannot defeat Web application attacks: Understanding technical vs. logical vulnerabilities":
Web application vulnerability scanners depend on the relative predictability of Web sites to identify security issues. Using a loose set of rules, scanners function by simulating Web attacks and analyzing the responses for telltale signs of weakness. From experience, we know how a Web site will normally react when there is a security issue present. We know that if sending a Web site certain meta-characters produces a database ODBC error message, a SQL Injection issue has likely been detected.
So, let’s summarize what Jeremiah is saying:
Technical vulnerabilities are security issues in web applications that can be detected with an algorithm: by sending certain HTTP requests, observing the HTTP responses, and deciding whether the issue is real based on prior knowledge of proper and improper application behavior.
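To illustrate the idea, here is a minimal sketch of such an algorithmic check, in the spirit of the ODBC-error example from the quote above. The target URL, parameter names and error signatures are just illustrative assumptions, and any real scanner is far more elaborate than this:

```python
# A minimal sketch of a "technical vulnerability" check: inject a meta-character
# and look for telltale database error messages in the response.
import requests

SQL_ERROR_SIGNATURES = [
    "ODBC", "SQL syntax", "ORA-01756", "unclosed quotation mark",
]

def looks_like_sql_injection(url, param, params):
    """Send a single-quote payload in one parameter and look for telltale DB errors."""
    tampered = dict(params)
    tampered[param] = params[param] + "'"          # the classic meta-character probe
    response = requests.get(url, params=tampered, timeout=10)
    return any(sig.lower() in response.text.lower() for sig in SQL_ERROR_SIGNATURES)

# Hypothetical usage:
# looks_like_sql_injection("http://example/order.asp", "item", {"item": "50"})
```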
Let’s see how Jeremiah explains what "Logical Vulnerabilities" are:
Consider the following example. If we visit a Web site and are presented with the following URL: http://example/order.asp?item=50&price=300.00 Can we guess what the application order.asp combined with the parameters item and price do? Using intelligence unique to humans, we can quickly deduce their purpose with relative certainty. This is a product ordering application. The item parameter is the particular product we are interested in. In our case, let's say an iPod. The price parameter is the amount we are going to pay for our portable music player. What happens if we changed the price of 300.00 to 100.00? Or 1.00? Does the Web site still sell us the iPod? If so, we can easily understand that the Web site should not have allowed the price alteration. As humans, we possess a natural ability to assess context, and we aptly refer to these types of issues as "logical vulnerabilities," issues that only humans can identify.
Ok, let’s summarize again:
Logical vulnerabilities are security issues in web applications that require human deduction and the ability to assess things in their proper context in order to be detected.
Just as a side note: while I agree with Jeremiah’s separation (and loose definition) of Technical vs. Logical vulnerabilities, I can argue with his last example – in my opinion (and I base this on real-world examples), most shopping cart "hidden" price manipulations belong under Technical vulnerabilities and can be automated with a good web application scanner.
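Since I claim this can be automated, here is a rough sketch of what such a check might look like, using the order.asp example from the quote above. The endpoint, parameter names and the "success" heuristic are all assumptions made for illustration:

```python
# A sketch of automating the order.asp price-manipulation check.
import requests

def price_is_trusted_from_client(base_url):
    """Replay the order with a lowered price and see whether the site accepts it."""
    original = {"item": "50", "price": "300.00"}
    tampered = {"item": "50", "price": "1.00"}     # the "human" test, performed by code

    baseline = requests.get(base_url + "/order.asp", params=original, timeout=10)
    attack = requests.get(base_url + "/order.asp", params=tampered, timeout=10)

    # Hypothetical heuristic: the tampered request succeeds just like the original
    # one, and the confirmation page echoes the price we made up.
    return attack.status_code == baseline.status_code and "1.00" in attack.text
```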
So, we have only one thing to define, and that is "Low Hanging Fruit". Again, Jeremiah did quite a good job in one of his blog posts, when he said:
A common approach to vulnerability assessment (VA) is going after the so-called "low-hanging fruit" (LHF). The idea is to remove the easy stuff making break-ins more challenging without investing a lot of work and expense.
So, Low Hanging Fruit vulnerabilities are "easy-to-find" vulnerabilities. Does this mean they are Technical Vulnerabilities? Logical Vulnerabilities? Maybe both? This is a bit unclear, but since we are trying to clear things up, let's stick to the verbatim explanation – Low Hanging Fruit vulnerabilities are security issues that are easy to spot.
So now that we have the definitions in place, we can start talking business. The common claim is that web application scanners can only find Technical Vulnerabilities. Is this good or bad? Let's take a look at a quick list of Technical vulnerabilities I compiled in a minute:
- Shell command execution (Perl pipe)
- HTTP PUT site defacement
- Backup files that were left behind
- Blind SQL Injection
- SQL Injection
- Xpath Injection
- LDAP Injection
- Directory Listing
- Path Traversal in parameters
- Insecure HTTP methods
- SSI in parameters
- Phishing using URL redirection
- XSS
- Path Traversal in URL
- All known issues in Web Servers (IIS, Apache, etc.), Application Servers and other 3rd party products
- Some cookie tampering vulnerabilities
- Format String vulnerabilities
- Buffer Overflows
- Information leakage (error messages)
- Administration pages access
- Source code disclosure vulnerabilities
- Some types of Privilege Escalation
- Some types of Session Fixation
- Injections in SOAP Web Services messages
- HTTP Response Splitting
- Shopping cart price manipulations
- Poison Null Byte vulnerabilities
- Some PHP Remote File Inclusions
- DOM-based XSS
- Insecure Indexing
- Information leakage in HTML comments
- etc...
As you can see, the list of Technical Vulnerabilities that can be fully (or partially) automated is pretty long, and I really only spent a few minutes thinking about it... I am sure there are plenty more items to add.
Now, I am going to prove a simple point –
I browsed my personal banking web application for approximately 10 minutes, trying to perform some trivial banking actions such as viewing my account details, viewing my account balance, etc. During this time, my proxy collected 158 different URLs and spotted 322 different parameter names.
Let's do some simple math:
- Checking for XSS takes maybe 2-3 requests until you get it right
- Checking for SQL Injection takes approx. 3 requests
- Checking for Blind SQL Injection takes approx. 3 requests
- Checking for Path Traversal in parameters may take somewhere between 1-10 requests
Assuming it takes a person with an HTTP proxy about 10 seconds to compose or manipulate an HTTP request, and assuming you don’t care about the order/context of the parameters in the URL, it will take you:
- 3 Requests * 10 Seconds * 322 parameters = 2.6 hours to detect XSS
- 3 Requests * 10 Seconds * 322 parameters = 2.6 hours to detect SQL Injections
- 3 Requests * 10 Seconds * 322 parameters = 2.6 hours to detect BSQLi
- 1 Request (minimum) * 10 Seconds * 322 parameters = ~53 minutes to detect Path Traversal in parameters
This means it would take a person approximately 9 hours (give or take) to test all parameters for XSS, (B)SQLi, and some parameter Path Traversals - and that's only for a part of an application that took 10 minutes to browse. Sounds like fun, and would probably give you severe Assembly-Line Syndrome (or at least Carpal Tunnel Syndrome)... and I haven't even talked about the rest of the parameter tampering tests, common vulnerabilities, file checks, cookie tampering checks, etc. that can be automated.
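For the skeptics, here is the same back-of-the-envelope calculation as a few lines of Python, using the numbers assumed above:

```python
# 322 parameters, ~10 seconds per hand-crafted request, and the per-check
# request counts from the list above.
PARAMS = 322
SECONDS_PER_REQUEST = 10
requests_per_check = {"XSS": 3, "SQL Injection": 3, "Blind SQL Injection": 3, "Path Traversal": 1}

total_seconds = sum(n * SECONDS_PER_REQUEST * PARAMS for n in requests_per_check.values())
print(f"{total_seconds / 3600:.1f} hours of manual request crafting")  # prints "8.9 hours ..."
```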
One might argue that my calculations are biased (of course they are, I am trying to make a point), that you don’t have to test each and every parameter in each and every URL, and that a human would probably understand the context in which the parameter is located, its name and value, and where it appears in the response, and would probably need to send fewer requests. That’s absolutely true, I don’t deny it, but you could also sit and put caps on spray bottles in a factory and do a better job than a machine would, but who wants to do that, right? It is time and resource consuming (as well as boring!)
To sum things up, I want to present a few thoughts I have on the subject of Automated vs. Manual testing:
- It is very hard to calculate the ratio between what automated scanners can find and what people can find – statements such as “Automated scanners can only find X%” are unsubstantiated. While we can count how many vulnerabilities an automated scanner can find, this number only represents the current moment and does not take future technological advancements into consideration. In addition, you can’t really count how many “Logical Vulnerabilities” a person can find either – it depends greatly on that person’s knowledge and competency in webappsec, which varies greatly between people.
- Breaking into an application only requires a single vulnerability, and hackers who do this for money will most likely want to spend as few resources on it as possible. This screams “Low Hanging Fruit” to me. I believe that security is all about layers. The more layers you add, the lower your chances of getting hacked. This means that reducing the LHF will repel hackers who are looking for quick/easy money. This doesn't mean that you shouldn't discover and fix complex vulnerabilities, it just means that you should prioritize your efforts.
- You should check out the WASC Web Application Hacking Incidents Database to see how many real-world sites got hacked through Low Hanging Fruit vulnerabilities.
- Automated scanners are consistent, both in knowledge and in the way they scan. You can reproduce vulnerabilities and testing techniques in order to make sure that things are fixed. Not all humans hold the same knowledge, and not all of them test applications in the same way or at the same level.
- Those who downplay the importance of automated scanners in a web security assessment might just as well start to downplay NetCat, or HTTP proxies. They also have shortcomings and they won't solve the problem on their own, right? I wonder if we'll ever see/hear of Anti-NetCat evangelists :-)
I guess what I am trying to say is that I don’t think people should use the term “Automated vs. Manual” in the context of web application assessments. Securing web applications is not about “Man vs. Machine”, it’s more about “Man and Machine”, working together (I sound like something out of the Matrix movie).
Every healthy security assessment process should use a balanced mix of automated penetration testing tools and humans. Each side brings its own expertise to the table, and together they complement each other.
Let's break down the logical examples more clearly. They fall into a few categories:
1. What scanners can't do today, but *should* do tomorrow:
Examples of this include authentication issues or obvious authorization issues (binary decisions about functionality access).
2. What scanners can't do today, and will have a hard time ever doing accurately:
Examples of this are subtle authorization issues, like SQL LIKE queries sitting on top of a loose authorization model. Without a threat model/mis-use case described in the scanner (which NO ONE on the planet does, outside of (some) professional assessment shops like the one I work for)... without either human eyeball context ("Oh, that's Bad!") or a pre-defined scanner rule or goal ("Rob can't query Sally's data") these will never be found.
3. Things scanners will never find:
Examples include weak secrets used in a password reset flow that involves email or out-of-band communications.
---
You are right: the emphasis on Getting The Job Done should be Man Plus Machine. I definitely agree with you there, and I bet Jeremiah does too.
However, there's still a bit of crappy hype and automation fantasy out there, and we need to objectively and clearly qualify and measure what is what today, what the goals for tomorrow are, and what bar we hit, to make this "art" the science it should be.
Nice blog, btw//
-ae
Posted by: Arian Evans | May 30, 2007 at 11:37 PM
Arian,
Your examples are a bit high level, and it is hard to understand what specific vulnerabilities you are referring to.
Let me ask you a question -
Do you think that tampering with a user ID value in a cookie (e.g. decrementing the value) and accessing someone else's account details can be automated or not?
From what I've heard and read, most people will say that this is too hard to automate (i.e. "how would a piece of software figure out that it switched context", right?)
I believe that such a scenario can actually be automated, given a good enough algorithm.
In general, I think that some of the tasks that you and Jeremiah refer to as hard or impossible can actually be automated, and I believe that people should use the term "Halting Problem" in this context a bit more carefully.
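To make my point a bit more concrete, here is a rough sketch of how such a check could be automated. The cookie name, URL and detection heuristics are made up for illustration, and a real implementation would obviously need to be much smarter about comparing responses:

```python
# A sketch of automating the "decrement the user ID cookie" test.
import requests

def user_id_tampering_suspected(account_url, session_cookies, id_cookie="userid"):
    """Decrement a numeric user ID cookie and check whether we land in someone else's account."""
    baseline = requests.get(account_url, cookies=session_cookies, timeout=10)

    tampered_cookies = dict(session_cookies)
    tampered_cookies[id_cookie] = str(int(session_cookies[id_cookie]) - 1)
    tampered = requests.get(account_url, cookies=tampered_cookies, timeout=10)

    # Heuristic "context switch" detection: we are still served a valid account page
    # (no redirect to a login page, same status code), yet its content differs
    # materially from our own account page - probably someone else's data.
    still_authenticated = tampered.status_code == 200 and "login" not in tampered.url.lower()
    looks_like_another_account = tampered.text != baseline.text
    return still_authenticated and looks_like_another_account
```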
Posted by: AppSecInsider | May 31, 2007 at 01:05 PM
Example #1 is high level. The others are very specific, which is why I gave specific examples:
Your cookie tampering sometimes falls into #1, and other times into #2. It depends entirely upon the cookie and the site (unless you want to generate a ton of false positives).
1. Binary authorization decisions: should auth state A have access to /admin/manageusers.asp?
Yes|No. Scanners could and should do this. Today they do it somewhere between poorly and not at all. A few try, but I have yet to see it done well. Soon, I suspect. We're pretty good at this sort of thing.
Is authentication state, or authorization access, different between cookie 1001 and 1002 in a meaningful manner?
That can be automated, I suppose. But that isn't the most commonly seen authorization problem today, IMO (circa 2007).
2. Authorization decisions: cookie Arian and cookie Ory. Can I swap them inside a valid session and bypass authorization checks? Yes, this can be automated, if you describe the problem up front. No, scanners are never going to automate that in click-scan-mode or while using only one set of user-creds. If the scanner is using multi-user auth, in some cases, yes.
In the case of functionality that sits on top of a weak (or non-existent) authorization model, that can be abused within the boundaries of that functionality: I gave the example of SQL LIKE queries that can be manipulated to return other users' data. That exists out in the real world. We find it. Scanners do not, and will never be highly accurate at this.
Unless you have a pre-defined mis-use case. Which I don't think anyone is ever going to give scanners.
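To make the pattern concrete, here is a hypothetical sketch of the kind of code I mean (the table, columns and query are made up for illustration):

```python
# A hypothetical "SQL LIKE on top of a loose authorization model" pattern.
import sqlite3

def find_invoices(db: sqlite3.Connection, reference_prefix: str):
    # The application intends users to look up their own invoices by reference
    # number, but the only "authorization" here is the LIKE pattern itself.
    return db.execute(
        "SELECT owner, amount FROM invoices WHERE reference LIKE ?",
        (reference_prefix + "%",),
    ).fetchall()

# Supplying "%" (or "_") as the prefix is perfectly valid input as far as a scanner
# can tell, yet it matches every row - other users' data included. Without a rule
# like "Rob can't query Sally's data", there is nothing anomalous for a tool to flag.
```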
3. Never going to find category: Again, I was very specific: if I can break your authentication by resetting passwords to your accounts through your password reset functionality, and that involves weak secrets used in the decision making:
1. Scanners are never going to be useful at identifying weak secrets that are non-integer.
2. If there is OoB (out of band) decisions made, like in the case of a password reset email, today's scanners will never collect that data set to analyze.
This is not about having a stance to defend. It's reality. I mean: we had one of our mutual competitors give a customer of mine the green light, full bill of health, all good, with their website. No Vulns, No Issues.
We were able to take over accounts through one vuln, and access GLBA-restricted data for all users through another weakness (this one involved binary image data, another thing scanners are almost useless at analyzing).
The site was fundamentally broken in multiple ways that no one is going to algorithmically solve any time soon, if at all.
I'll eat those words when proven wrong, no problem with humble pie here. I just don't see it happening.
I do agree that the Turing/Halting problem is thrown around way too much when discussing this.
I myself like to mention Rice's theorem as an even better example of the problem.
We need better stats and analysis of common vulns that "can be found", right? Then we could have this discussion in a more meaningful way, and mock up proof cases.
Right now it's sort of folks like me saying "you can't find that" and others say "yes I can" or "I don't know what you are talking about, do you mean: [insert trivial scanner automation problem]?".
I think it would be more productive to define and measure.
Posted by: Arian | June 01, 2007 at 02:40 AM
Hi Arian,
This is what I call a discussion :-)
1. Binary authorization decisions - your example is covered by AppScan's Privilege Escalation testing capabilities (check out my whitepaper on the subject at: https://www.watchfire.com/securearea/whitepapers.aspx?id=24 ). A generic sketch of this kind of multi-user check appears right after this list.
2. I don’t see a problem with automating a test that swaps cookies during a session to attempt to bypass authorization mechanisms. Although I do find this example a bit problematic – if I am switching session cookies, what did you expect would happen? (Of course you'll switch sessions, that's what session cookies are all about, no?) Unless of course the session cookie and the identity token (either cookie or parameter) are not the same one (in which case, like I said, automation is possible to some extent).
3. Regarding the SQL LIKE queries, I would love to see some real-world examples. That would definitely help to figure out why you think it is a tough one.
4. Checking for weak password reset forms can be automated to some extent; for example, you can check if the form requires the old password, or you can check whether access to that form requires you to be logged in, etc. – obviously not all cases are covered.
5. Regarding OOB responses – obviously, not all scenarios can be covered, but if the scanning software includes a useful extensions framework or an SDK (like AppScan does :-)), you can attempt to automate some of the peculiar scenarios, such as receiving one-time passwords for login via a mobile phone connection, or picking up information from emails, etc. – again, this is far from being perfect or complete, but it might increase the things you can do with the software.
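To illustrate the multi-user idea behind points 1 and 2, here is a generic sketch (this is not AppScan's actual implementation; the URLs, cookie jars and similarity heuristic are assumptions for illustration only):

```python
# A generic multi-user privilege escalation check: replay pages discovered with a
# privileged session using a low-privileged one, and flag pages served unchanged.
import requests

def escalation_candidates(privileged_urls, admin_cookies, user_cookies):
    findings = []
    for url in privileged_urls:
        admin_page = requests.get(url, cookies=admin_cookies, timeout=10)
        user_page = requests.get(url, cookies=user_cookies, timeout=10)

        # Binary decision: if the low-privileged user receives essentially the same
        # page (200 OK for both, roughly the same body size), flag the URL for review.
        both_ok = admin_page.status_code == 200 and user_page.status_code == 200
        similar_body = abs(len(user_page.text) - len(admin_page.text)) < 50  # crude heuristic
        if both_ok and similar_body:
            findings.append(url)
    return findings
```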
Anyway, for the most part, I agree with you. There are things that scanners don’t do today, and they should be doing (we are working hard on improving all the time), there are things that will be hard to automate in the future, but I am sure we’ll get there eventually. And, there are things that require a human to figure them out – no argument here.
Thanks,
-Ory
Posted by: AppSecInsider | June 01, 2007 at 05:47 PM
05Jun2007 (UTC +8)
Cool topic. "Faster, Better, Cheaper" is what I've adopted too.
lehitraot,
--Drexx
Posted by: Drexx Laggui | June 05, 2007 at 07:28 AM