Or: how to avoid the Assembly Line Syndrome
Recently, I’ve heard several security experts talk about the effectiveness of automated web application scanners. Specifically, they claim that automated scanners are only good at finding:
- "Low Hanging Fruits" vulnerabilities
- "Technical vulnerabilities"
They all say that automated scanners cannot handle "logical vulnerabilities". This seems like a good time and place to explain the difference between these types of vulnerabilities, and why I think every healthy security review of an application should include both automated and manual assessments.
Let’s start by explaining what "Technical Vulnerabilities" are. For that, I will quote Jeremiah Grossman, from his article "Technology alone cannot defeat Web application attacks: Understanding technical vs. logical vulnerabilities":
Web application vulnerability scanners depend on the relative predictability of Web sites to identify security issues. Using a loose set of rules, scanners function by simulating Web attacks and analyzing the responses for telltale signs of weakness. From experience, we know how a Web site will normally react when there is a security issue present. We know that if sending a Web site certain meta-characters produces a database ODBC error message, a SQL Injection issue has likely been detected
So, let’s summarize what Jeremiah is saying:
Technical vulnerabilities are security issues in web applications that can be detected algorithmically: by sending certain HTTP requests, observing the HTTP responses, and deciding whether the issue is real based on prior knowledge of proper and improper application behavior.
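To make this concrete, here is a minimal sketch of such a detection routine. The payloads and error signatures are illustrative assumptions, not any real scanner’s rule set; a real scanner would wrap this decision logic in request-sending and crawling machinery:

```python
# Minimal sketch of a "technical vulnerability" check: a scanner injects a
# meta-character into a parameter, then scans the HTTP response for telltale
# database error strings. The signatures below are illustrative examples only.
ERROR_SIGNATURES = [
    "ODBC",                      # classic ODBC driver error fragment
    "SQL syntax",                # MySQL-style error fragment
    "unclosed quotation mark",   # MSSQL-style error fragment
]

def looks_like_sql_injection(response_body: str) -> bool:
    """Decide, from the HTTP response alone, whether the injected
    meta-character likely triggered a database error."""
    body = response_body.lower()
    return any(sig.lower() in body for sig in ERROR_SIGNATURES)
```

The key property is that the decision needs no human judgment: it is a mechanical comparison of the response against known improper-behavior patterns.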
Let’s see how Jeremiah explains what "Logical Vulnerabilities" are:
Consider the following example. If we visit a Web site and are presented with the following URL: http://example/order.asp?item=50&price=300.00
Can we guess what the application order.asp combined with the parameters item and price do? Using intelligence unique to humans, we can quickly deduce their purpose with relative certainty. This is a product ordering application. The item parameter is the particular product we are interested in. In our case, let's say an iPod. The price parameter is the amount we are going to pay for our portable music player. What happens if we changed the price of 300.00 to 100.00? Or 1.00? Does the Web site still sell us the iPod? If so, we can easily understand that the Web site should not have allowed the price alteration. As humans, we possess a natural ability to assess context, and we aptly refer to these types of issues as "logical vulnerabilities," issues that only humans can identify.
Ok, let’s summarize again:
Logical vulnerabilities are security issues in web applications that require human deduction, and the ability to assess things in their proper context, in order to be detected.
Just as a side note: while I agree with Jeremiah’s separation (and loose definitions) of technical vs. logical vulnerabilities, I would argue with his last example. In my opinion (based on real-world examples), most shopping cart "hidden" price manipulations belong under technical vulnerabilities, and can be automated by a good web application scanner.
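For example, a scanner could mechanically generate tampered variants of the order URL from Jeremiah’s article and replay them, flagging the issue if the order still succeeds. The sketch below assumes the price parameter is literally named `price`; that name, and the chosen tamper factors, are assumptions for illustration:

```python
# Hypothetical sketch of automating the "hidden price" check: given an
# observed order URL, emit variants with lowered prices. A scanner would
# replay each variant and flag a vulnerability if the order goes through.
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def price_tamper_variants(url, factors=(0.5, 0.01)):
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    variants = []
    if "price" in params:  # parameter name is an assumption for this sketch
        original = float(params["price"])
        for factor in factors:
            tampered = dict(params, price=f"{original * factor:.2f}")
            variants.append(urlunsplit(parts._replace(query=urlencode(tampered))))
    return variants
```

Running this on `http://example/order.asp?item=50&price=300.00` yields the same URL with `price=150.00` and `price=3.00`, which is exactly the manipulation Jeremiah performs by hand.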
So, we have only one thing left to define, and that is "Low Hanging Fruit". Again, Jeremiah did quite a good job in one of his blog posts, when he said:
A common approach to vulnerability assessment (VA) is going after the so-called "low-hanging fruit" (LHF). The idea is to remove the easy stuff making break-ins more challenging without investing a lot of work and expense
So, low hanging fruit vulnerabilities are "easy-to-find" vulnerabilities. Does this mean they are technical vulnerabilities? Logical vulnerabilities? Maybe both? This is a bit unclear, but since we are trying to clear things up, let’s stick to the literal reading: low hanging fruit vulnerabilities are security issues that are easy to spot.
Now that we have the definitions in place, we can start talking business. The common claim is that web application scanners can only find technical vulnerabilities. Is this good or bad? Let’s take a look at a quick list of technical vulnerabilities I’ve compiled in a minute:
- Shell command execution (Perl pipe)
- HTTP PUT site defacement
- Backup files that were left behind
- Blind SQL Injection
- SQL Injection
- Xpath Injection
- LDAP Injection
- Directory Listing
- Path Traversal in parameters
- Insecure HTTP methods
- SSI in parameters
- Phishing using URL redirection
- Path Traversal in URL
- All known issues in web servers (IIS, Apache, etc.), application servers and other 3rd-party products
- Some cookie tampering vulnerabilities
- Format String vulnerabilities
- Buffer Overflows
- Information leakage (error messages)
- Administration pages access
- Source code disclosure vulnerabilities
- Some types of Privilege Escalation
- Some types of Session Fixation
- Injections in SOAP Web Services messages
- HTTP Response Splitting
- Shopping cart price manipulations
- Poison Null Byte vulnerabilities
- Some PHP Remote File Inclusions
- DOM-based XSS
- Insecure Indexing
- Information leakage in HTML comments
As you can see, the list of technical vulnerabilities that can be fully (or partially) automated is pretty long, and I really only spent a few minutes thinking about it. I am sure there are plenty more items to add.
Now, I am going to prove a simple point –
I’ve browsed my personal banking web application for approximately 10 minutes, performing some trivial banking actions such as viewing my account details, viewing my account balance, and so on. During this time, my proxy collected 158 different URLs and spotted 322 different parameter names.
Let's do some simple math:
- Checking for XSS takes maybe 2-3 requests until you get it right
- Checking for SQL Injection takes approx. 3 requests
- Checking for Blind SQL Injection takes approx. 3 requests
- Checking for Path Traversal in parameters may take somewhere between 1-10 requests
Assuming it takes a person with an HTTP proxy about 10 seconds to compose or manipulate an HTTP request, and assuming you don’t care about the order/context of the parameters in the URL, it will take you:
- 3 requests * 10 seconds * 322 parameters = ~2.7 hours to detect XSS
- 3 requests * 10 seconds * 322 parameters = ~2.7 hours to detect SQL Injection
- 3 requests * 10 seconds * 322 parameters = ~2.7 hours to detect Blind SQL Injection
- 1 request (minimum) * 10 seconds * 322 parameters = ~54 minutes to detect Path Traversal in parameters
This means that it would take a person approximately 9 hours (give or take) to test all parameters for XSS, (B)SQLi, and some parameter Path Traversals - and that’s only for a part of an application that took 10 minutes to browse. Sounds like fun, and would probably give you severe Assembly-Line Syndrome (or at least Carpal Tunnel Syndrome)…and I haven’t even talked about the rest of the parameter tampering tests, common vulnerabilities, file checks, cookie tampering checks, etc. that can be automated.
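The back-of-the-envelope math above can be checked in a few lines (using the figures from the text: 322 parameters, 10 seconds per hand-crafted request):

```python
# Back-of-the-envelope calculation from the text: requests-per-check,
# times 10 seconds per hand-crafted request, across 322 parameters.
PARAMS = 322
SECONDS_PER_REQUEST = 10

requests_per_check = {
    "XSS": 3,
    "SQL Injection": 3,
    "Blind SQL Injection": 3,
    "Path Traversal (minimum)": 1,
}

total_seconds = sum(
    n * SECONDS_PER_REQUEST * PARAMS for n in requests_per_check.values()
)
print(f"{total_seconds / 3600:.1f} hours of manual request crafting")
```

That comes out to roughly 8.9 hours, which is where the "approximately 9 hours" figure comes from.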
One might argue that my calculations are biased (of course they are; I am trying to make a point), that you don’t have to test each and every parameter in each and every URL, and that a human would probably understand the context in which a parameter appears, its name and value, and where it shows up in the response, and would therefore need to send fewer requests. That’s absolutely true, and I don’t deny it. But you could also sit in a factory putting caps on spray bottles, and do a better job than a machine would, but who wants to do that, right? It is time- and resource-consuming (as well as boring!)
To sum things up, I want to present a few thoughts I have on the subject of Automated vs. Manual testing:
- It is very hard to calculate the ratio between what automated scanners can find and what people can find; statements such as "automated scanners can only find X%" are unsubstantiated. While we can count how many vulnerabilities an automated scanner can find, that number only represents the current moment and does not take future technological advancements into account. In addition, you can’t really count how many "logical vulnerabilities" a person can find either; it depends on that person’s knowledge and competency in webappsec, which varies greatly between people.
- Breaking into an application requires only a single vulnerability, and hackers who do this for money will most likely want to spend as few resources on it as possible. This screams "Low Hanging Fruit" to me. I believe that security is all about layers: the more layers you add, the more your chances of getting hacked decrease. This means that reducing the LHF will repel hackers who are looking for quick and easy money. It doesn’t mean that you shouldn’t discover and fix complex vulnerabilities; it just means that you should prioritize your efforts.
- You should check out the WASC Web Application Hacking Incidents Database to see how many real-world sites got hacked by Low Hanging Fruits.
- Automated scanners are consistent, both in their knowledge and in the way they scan. You can reproduce vulnerabilities and testing techniques in order to verify that things were fixed. Humans do not all hold the same knowledge, and they do not all test applications in the same way or at the same level.
- Those who downplay the importance of automated scanners in a web security assessment might just as well start to downplay NetCat or HTTP proxies. They also have shortcomings, and they won’t solve the problem on their own, right? I wonder if we’ll ever see or hear of Anti-NetCat evangelists :-)
I guess what I am trying to say is that I don’t think people should use the term "Automated vs. Manual" in the context of web application assessments. Securing web applications is not about "Man vs. Machine"; it’s about "Man and Machine" working together (I sound like something out of The Matrix).
Every healthy security assessment process should use a balanced mix of automated penetration testing tools and humans, each side bringing its own expertise to the table, so that together they complement each other.