I’ve been doing web application security for over two years now, and while I’ve seen the tools get better and better, they also seem to have reached an upper limit. In the early days of web application security, most of the issues were fairly straightforward. Improper server configuration and easily identifiable input validation issues were two of the most common problems, and both were very easy to detect with an automated tool. As web applications have matured (disclaimer: some still have a long way to go), the problems have gotten a little more obscure and more difficult to detect.
The Problems
Take the case of parameter manipulation, for example. In most cases, parameter manipulation is as simple as changing a form submission variable and having data returned to you that you shouldn’t have access to. Although extremely easy for a human to test, it is quite a challenge for an automated web application scanner to understand what it should and should not have access to. This also comes into play in the admin vs. user case, where I’ve seen admin pages “restricted” only by the fact that they are not linked to from the user’s session.
Allowing a user to upload a file to your webserver is also extremely risky functionality. Nevertheless, it may be necessary. A web application scanner first has to identify potential file upload capability (which it can do), but then it has to determine whether that capability is dangerous…which it really can’t. This isn’t just your standard PUT HTTP method…
With respect to input validation, web app scanners can detect the simple SQL injection errors and cross-site scripting that are instantly re-displayed to the page, but what about the more complex situations? Take, for example, the form field that will happily accept HTML input but not actually output the malicious script until a summary page is accessed five pages further into the application. Or perhaps the application is doing some filtering of script characters, but it can be bypassed with a little manipulation.
When a web application scanner is referred to as “stateful”, it means that it can evaluate a web application that requires the browser to maintain some sort of state, such as being logged in to the application. It does not, however, mean that the scanner is aware of the current state of the application itself and the data within that application! This is a subtle distinction, but an important one as web applications evolve and become more complicated.
Also, a web app scanner cannot safely scan a production environment. There is way too much risk involved, and while scanning production is not an ideal means of security testing, it does happen from time to time. Throwing a web application scanner at your dynamic production website is like sticking your hand in a wood-chipper and seeing if it comes out whole on the other end.
Finally, encryption. Encryption is not easy. There are well-tested standards for implementing crypto, but many programmers don’t use them and feel the need to implement their own, whether it’s for “speed” or some other reason. These are web app scanners…not cryptanalysis tools (although some have quite good tools built in, once you identify the data to be tested…).
Possible Solutions?
I don’t feel like I should just point out the problems, so here are some potential solutions!
Parameter Manipulation/Authorization - Allow the web application scanner to utilize multiple accounts for testing. Try to access the same set of pages and see what the differences are. Although there are definitely flaws in this approach, user intervention could assist in making this easier to achieve. Most stateful web app scanners are based off an interactive user session, so it wouldn’t be too difficult to have a human identify pages that only that user should have access to.
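As a rough illustration of the multi-account idea, here is a minimal Python sketch. It assumes the scanner has already requested the same pages as each account and recorded the HTTP status codes, and that a human has marked which accounts are supposed to reach each restricted page. The function name and the data layout are hypothetical, not taken from any real scanner.

```python
from typing import Dict, Set


def find_authorization_gaps(
    responses: Dict[str, Dict[str, int]],
    allowed: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Report restricted URLs that an account fetched successfully anyway.

    responses: account name -> {url: HTTP status seen when that account asked for it}
    allowed:   url -> accounts a human marked as permitted to see that url
    """
    gaps: Dict[str, Set[str]] = {}
    for account, results in responses.items():
        leaked = {
            url
            for url, status in results.items()
            if url in allowed                 # only pages a human flagged as restricted
            and account not in allowed[url]   # this account should not get in
            and 200 <= status < 300           # but the request succeeded
        }
        if leaked:
            gaps[account] = leaked
    return gaps
```

One of the flaws mentioned above shows up immediately: plenty of applications answer with a 200 status and an "access denied" page, so a status code alone is a weak signal. A fuller version would probably need to compare response bodies as well, which is where the human comes back in.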
File Uploads - Attempt to determine if certain file types are allowed or disallowed through the file upload form. While this may also require user intervention, it might not be too difficult to attempt to upload a set of files with different extensions and compare the results. Realistically, however, the variety of web applications makes this nearly impossible to do effectively.
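The "upload a set of files and compare the results" step might look something like the sketch below. It only covers the comparison: it assumes something else has already attempted the uploads and recorded, per filename, whether the application accepted the file. The list of risky extensions is an illustrative assumption, not a complete or authoritative one.

```python
import os
from typing import Dict, Iterable, List

# Illustrative assumption: extensions a server might execute rather than just store.
RISKY_EXTENSIONS = {".php", ".jsp", ".asp", ".aspx", ".exe"}


def build_probe_names(stem: str, extensions: Iterable[str]) -> List[str]:
    """Build one probe filename per extension, e.g. 'probe' + '.php'."""
    return [stem + ext for ext in extensions]


def risky_uploads_accepted(outcomes: Dict[str, bool]) -> List[str]:
    """Return the accepted filenames whose final extension is on the risky list.

    outcomes: filename -> True if the application accepted the upload.
    """
    accepted = []
    for name, was_accepted in outcomes.items():
        ext = os.path.splitext(name)[1].lower()  # compare case-insensitively
        if was_accepted and ext in RISKY_EXTENSIONS:
            accepted.append(name)
    return sorted(accepted)
```

Note that this only looks at the final extension, so a name like probe.php.jpg would pass unnoticed, and it says nothing about where the file ends up or whether the server will execute it. That is the part that still needs a person.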
Advanced Input Validation - Make a real stateful web application scanning engine. Maintain an internal hash structure of the application’s pages and how a user advances through those pages. If an interactive session is used as the base of the scanning, attempt to identify what pages have output that was previously input by a user, as well as the path to those pages. This will require quite a bit of logic, but theoretically seems feasible. In addition, allow for the ability for some page variables to be dynamic. I haven’t seen this too often myself, but occasionally web applications will automatically increment or otherwise change some variable from page to page and the inability of a web app scanner to adjust for that will cause it to be inefficient.
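To make the "which pages output what was previously input" idea concrete, here is a small Python sketch. It assumes the scanner submits a unique, harmless marker in each form field and then checks every page it later visits for those markers. The class and method names are hypothetical, and a real engine would also need to record the navigation path that led to each page.

```python
import uuid
from typing import Dict, List, Tuple


class InputTracker:
    """Remember which marker went into which field, then spot it in later pages."""

    def __init__(self) -> None:
        # marker -> (page where it was submitted, form field name)
        self._markers: Dict[str, Tuple[str, str]] = {}

    def new_marker(self, page: str, field: str) -> str:
        """Create a unique alphanumeric marker to submit in the given field."""
        marker = "zq" + uuid.uuid4().hex[:12]  # letters and digits only
        self._markers[marker] = (page, field)
        return marker

    def find_reflections(self, page: str, body: str) -> List[Tuple[str, str, str]]:
        """Return (source page, field, output page) for every marker found in body."""
        return [
            (src_page, field, page)
            for marker, (src_page, field) in self._markers.items()
            if marker in body
        ]
```

Every hit tells the scanner that input from one page resurfaces on another, so that field is worth retesting with real attack strings while watching the output page rather than the page that accepted the input. Using an alphanumeric marker is a design choice: it is less likely to be stripped by whatever filtering the application does.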
Statefulness - As mentioned above, try to make web application scanning engines even smarter than they are today. Web apps are becoming more complex and as they do so more advanced methods of tracking user data throughout the application are necessary.
Production testing - It might be possible to have a set of safe and unsafe tests, but the effectiveness of this approach will be severely limited. Safe production testing can only be done by hand.
Encryption - Good luck! Encryption is hard enough to get right while using it, never mind trying to detect when it’s being used improperly.
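For what a safe/unsafe split could look like, here is a minimal Python sketch. It assumes each test is tagged with its HTTP method and with a human's judgment about whether it can modify data. The ScanTest type and select_tests function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Methods that HTTP defines as read-only by intent. This is an assumption about
# the protocol's intent, not a guarantee about any particular application.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


@dataclass(frozen=True)
class ScanTest:
    name: str
    method: str          # HTTP method the test sends
    modifies_data: bool  # a human's judgment, not something the scanner can infer


def select_tests(tests: Iterable[ScanTest], production: bool) -> List[ScanTest]:
    """Return every test outside production, only the read-only ones inside it."""
    if not production:
        return list(tests)
    return [
        t for t in tests
        if t.method.upper() in SAFE_METHODS and not t.modifies_data
    ]
```

The modifies_data flag is there because a GET request in a badly designed application can still delete a record. The filter is only as good as the person who tagged the tests, which is the limitation described above.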
To summarize, while web application scanners are good for evaluating the base level of security within a web application, there are a myriad of issues that the scanners available today cannot detect. While this is fairly well-known among security professionals, managers or other upper-level executives may not realize that the $100,000 they just dropped on automated web scanning is not nearly as effective as they believe it to be. At this point in time, it still takes an experienced security person to not only analyze the results of the scans, but also to perform the necessary manual tests to catch what the scanners cannot. Toss in some source code review and you might have a good idea of your application’s security posture - it’s no replacement for having security embedded into every stage of your SDLC, however! *wink*
I close with a few quotations:
Computers are magnificent tools for the realization of our dreams, but no machine can replace the human spark of spirit, compassion, love, and understanding.
- Louis Gerstner, CEO, IBM
Any sufficiently advanced technology is indistinguishable from magic.
- Arthur C. Clarke
Any technology that is distinguishable from magic is not sufficiently advanced.
- Gregory Benford