Automated crawling is the first step in automating web application security analysis. Without a thorough crawl, automated vulnerability testing produces incomplete results, that is, false negatives.
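Modern web technologies such as AJAX make web applications more responsive and usable; such applications are sometimes called Rich Internet Applications (RIAs). Traditional crawling techniques are not sufficient for RIAs, because much of an RIA's content is reached by executing client-side events rather than by following links to new URLs.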
Our security research team has proposed several crawling methods in the past and continues to refine them. Here we present a new strategy for crawling RIAs. It builds on the concept of “Model-Based Crawling” (previously introduced by our team) and uses statistics accumulated during the crawl to choose what to explore next, favoring choices with a high probability of uncovering new information. We compare its performance against our previous strategy and against the classical Breadth-First and Depth-First strategies on two real RIAs and two test RIAs. The results show that the new strategy is significantly better than Breadth-First and Depth-First (both widely used to crawl RIAs), and that it outperforms our previous strategy while being much simpler to implement.
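To give a feel for what "using statistics accumulated during the crawl" can look like, here is a minimal Python sketch. It is an illustration under our own simplifying assumptions, not the implementation evaluated in the paper: the names `EventStats` and `choose_next` are hypothetical, the smoothing priors (1 success in 2 trials) are an assumed default, and the sketch ignores the cost of navigating to the state where an event lives.

```python
from collections import defaultdict


class EventStats:
    """Tracks, per event, how often executing it uncovered a new state.

    Hypothetical helper for illustration only. The priors are an
    assumption: an event never seen before starts at probability 0.5.
    """

    def __init__(self, prior_successes=1.0, prior_trials=2.0):
        self.prior_successes = prior_successes
        self.prior_trials = prior_trials
        self.trials = defaultdict(int)     # times the event was executed
        self.successes = defaultdict(int)  # times it led to a new state

    def record(self, event, found_new_state):
        """Update the counters after executing `event` once."""
        self.trials[event] += 1
        if found_new_state:
            self.successes[event] += 1

    def probability(self, event):
        """Smoothed estimate that `event` uncovers a new state."""
        return (self.successes[event] + self.prior_successes) / (
            self.trials[event] + self.prior_trials
        )


def choose_next(stats, unexplored):
    """Pick the unexplored (state, event) pair whose event has the
    highest estimated probability of discovering something new."""
    return max(unexplored, key=lambda pair: stats.probability(pair[1]))


# Usage: "next" has paid off twice, "help" has not, "login" is unseen.
stats = EventStats()
stats.record("next", True)
stats.record("next", True)
stats.record("help", False)
candidates = [("s1", "help"), ("s2", "next"), ("s3", "login")]
best = choose_next(stats, candidates)
```

With these counts the crawler would try "next" from state `s2` first, then the unseen "login", and leave "help" for last. A Breadth-First or Depth-First crawler would instead pick purely by position in the exploration order, regardless of what earlier executions revealed.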
Our research paper:
A Statistical Approach for Efficient Crawling of Rich Internet Applications
A Statistical Approach for Efficient Crawling of Rich Internet Applications (Long version)
(Research by: Mustafa Emre Dincturk, Suryakant Choudhary, Gregor von Bochmann, Guy-Vincent Jourdan, and Iosif Viorel Onut)
This summer (July 27, 2012), we are presenting our latest results at ICWE 2012 in Germany, as part of the Web Crawling Track.