By Ory Segal
As much as I love SOAP web services (not!), it seems like RESTful web services have really caught on and become a de facto standard these days – you see them everywhere: in the cloud, in AJAX or Web 2.0 applications, in mobile applications and so forth.
Unlike SOAP services, RESTful services are lightweight. They are extremely easy to understand and to develop. There seem to be a million different definitions of what they really are, but I think the simplest way to understand them is through the following four definitions, which I’ve found in this DeveloperWorks article:
- RESTful services use HTTP methods explicitly
- RESTful services are stateless
- RESTful services expose directory structure-like URIs
- RESTful services transfer XML, JSON or both
GET /data/users/Bob/ HTTP/1.1
If this were a standard HTTP request, I would tell you that there’s a good chance you’re looking at a web server that contains three directories under its virtual root: /data/, /users/, and /Bob/. But that’s not the case. This request tells the RESTful service to retrieve (GET) the account information for user Bob, which is part of the /users/ list in our data repository.
When an automated scanner crawls the web application, there’s a good chance that out-of-the-box, it won’t figure out that we’re looking at a RESTful service here, and it will consider these parts of the URL as directories. This means a few things:
- Directory-level tests will be sent to the wrong places – potential false positives
- Parameter-level tests will not be sent to the right places – potential false negatives
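To make the difference concrete, here is a minimal Python sketch of the two ways to read the same URL. It is only an illustration, not how AppScan works internally: the function names, the example.com host, and the assumption that segments after /data/ alternate between a name and a value are all mine.

```python
from urllib.parse import urlsplit


def as_directories(url):
    """Naive crawler view: every path segment is treated as a directory."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return ["/" + "/".join(segments[: i + 1]) + "/" for i in range(len(segments))]


def as_rest_resource(url, root="/data/"):
    """REST view (assumed layout): after the root, segments come in name/value pairs."""
    path = urlsplit(url).path
    if not path.startswith(root):
        return {}
    segments = [s for s in path[len(root):].split("/") if s]
    return dict(zip(segments[0::2], segments[1::2]))


url = "http://example.com/data/users/Bob/"  # hypothetical host
print(as_directories(url))    # ['/data/', '/data/users/', '/data/users/Bob/']
print(as_rest_resource(url))  # {'users': 'Bob'}
```

A scanner that only has the first view will aim its directory-level tests at /data/users/Bob/ and will never treat Bob as a parameter value worth manipulating.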
IBM's AppScan Standard enables you to train it to cope with RESTful services, using one of two options – Manual or Automatic configuration. Let’s start with the manual option.
By default, AppScan automatically recognizes parameters in standard HTTP & HTML formats, but if parameters are in other formats (for example within the Path or within another parameter), you need to define them manually, so that AppScan can recognize, follow and manipulate them during scanning. This is done from the Custom Parameters definition, which you can find under Scan Configuration -> Parameters and Cookies -> Advanced: Custom Parameters.
In order to create a new type of custom parameter definition, you have to click the “+” button, which opens the following screen:
Let’s see a step by step process of adding a definition that will properly parse and test our “users” parameter in the example above.
- We’ll start by giving this custom parameter definition the Reference Name RESTful_Path_Parameter
- In the Pattern field, we’ll enter the regular expression /data/([\d\w\s%]+)/([\d\w\s%]+)/. This pattern includes two match groups, i.e. /data/group1/group2/, where group1 denotes the parameter’s name and group2 the parameter’s value
- Since the name of the parameter is the first match group, we will define the Name group index as “1”, and since the value of the parameter is the second match group, we will define the Value group index as “2”. This tells AppScan to extract the name of the parameter from group1, and the value of the parameter from group2. If you are dealing with a Path that only includes a parameter value (i.e. nameless parameters), you can set the Name group index to an empty value, and only mark a single value group
- Our RESTful service uses Path based parameters, so we’ll set the Location to “Path”. In general, you can set it to either “Body”, “Path”, or “Query”.
- In our scenario, we’ll leave the Condition Pattern empty. This pattern helps us limit the behavior of the custom parameter definition by requiring another pattern match on the Location. For example, we could’ve defined the Condition Pattern to be ^/data/, and then our custom parameter definition would only be relevant for Paths that actually begin with /data/.
- In addition, in our scenario, we will leave the Response Pattern empty. Just as an FYI - this pattern helps us to teach AppScan how to track the values of our custom parameter in scenarios where the application treats it as a session ID. In such cases, the application might not only embed new values in Paths (e.g. in web links), but also in other places in subsequent responses, such as XML elements, for example: <newSessionID>12345678</newSessionID> - In this case, we would have defined the following Response Pattern:
<newSessionID>([0-9]+)</newSessionID> - this tells AppScan that even though in the HTTP request, the parameter is called users, it should extract new values from an XML element in subsequent responses, that is called newSessionID. Tricky, complex but nevertheless useful!
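If you want to sanity-check these patterns before typing them into the dialog, here is a rough Python sketch of the three regular expressions from the steps above. The patterns themselves are copied from this post; the helper functions and the sample paths are hypothetical, and Python's re module is only a stand-in, since AppScan's regular expression engine may differ in its details.

```python
import re

# Patterns copied from the steps above.
PATTERN = re.compile(r"/data/([\d\w\s%]+)/([\d\w\s%]+)/")
CONDITION = re.compile(r"^/data/")
RESPONSE = re.compile(r"<newSessionID>([0-9]+)</newSessionID>")

NAME_GROUP, VALUE_GROUP = 1, 2  # the Name and Value group indexes


def extract_path_parameter(path, condition=None):
    """Return (name, value) from a path, or None when nothing applies."""
    # An optional condition pattern must match the path first.
    if condition is not None and not condition.search(path):
        return None
    match = PATTERN.search(path)
    if match is None:
        return None
    return match.group(NAME_GROUP), match.group(VALUE_GROUP)


def track_new_value(response_body):
    """Pull a fresh parameter value out of a response, if one is present."""
    match = RESPONSE.search(response_body)
    return match.group(1) if match else None


print(extract_path_parameter("/data/users/Bob/"))                    # ('users', 'Bob')
print(extract_path_parameter("/data/books/Bobs%20Biography/"))       # ('books', 'Bobs%20Biography')
print(extract_path_parameter("/static/data/users/Bob/"))             # ('users', 'Bob')
print(extract_path_parameter("/static/data/users/Bob/", CONDITION))  # None
print(track_new_value("<newSessionID>12345678</newSessionID>"))      # 12345678
```

The third and fourth calls show what the Condition Pattern buys you: without it, the pattern also fires on a path that merely contains /data/ somewhere in the middle.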
That’s it. Once we have our custom parameter definition in place, we can let AppScan crawl and test the application normally. After the Explore phase, you can have a peek in the Data view, and look at the Script Parameters table:
As you can see above, each new RESTful parameter that is extracted and analyzed by AppScan is given a special name in the following format:
In our case, AppScan detected the users parameter with two values, Bob and Jane, and the books parameter with two values, Bobs Biography and Janes Biography.
The INDEX part of the custom parameter is helpful if the regular expression that we created, caught on the same Path more than once. For example, consider the following Path:
Our pattern would actually match twice on this Path: the first match (index = 0) would set the parameter name to users and its value to Bob, and the second match (index = 1) would set the parameter name to phone and its value to areacode. In such a case, the name of the custom parameter would appear as:
Explore Optimization Module
Mastering AppScan’s custom parameters definition can be a daunting task, but this feature is extremely powerful and will allow you to create complex definitions that can parse non-standard HTTP messages of any type and form. If you are in a hurry, lazy, or simply hate regular expressions, there’s an automated way to detect custom parameters by using AppScan’s Explore Optimization Module, which is available through Tools->Extensions->Explore Optimization Module (Configure or Run):
This extension runs a smart algorithm that statistically detects URL rewriting rules, such as those heavily used by RESTful web applications to generate their directory structure-like URLs. For example, given enough URLs of the format /data/users/VALUE/... this module will automatically generate a custom parameter definition for you.
How much is “enough URLs”? This depends on the configuration of the module and specifically on its Switch Complexity Limit, which by default is set to 50, meaning that you must have 50 different values for the /users parameter.
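I don't know the module's actual algorithm, but the general idea of "enough distinct values" can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name, the min_distinct_values parameter (standing in for the Switch Complexity Limit), the sample paths, and the simple rule of counting distinct values that directly follow a path segment.

```python
import re
from collections import defaultdict


def suggest_patterns(paths, min_distinct_values=50):
    """Suggest 'segment/([^/]+)' patterns for segments followed by many distinct values."""
    values_after = defaultdict(set)
    for path in paths:
        segments = [s for s in path.split("/") if s]
        # Record which value follows each segment.
        for name, value in zip(segments, segments[1:]):
            values_after[name].add(value)
    return sorted(
        re.escape(name) + "/([^/]+)"
        for name, values in values_after.items()
        if len(values) >= min_distinct_values
    )


# Hypothetical crawl results: 60 different users plus one ordinary page.
paths = ["/data/users/user%d/" % i for i in range(60)] + ["/data/about/"]

print(suggest_patterns(paths))       # ['users/([^/]+)']
print(suggest_patterns(paths[:10]))  # []
```

With only 10 distinct values the sketch stays silent, which mirrors why the real module needs a reasonably large crawl before it proposes anything.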
If you want this module to automatically kick in during scans, you can enable it by going to: Tools->Extensions->Explore Optimization Module: Configure, and checking the box next to Always run automatically during scans. The module will start working once AppScan has crawled 1,000 URLs. You can increase or decrease this default threshold through the Minimum links to start module configuration. If you suspect that your application is using RESTful services, and the module was disabled when you first scanned it, you can always simply run it by going to: Tools->Extensions->Explore Optimization Module: Run
After the module has run, AppScan’s scan log will include special messages related to this module, for example:
There you go. All I had to do was to let AppScan crawl the application for a few minutes, then Run the module, and it automatically created a custom parameter definition with the regular expression users/([^/]+)
In general, the more URLs you have, the better this module will behave.
It is also iterative: if you continue scanning the application after it has created the first round of definitions, then once it hits the threshold again, or once you click Run, it will refine these rules and create new ones where needed. Simple and elegant, albeit less accurate and powerful than the manual option described earlier. That's it.
This post was a bit long, you probably need a REST now.