Cross Site Scripting (XSS) Attack

March 22, 2013 | balvvant2006 | Application Security

Introduction

This post is part of a multi-post series on web application security threats and their solutions. Please see my introduction article to learn about other security threats and their solutions.

This type of attack enables attackers to inject client-side script into web pages viewed by other users. XSS flaws occur whenever an application takes untrusted data and sends it to a web browser without proper validation and escaping. XSS allows attackers to execute scripts in the victim’s browser, which can hijack user sessions, deface web sites, or redirect the user to malicious sites. According to the WhiteHat Security statistics report, approximately 65% of websites have XSS security issues.

Example of XSS Attack

Suppose we are developing a signup page.

Let’s consider a common scenario where the user can enter a name and an email ID. When the Signup button is clicked, a personalized response welcoming the user is displayed. The problem is that, oftentimes, the string in the thank-you message is just the input data written directly back to the screen.

var name = txtName.Text;
var message = "Thank you " + name;
lblSignupComplete.Text = message;

Suppose there is no data validation and the user types the following value in the name field:

<script>alert("You are hacked")</script>

Then the output of the above code will execute the script and display an alert pop-up box on the web page, which is not the desired behavior. This is a small example of what a hacker can do with an XSS attack. The hacker can also use such holes to redirect a user to unwanted sites.

This happened for three reasons:

There was no expectation set as to what an acceptable parameter value was.
The application took the parameter value and rendered it into the HTML source code exactly as entered. It trusted that whatever the value contained was suitable for rendering on the page.
There was no sanitization of the data on output while displaying it on the web page.

Recommendations

To eliminate such instances and other types of XSS attacks, we need to follow some coding rules. These rules are:

All input must be validated against a white-list of acceptable value ranges
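
As a rough illustration of white-list validation for the name field from the signup example, the sketch below accepts only a small set of characters and a bounded length. The class name, the method name, and the exact character set are assumptions made for illustration; a real application would choose its own rules.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SignupValidator
{
    // Hypothetical white-list: letters, spaces, periods, apostrophes and
    // hyphens, between 1 and 40 characters. \A and \z anchor the whole
    // string, so a trailing newline is rejected as well.
    private static readonly Regex NamePattern =
        new Regex(@"\A[a-zA-Z .'\-]{1,40}\z", RegexOptions.CultureInvariant);

    // Returns true only when the whole value matches the white-list.
    public static bool IsValidName(string name)
    {
        return name != null && NamePattern.IsMatch(name);
    }
}
```

With this pattern, a value such as Alice O'Neil is accepted, while the script payload from the example above is rejected because it contains characters outside the list.
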
Always use request validation:

Request validation is the .NET Framework’s native defense against XSS. By default, it is turned on. Unless it is explicitly turned off, every ASP.NET web app looks for potentially malicious input and, if any is detected, throws an HttpRequestValidationException (the “A potentially dangerous Request.Form value was detected from the client” error) along with an HTTP 500. So, without a single line of code being written, many XSS exploits never get a chance to occur. It’s an effective but primitive control which operates by looking for some pretty simple character patterns. But what if one of those character patterns is actually intended user input? What if a user is entering data in a rich HTML editor control? In such cases, we can turn off the validation within the page directive of the ASPX.
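
A page directive that turns request validation off for a single page would look roughly like this. The Language attribute is included only to make the directive complete and is an assumption about the page.

```aspx
<%-- Disables request validation for this page only. --%>
<%@ Page Language="C#" ValidateRequest="false" %>
```

On ASP.NET 4 and later, the page-level setting is, as far as I know, only honored when web.config also sets requestValidationMode="2.0" on the httpRuntime element. Any page that opts out in this way has to validate and encode its input itself.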

Alternatively, request validation can be turned off across the entire site within the web.config. But this is not a smart idea unless there is a really good reason why we’d want to remove this safety net from every single page on the site.
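
For reference, the site-wide setting described above is a single attribute on the pages element. This is a minimal fragment, not a complete web.config.

```xml
<configuration>
  <system.web>
    <!-- Disables request validation for every page in the site. -->
    <pages validateRequest="false" />
  </system.web>
</configuration>
```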

User inputs should be encoded using the HtmlEncode and UrlEncode functions

Another essential defense against XSS is the proper use of output encoding. The idea of output encoding is to ensure each character in a string is rendered so that it appears correctly in the output media. For example, in order to render the text <i> in the browser we need to encode it as &lt;i&gt;; otherwise it will take on a functional meaning and not render to the screen.
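
A minimal sketch of this idea, applied to the thank-you message from the signup example. WebUtility.HtmlEncode from System.Net is used here only so that the sample compiles outside a web page; inside a Web Forms page, Server.HtmlEncode plays the same role. The SignupMessage class and Build method are hypothetical names.

```csharp
using System;
using System.Net;

public static class SignupMessage
{
    // Builds the thank-you text and HTML-encodes the untrusted name so
    // that markup characters are displayed as text instead of being
    // interpreted by the browser.
    public static string Build(string untrustedName)
    {
        return "Thank you " + WebUtility.HtmlEncode(untrustedName ?? string.Empty);
    }
}
```

Calling SignupMessage.Build("<i>") returns Thank you &lt;i&gt;, which the browser displays as the literal text <i>.
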

Non-HTML output encoding

In real world web applications we cannot encode all output to HTML. JavaScript is an excellent example of this. Let’s imagine that we wanted to return the response in a JavaScript alert box:

var name = Server.HtmlEncode(txtName.Text);
var message = "Welcome Mr. " + name;
var alertScript = "alert('" + message + "');";
ClientScript.RegisterClientScriptBlock(GetType(), "ThankYou", alertScript);

Let’s try this with the above example. If the user types ABCD <i>Hunt</i>, the alert box shows the HTML-encoded text: Welcome Mr. ABCD &lt;i&gt;Hunt&lt;/i&gt;

Obviously, this isn’t what we want to see as encoded HTML simply doesn’t play nice with JavaScript – they both have totally different encoding syntaxes. This brings us to the Anti-XSS library.

Anti-XSS

JavaScript output encoding is one of the reasons the Microsoft Anti-Cross Site Scripting Library, also known as Anti-XSS, exists. This is a CodePlex project with encoding algorithms for HTML, XML, CSS and, of course, JavaScript.

A fundamental difference between the encoding performed by Anti-XSS and that done by the native HtmlEncode method is that the former works against a whitelist whilst the latter works against a blacklist. The whitelist approach is usually the more secure route. Consequently, the Anti-XSS library is a preferable choice even for HTML encoding.

Moving on to JavaScript, let’s use the library to apply proper JavaScript encoding to the previous example:

var name = AntiXss.JavaScriptEncode(txtName.Text, false);
var message = "Welcome Mr. " + name;
var alertScript = "alert('" + message + "');";
ClientScript.RegisterClientScriptBlock(GetType(), "Welcome", alertScript);

We’ll now find a very different piece of syntax from when we were encoding for HTML:

alert('Welcome Mr. ABCD \x3ci\x3eHunt\x3c\x2fi\x3e');

And we’ll actually get a JavaScript alert containing the precise string entered into the textbox.
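
For readers who want to try the same idea without installing the Anti-XSS library, the sketch below swaps in the framework’s own HttpUtility.JavaScriptStringEncode method. This is a stand-in for the Anti-XSS call, not the library’s API, and the escape sequences it emits may differ in form from the \x3c style shown above. The WelcomeScript class and Build method are hypothetical names.

```csharp
using System;
using System.Web;

public static class WelcomeScript
{
    // Builds an alert(...) statement in which the untrusted name is
    // encoded for a JavaScript string literal rather than for HTML.
    public static string Build(string untrustedName)
    {
        // Passing true for addDoubleQuotes wraps the encoded value in
        // double quotes, so no quoting is done by hand.
        string literal = HttpUtility.JavaScriptStringEncode(
            "Welcome Mr. " + untrustedName, true);
        return "alert(" + literal + ");";
    }
}
```

The returned statement can then be registered with ClientScript.RegisterClientScriptBlock in the same way as in the example above.
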

Using an encoding library like Anti-XSS is absolutely essential. The last thing you want to be doing is manually working through all the possible characters and escape combinations to try and write your own output encoder. It’s hard work, it quite likely won’t be comprehensive enough, and it’s totally unnecessary.

One last comment on Anti-XSS functionality: as well as output encoding, the library can also render “safe” HTML by removing malicious scripts. If, for example, you have an application which legitimately stores markup in the data layer, and it is to be redisplayed on the page, the GetSafeHtml and GetSafeHtmlFragment methods will sanitize the data and remove scripts. Using these methods rather than HtmlEncode means hyperlinks, text formatting, and other safe markup will functionally render (the behaviors will work) whilst the nasty stuff is stripped.

SRE

The Anti-XSS product has another good component called the Security Runtime Engine (SRE). This is essentially an HTTP module that hooks into the pre-render event in the page lifecycle and encodes server controls before they appear on the page. You have quite granular control over which controls and attributes are encoded and it’s a very easy retrofit to an existing app.

Set the Correct Character Encoding

To successfully restrict valid data for your Web pages, you should limit the ways in which the input data can be represented. This prevents malicious users from using canonicalization and multi-byte escape sequences to trick your input validation routines. A multi-byte escape sequence attack is a subtle manipulation that exploits the fact that character encodings, such as Unicode Transformation Format-8 (UTF-8), use multi-byte sequences to represent non-ASCII characters. Some byte sequences are not legitimate UTF-8, but they may be accepted by some UTF-8 decoders, thus providing an exploitable security hole.

ASP.NET allows you to specify the character set at the page level or at the application level by using the <globalization> element in the Web.config file. The following code examples show both approaches and use the ISO-8859-1 character encoding, which is the default in early versions of HTML and HTTP.

To set the character encoding at the page level, use the <meta> element or the ResponseEncoding page-level attribute as follows:
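
The two page-level options would look roughly like this in an .aspx page. Either one can be used on its own.

```aspx
<%-- Option 1: the ResponseEncoding attribute on the page directive. --%>
<%@ Page ResponseEncoding="iso-8859-1" %>

<%-- Option 2: a meta element inside the page's head section. --%>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
```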

To set the character encoding in the Web.config file, use the following configuration.
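
A minimal Web.config fragment for the application-level setting, assuming the same ISO-8859-1 encoding for both requests and responses:

```xml
<configuration>
  <system.web>
    <!-- Applies the encoding to every page in the application. -->
    <globalization requestEncoding="iso-8859-1"
                   responseEncoding="iso-8859-1" />
  </system.web>
</configuration>
```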

HTML Escape before Inserting Untrusted Data into HTML Element Content
Attribute Escape before Inserting Untrusted Data into HTML Common Attributes
JavaScript Escape before Inserting Untrusted Data into HTML JavaScript Data Values
CSS Escape before Inserting Untrusted Data into HTML Style Property Values
URL Escape before Inserting Untrusted Data into HTML URL Parameter Values
Use an HTML Policy engine to validate or clean user-driven HTML in an outbound way
Prevent DOM-based XSS
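
To illustrate why the escaping rules above are split by context, the sketch below builds a link in which the same untrusted value appears twice: once as a URL parameter value and once as HTML element content. It uses only Uri.EscapeDataString and WebUtility.HtmlEncode from the base class library. The ContextEncoding class, its methods, and the /search path are hypothetical, and the sketch covers only two of the contexts listed.

```csharp
using System;
using System.Net;

public static class ContextEncoding
{
    // HTML element content and quoted attribute values: HTML-encode.
    public static string ForHtml(string untrusted)
    {
        return WebUtility.HtmlEncode(untrusted ?? string.Empty);
    }

    // URL parameter values: percent-encode the value first, then
    // HTML-encode the finished URL because it sits inside an attribute.
    public static string SearchLink(string untrustedQuery)
    {
        string url = "/search?q=" + Uri.EscapeDataString(untrustedQuery ?? string.Empty);
        return "<a href=\"" + WebUtility.HtmlEncode(url) + "\">"
            + ForHtml(untrustedQuery) + "</a>";
    }
}
```

For the input a b&c=<d>, the href receives the percent-encoded form a%20b%26c%3D%3Cd%3E, while the link text receives the HTML-encoded form a b&amp;c=&lt;d&gt;. Neither encoding would be safe if used in the other position.
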

References

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
http://www.opensecuritylab.org/xss-prevention-rules

I hope you find this helpful. Please stay tuned for more articles in this series, which are listed in the introduction article.
