Note: This documentation reflects stale browser implementations. The specification has changed significantly since the docs were written, and they will need to be updated once browser implementations catch up.
The HTML Sanitizer API allows developers to take untrusted strings of HTML, or Document or DocumentFragment objects, and sanitize them for safe insertion into a document's DOM.
Concepts and usage
Web applications often need to work with untrusted HTML on the client side, for example, as part of a client-side templating solution, when rendering user-generated content, or when including data in a frame from another site. The Sanitizer API allows this potentially untrusted HTML to be rendered safely.
To access the API, use the Sanitizer() constructor to create and configure a Sanitizer instance. The configuration options parameter lets you specify the allowed and disallowed elements and attributes, and enable custom elements and comments.
The most common use case, preventing XSS, is handled by the default configuration. Creating a Sanitizer() with a custom configuration is necessary only to handle additional, application-specific use cases.
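As a rough sketch of what a custom configuration might look like, the hypothetical helper below builds a configuration object and passes it to the Sanitizer() constructor when that constructor is available. The option names (allowElements, dropAttributes, allowComments, allowCustomElements) are assumptions taken from an older draft of the specification and may not match what a given browser accepts; the helper name and the element list are illustrative only.

```javascript
// Hypothetical helper: creates a Sanitizer for user comments.
// ASSUMPTION: the option names below come from an older draft of the
// specification. Current drafts and browsers may use different names.
function createCommentSanitizer(globalScope = globalThis) {
  const config = {
    // Only these elements are kept; everything else is dropped.
    allowElements: ["p", "em", "strong", "a", "ul", "ol", "li"],
    // Remove the style attribute from every element.
    dropAttributes: { style: ["*"] },
    allowComments: false,
    allowCustomElements: false,
  };

  // Feature-detect the constructor so the helper also works where the
  // API is missing; callers must handle the null case themselves.
  if (typeof globalScope.Sanitizer !== "function") {
    return { sanitizer: null, config };
  }
  return { sanitizer: new globalScope.Sanitizer(config), config };
}
```

A caller would check the returned sanitizer for null before using it, and fall back to the default behavior otherwise.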
The API has three main methods for sanitizing data:
- Element.setHTML() parses and sanitizes a string of HTML and immediately inserts it into the DOM as a child of the current element. This is essentially a "safe" version of Element.innerHTML, and should be used instead of innerHTML when inserting untrusted data.
- Sanitizer.sanitizeFor() parses and sanitizes a string of HTML for later insertion into the DOM. This might be used when the target element for the string is not always ready or available for update.
- Sanitizer.sanitize() sanitizes data that is in a Document or DocumentFragment. It might be used, for example, to sanitize a Document instance in a frame.
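Because these methods are not available in every browser, code that inserts untrusted data may want to detect Element.setHTML() first. The sketch below is one hedged way to do that: insertUntrusted is a hypothetical helper, and the textContent fallback (which shows the markup as literal text instead of parsing it) is an assumption of this example, not part of the API.

```javascript
// Hypothetical helper: inserts untrusted HTML into `element`.
// Uses Element.setHTML() when the browser provides it; otherwise falls
// back to textContent, which never parses the string as HTML.
function insertUntrusted(element, untrustedHtml, sanitizer) {
  if (typeof element.setHTML === "function") {
    // Pass the sanitizer only when the caller supplied one, so the
    // default configuration is used otherwise.
    const options = sanitizer ? { sanitizer } : undefined;
    element.setHTML(untrustedHtml, options);
    return "sanitized";
  }
  // Fallback (assumption of this sketch): render the markup as plain text.
  element.textContent = untrustedHtml;
  return "text-fallback";
}
```

The return value tells the caller which path was taken, for example so it can log how often the fallback is hit.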
Parsing and sanitizing strings
The result of parsing a string of HTML depends on the context, that is, the element into which it is inserted.
For example, an HTML string containing <td> elements is valid if inserted under a <table> element, but the <td> elements will be dropped if inserted in a <div> element. Similarly, an <em> element is a valid node in a <div>, but the tag will be escaped if used in a <textarea>:
<div><em>bla</em></div>
<textarea>&lt;em&gt;bla</textarea>
The target element must therefore be known when the parser is run and the resulting subtree must be inserted into that same type of element in the DOM, or the result will be incorrect.
For this reason, when using Sanitizer.sanitizeFor() developers must specify the tag of the eventual target element as a parameter, and the method returns a matching HTML element with the parsed string as a child (for example, the target tag "div" results in a returned object that is an instance of HTMLDivElement). The return type ensures that a user always has the context in which the object must be inserted into the DOM.
This consideration does not apply to Element.setHTML(), because it is called on a particular element and the context is therefore implicit.
The parser may also perform normalization operations on the input string. As a result, even if the HTML is valid and the sanitizer method does nothing, the sanitized output may not precisely match the unsanitized input. This applies to both Element.setHTML() and Sanitizer.sanitizeFor().
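The rule that a sanitized subtree must go into the same type of element it was parsed for can be enforced in code. The following is a minimal sketch under that assumption: adoptSanitized is a hypothetical helper that compares tagName on the target and on the element returned by Sanitizer.sanitizeFor() before moving the nodes across.

```javascript
// Hypothetical helper: moves the content of an element returned by
// sanitizeFor() into `target`, but only when both are the same type of
// element, so the parsing context matches the insertion context.
function adoptSanitized(target, sanitizedElement) {
  if (target.tagName !== sanitizedElement.tagName) {
    const from = sanitizedElement.tagName.toLowerCase();
    const to = target.tagName.toLowerCase();
    throw new TypeError(`Sanitized for <${from}> but target is <${to}>`);
  }
  // Spread childNodes (not children) so that text nodes are moved too.
  target.replaceChildren(...sanitizedElement.childNodes);
  return target;
}
```

Throwing on a mismatch is a design choice of this sketch; an application could instead re-run sanitizeFor() with the tag of the actual target.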
Interfaces
- Sanitizer (Experimental)
  - Provides the functionality to define a sanitizer configuration, to sanitize untrusted strings of HTML for later insertion into the DOM, and to sanitize Document and DocumentFragment objects.
- Element.setHTML()
  - Parses a string of HTML into a subtree of nodes, sanitizes it using a Sanitizer object, then sets it as a child of the current element.
Examples
The following examples show how to use the Sanitizer API with the default sanitizer (at the time of writing, configuration operations are not yet supported).
The code below demonstrates how Element.setHTML() is used to sanitize a string of HTML and insert it into the Element with an id of target.
The script element is disallowed by the default sanitizer, so the alert is removed.
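// The closing tag is split so this code can itself live in an inline <script>.
const unsanitized_string = "abc <script>alert(1)<" + "/script> def";
const sanitizer = new Sanitizer(); // Default sanitizer
const target = document.getElementById("target");
// Parse, sanitize, and insert in one step.
target.setHTML(unsanitized_string, { sanitizer });
console.log(target.innerHTML); // The script element is no longer present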
Sanitize a string for deferred use
The example below shows the same sanitization operation using the Sanitizer.sanitizeFor() method, with the intent of later inserting the returned element's content into a <div> element:
const unsanitized_string = "abc <script>alert(1)<" + "/script> def";
const sanitizer = new Sanitizer();
const sanitizedDiv = sanitizer.sanitizeFor("div", unsanitized_string);
console.log(sanitizedDiv instanceof HTMLDivElement);
console.log(sanitizedDiv.innerHTML);
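// replaceChildren() takes nodes as separate arguments, so spread the list;
// childNodes (unlike children) also includes the text nodes.
document.querySelector("div#target").replaceChildren(...sanitizedDiv.childNodes);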
Note: If you really must perform a string-to-string operation, you can extract the string using innerHTML, but you must remember to use the correct context when the string is applied:
const unsanitized_string = "abc <script>alert(1)<" + "/script> def";
const sanitizedString = new Sanitizer().sanitizeFor(
"div",
unsanitized_string,
).innerHTML;
Sanitize a frame
To sanitize data from an <iframe> with id userFrame:
const sanitizer = new Sanitizer();
const frame_element = document.getElementById("userFrame");
const unsanitized_frame_tree = frame_element.contentWindow.document;
const sanitized_frame_tree = sanitizer.sanitize(unsanitized_frame_tree);
frame_element.replaceChildren(sanitized_frame_tree);
Specifications
Browser compatibility
Desktop

| Feature | Chrome | Edge | Firefox | Internet Explorer | Opera | Safari |
| --- | --- | --- | --- | --- | --- | --- |
| Sanitizer | 105–119 | 105–119 | 83 | No | 91 | No |
| HTML_Sanitizer_API | 105–119 | 105–119 | 83 | No | 91 | No |
| getConfiguration | 105–119 | 105–119 | No | No | 91 | No |
| getDefaultConfiguration_static | 105–119 | 105–119 | No | No | 91 | No |
| sanitize | 93–119 | 93–119 | 83 | No | 79 | No |
| sanitizeFor | 93–119 | 93–119 | No | No | 79 | No |

Mobile

| Feature | WebView Android | Chrome Android | Firefox for Android | Opera Android | Safari on iOS | Samsung Internet |
| --- | --- | --- | --- | --- | --- | --- |
| Sanitizer | 105 | 105–119 | No | 72 | No | 20.0 |
| HTML_Sanitizer_API | 105 | 105–119 | No | 72 | No | 20.0 |
| getConfiguration | 105 | 105–119 | No | 72 | No | 20.0 |
| getDefaultConfiguration_static | 105 | 105–119 | No | 72 | No | 20.0 |
| sanitize | No | 93–119 | No | No | No | No |
| sanitizeFor | No | 93–119 | No | No | No | No |