All Implemented Interfaces: Serializable, Document, StyledDocument
public class HTMLDocument extends DefaultStyledDocument
A document that models HTML. The default element structure is built by the class HTMLDocument.HTMLReader, which implements the HTMLEditorKit.ParserCallback protocol that the parser expects. To change the structure, subclass HTMLReader and reimplement the method getReader(int) to return the new reader implementation. Consult the documentation for HTMLReader for the details of the default structure created. The intent is that the document be non-lossy (although reproducing the HTML format may result in a different format).

The document models only HTML, and makes no attempt to store view attributes in it. The elements are identified by the StyleConstants.NameAttribute attribute, which should always have a value of type HTML.Tag that identifies the kind of element. Some of the elements (such as comments) are synthesized. The HTMLFactory uses this attribute to determine what kind of view to build.
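As a rough illustration of the subclassing approach described above, the following sketch overrides getReader(int) to return a reader that records every start tag before delegating to the default behavior. The class names TagRecordingDocument and RecordingReader, the seenTags field, and the parse helper are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical subclass for illustration; not part of the Swing API.
class TagRecordingDocument extends HTMLDocument {
    // Names of the start tags reported by the parser, in order.
    final List<String> seenTags = new ArrayList<>();

    @Override
    public HTMLEditorKit.ParserCallback getReader(int pos) {
        // Hand the parser our own reader instead of the default HTMLReader.
        return new RecordingReader(pos);
    }

    // HTMLReader is an inner class, so the subclass is declared inside the document subclass.
    class RecordingReader extends HTMLDocument.HTMLReader {
        RecordingReader(int offset) {
            super(offset);
        }

        @Override
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            seenTags.add(t.toString());
            super.handleStartTag(t, a, pos); // keep the default structure
        }
    }

    // Parses the given HTML into a fresh document and returns the recorded tag names.
    static List<String> parse(String html) throws Exception {
        TagRecordingDocument doc = new TagRecordingDocument();
        new HTMLEditorKit().read(new StringReader(html), doc, 0);
        return doc.seenTags;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<html><body><p>Hi</p></body></html>"));
    }
}
```

A reader that actually changes the structure would override more callbacks; this sketch only shows where the hook is.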
This document supports incremental loading. The TokenThreshold property controls how much of the parse is buffered before trying to update the element structure of the document. This property is set by the EditorKit so that subclasses can disable it.
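A minimal sketch of reading and changing the TokenThreshold property on a document created by hand. The class name TokenThresholdSketch and the value 50 are arbitrary choices for illustration.

```java
import javax.swing.text.html.HTMLDocument;

// Hypothetical class name, used only for this example.
class TokenThresholdSketch {
    // Returns the threshold of a freshly constructed document and the value after setting it.
    static int[] thresholds() {
        HTMLDocument doc = new HTMLDocument();
        int initial = doc.getTokenThreshold(); // documented default: Integer.MAX_VALUE
        doc.setTokenThreshold(50);             // buffer 50 tokens between structure updates
        return new int[] { initial, doc.getTokenThreshold() };
    }

    public static void main(String[] args) {
        int[] t = thresholds();
        System.out.println(t[0] + " -> " + t[1]);
    }
}
```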
The Base property determines the URL against which relative URLs are resolved. By default, this will be the Document.StreamDescriptionProperty if the value of the property is a URL. If a <BASE> tag is encountered, the base will become the URL specified by that tag. Because the base URL is a property, it can also be set directly.
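The sketch below sets the base directly and then resolves a relative reference against it. The resolution step uses java.net.URI and is only one way a caller might apply the base; the class name BaseUrlSketch and the example.com URLs are placeholders.

```java
import java.net.URI;
import javax.swing.text.html.HTMLDocument;

// Hypothetical class name, used only for this example.
class BaseUrlSketch {
    // Sets the document base and resolves a relative reference against it.
    static String resolve(String base, String relative) throws Exception {
        HTMLDocument doc = new HTMLDocument();
        doc.setBase(URI.create(base).toURL());
        // getBase() returns the URL set above; URI.resolve performs the resolution.
        return doc.getBase().toURI().resolve(relative).toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolve("http://example.com/docs/", "img/a.png"));
    }
}
```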
The default content storage mechanism for this document is a gap buffer (GapContent). Alternatives can be supplied by using the constructor that takes a Content implementation.
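For instance, a document could be backed by StringContent instead of the default gap buffer. This is only a sketch of the constructor call; the class name ContentChoiceSketch is hypothetical, and whether StringContent suits a given workload is not addressed here.

```java
import javax.swing.text.StringContent;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.StyleSheet;

// Hypothetical class name, used only for this example.
class ContentChoiceSketch {
    // Builds an empty HTML document backed by StringContent rather than GapContent.
    static HTMLDocument create() {
        return new HTMLDocument(new StringContent(), new StyleSheet());
    }

    public static void main(String[] args) {
        System.out.println(create().getLength());
    }
}
```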
In addition to the methods provided by Document and StyledDocument for mutating an HTMLDocument, HTMLDocument provides a number of convenience methods. The following methods can be used to insert HTML content into an existing document.
setInnerHTML(Element, String)
setOuterHTML(Element, String)
insertBeforeStart(Element, String)
insertAfterStart(Element, String)
insertBeforeEnd(Element, String)
insertAfterEnd(Element, String)
The following examples illustrate using these methods. Each example assumes the HTML document is initialized in the following way:
```java
JEditorPane p = new JEditorPane();
p.setContentType("text/html");
p.setText("..."); // Document text is provided below.
HTMLDocument d = (HTMLDocument) p.getDocument();
```
With the following HTML content:
```html
<html>
  <head>
    <title>An example HTMLDocument</title>
    <style type="text/css">
      div { background-color: silver; }
      ul { color: blue; }
    </style>
  </head>
  <body>
    <div id="BOX">
      <p>Paragraph 1</p>
      <p>Paragraph 2</p>
    </div>
  </body>
</html>
```
All the methods for modifying an HTML document require an Element. Elements can be obtained from an HTML document by using the method getElement(Element e, Object attribute, Object value). It returns the first descendant element that contains the specified attribute with the given value, in depth-first order. For example, d.getElement(d.getDefaultRootElement(), StyleConstants.NameAttribute, HTML.Tag.P) returns the first paragraph element.
A convenient shortcut for locating elements is the method getElement(String), which returns an element whose ID attribute matches the specified value. For example, d.getElement("BOX") returns the DIV element.
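Both lookup styles can be sketched as follows. To stay independent of a GUI component, this example loads the document through an HTMLEditorKit rather than a JEditorPane; that loading approach, the class name ElementLookupSketch, and the helper methods are assumptions made for the example.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical class name, used only for this example.
class ElementLookupSketch {
    static final String HTML_TEXT =
        "<html><head><title>An example HTMLDocument</title></head>"
        + "<body><div id=\"BOX\"><p>Paragraph 1</p><p>Paragraph 2</p></div></body></html>";

    // Loads HTML into a document created by the editor kit (which also sets the parser).
    static HTMLDocument load(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Looks up an element by its id attribute and returns its name, or null if absent.
    static String nameOfElementWithId(HTMLDocument d, String id) {
        Element e = d.getElement(id);
        return e == null ? null : e.getName();
    }

    // Finds the first P element under the root and returns its trimmed text.
    static String firstParagraphText(HTMLDocument d) throws Exception {
        Element p = d.getElement(d.getDefaultRootElement(),
                                 StyleConstants.NameAttribute, HTML.Tag.P);
        if (p == null) {
            return null;
        }
        return d.getText(p.getStartOffset(), p.getEndOffset() - p.getStartOffset()).trim();
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument d = load(HTML_TEXT);
        System.out.println(nameOfElementWithId(d, "BOX"));
        System.out.println(firstParagraphText(d));
    }
}
```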
The getIterator(HTML.Tag t) method can also be used for finding all occurrences of the specified HTML tag in the document.
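A common use of the iterator is collecting link targets. The sketch below walks all A tags and gathers their href attributes. It guards against a null iterator because, in the OpenJDK implementations I am aware of, an iterator is only provided for non-block tags; treat that as an assumption rather than documented behavior. The class name AnchorIteratorSketch and its helpers are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical class name, used only for this example.
class AnchorIteratorSketch {
    // Loads HTML into a document created by the editor kit.
    static HTMLDocument load(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Collects the href value of every anchor found by the iterator.
    static List<String> hrefs(HTMLDocument doc) {
        List<String> out = new ArrayList<>();
        HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.A);
        if (it == null) {
            return out; // defensive: no iterator available for this tag
        }
        for (; it.isValid(); it.next()) {
            AttributeSet attrs = it.getAttributes();
            Object href = attrs.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                out.add(href.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument d = load("<html><body><p><a href=\"a.html\">A</a> and "
                              + "<a href=\"b.html\">B</a></p></body></html>");
        System.out.println(hrefs(d));
    }
}
```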
Elements can be inserted before or after the existing children of any non-leaf element by using the methods insertAfterStart and insertBeforeEnd. For example, if e is the DIV element, d.insertAfterStart(e, "<ul><li>List Item</li></ul>") inserts the list before the first paragraph, and d.insertBeforeEnd(e, "<ul><li>List Item</li></ul>") inserts the list after the last paragraph. The DIV block becomes the parent of the newly inserted elements.
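The two child-insertion calls can be tried out with a sketch like the one below, which returns the document text so the position of the new list can be inspected. The class name ChildInsertSketch, the kit-based loading, and the helper names are assumptions for this example.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical class name, used only for this example.
class ChildInsertSketch {
    static final String HTML_TEXT =
        "<html><head><title>An example HTMLDocument</title></head>"
        + "<body><div id=\"BOX\"><p>Paragraph 1</p><p>Paragraph 2</p></div></body></html>";
    static final String LIST = "<ul><li>List Item</li></ul>";

    // Loads HTML into a document created by the editor kit (which also sets the parser).
    static HTMLDocument load(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Inserts the list as a child of the DIV, at its start or its end, and returns the text.
    static String textAfterInsert(boolean atStart) throws Exception {
        HTMLDocument d = load(HTML_TEXT);
        Element div = d.getElement("BOX");
        if (atStart) {
            d.insertAfterStart(div, LIST); // new first child of the DIV
        } else {
            d.insertBeforeEnd(div, LIST);  // new last child of the DIV
        }
        return d.getText(0, d.getLength());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textAfterInsert(true));
        System.out.println(textAfterInsert(false));
    }
}
```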
Sibling elements can be inserted before or after any element by using the methods insertBeforeStart and insertAfterEnd. For example, if e is the DIV element, d.insertBeforeStart(e, "<ul><li>List Item</li></ul>") inserts the list before the DIV element, and d.insertAfterEnd(e, "<ul><li>List Item</li></ul>") inserts the list after the DIV element. The newly inserted elements become siblings of the DIV element.
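A corresponding sketch for the sibling-insertion calls follows. It returns the document text and also shows how the parent of the new list could be looked up afterwards. The class name SiblingInsertSketch and its helpers are hypothetical, and the document is loaded through an HTMLEditorKit as an assumption of the example.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical class name, used only for this example.
class SiblingInsertSketch {
    static final String HTML_TEXT =
        "<html><head><title>An example HTMLDocument</title></head>"
        + "<body><div id=\"BOX\"><p>Paragraph 1</p><p>Paragraph 2</p></div></body></html>";
    static final String LIST = "<ul><li>List Item</li></ul>";

    // Loads HTML into a document created by the editor kit (which also sets the parser).
    static HTMLDocument load(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Inserts the list as a sibling of the DIV, before or after it, and returns the document.
    static HTMLDocument insertSibling(boolean before) throws Exception {
        HTMLDocument d = load(HTML_TEXT);
        Element div = d.getElement("BOX");
        if (before) {
            d.insertBeforeStart(div, LIST);
        } else {
            d.insertAfterEnd(div, LIST);
        }
        return d;
    }

    // Name of the parent of the first UL element, or null if there is no UL.
    static String listParentName(HTMLDocument d) {
        Element ul = d.getElement(d.getDefaultRootElement(),
                                  StyleConstants.NameAttribute, HTML.Tag.UL);
        return ul == null ? null : ul.getParentElement().getName();
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument d = insertSibling(true);
        System.out.println(d.getText(0, d.getLength()));
        System.out.println(listParentName(d));
    }
}
```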
Elements and all their descendants can be replaced by using the methods setInnerHTML and setOuterHTML. For example, if e is the DIV element, d.setInnerHTML(e, "<ul><li>List Item</li></ul>") replaces all child paragraphs with the list, and d.setOuterHTML(e, "<ul><li>List Item</li></ul>") replaces the DIV element itself. In the latter case the parent of the list is the BODY element.
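The difference between the two replacement calls can be sketched as follows: after setInnerHTML the DIV should still be findable by its id, while after setOuterHTML it should be gone. The class name ReplaceSketch and its helpers are hypothetical, and the kit-based loading is an assumption of the example.

```java
import java.io.StringReader;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical class name, used only for this example.
class ReplaceSketch {
    static final String HTML_TEXT =
        "<html><head><title>An example HTMLDocument</title></head>"
        + "<body><div id=\"BOX\"><p>Paragraph 1</p><p>Paragraph 2</p></div></body></html>";
    static final String LIST = "<ul><li>List Item</li></ul>";

    // Loads HTML into a document created by the editor kit (which also sets the parser).
    static HTMLDocument load(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Replaces either the children of the DIV (inner) or the DIV itself (outer).
    static HTMLDocument replace(boolean inner) throws Exception {
        HTMLDocument d = load(HTML_TEXT);
        Element div = d.getElement("BOX");
        if (inner) {
            d.setInnerHTML(div, LIST);
        } else {
            d.setOuterHTML(div, LIST);
        }
        return d;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument inner = replace(true);
        System.out.println(inner.getText(0, inner.getLength()));
        System.out.println("DIV still present: " + (inner.getElement("BOX") != null));
        HTMLDocument outer = replace(false);
        System.out.println("DIV still present: " + (outer.getElement("BOX") != null));
    }
}
```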
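The following table shows the example document and the results of the various methods described above.

Method | Resulting content (top to bottom) | Parent of the new list |
---|---|---|
Example (unchanged) | Paragraph 1, Paragraph 2 | (no list) |
insertAfterStart | List Item, Paragraph 1, Paragraph 2 | DIV |
insertBeforeEnd | Paragraph 1, Paragraph 2, List Item | DIV |
insertBeforeStart | List Item, Paragraph 1, Paragraph 2 | BODY (list precedes the DIV) |
insertAfterEnd | Paragraph 1, Paragraph 2, List Item | BODY (list follows the DIV) |
setInnerHTML | List Item | DIV |
setOuterHTML | List Item | BODY |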
Warning: Serialized objects of this class will not be compatible with future Swing releases. The current serialization support is appropriate for short term storage or RMI between applications running the same version of Swing. As of 1.4, support for long term storage of all JavaBeans has been added to the java.beans package. Please see XMLEncoder.
Modifier and Type | Class | Description |
---|---|---|
class | HTMLDocument.BlockElement | An element that represents a structural block of HTML. |
class | HTMLDocument.HTMLReader | An HTML reader to load an HTML document with an HTML element structure. |
static class | HTMLDocument.Iterator | An iterator to iterate over a particular type of tag. |
class | HTMLDocument.RunElement | An element that represents a chunk of text that has a set of HTML character level attributes assigned to it. |

Nested classes/interfaces inherited from class DefaultStyledDocument: DefaultStyledDocument.AttributeUndoableEdit, DefaultStyledDocument.ElementBuffer, DefaultStyledDocument.ElementSpec, DefaultStyledDocument.SectionElement

Nested classes/interfaces inherited from class AbstractDocument: AbstractDocument.AbstractElement, AbstractDocument.AttributeContext, AbstractDocument.BranchElement, AbstractDocument.Content, AbstractDocument.DefaultDocumentEvent, AbstractDocument.ElementEdit, AbstractDocument.LeafElement
Modifier and Type | Field | Description |
---|---|---|
static final String | AdditionalComments | Document property key value. |

Fields inherited from class DefaultStyledDocument: buffer, BUFFER_SIZE_DEFAULT

Fields inherited from class AbstractDocument: BAD_LOCATION, BidiElementName, ContentElementName, ElementNameAttribute, listenerList, ParagraphElementName, SectionElementName

Fields inherited from interface Document: StreamDescriptionProperty, TitleProperty
Constructor | Description |
---|---|
HTMLDocument() | Constructs an HTML document using the default buffer size and a default StyleSheet. |
HTMLDocument(AbstractDocument.Content c, StyleSheet styles) | Constructs an HTML document with the given content storage implementation and the given style/attribute storage mechanism. |
HTMLDocument(StyleSheet styles) | Constructs an HTML document with the default content storage implementation and the specified style/attribute storage mechanism. |
Modifier and Type | Method | Description |
---|---|---|
protected void | create(DefaultStyledDocument.ElementSpec[] data) | Replaces the contents of the document with the given element specifications. |
protected Element | createBranchElement(Element parent, AttributeSet a) | Creates a document branch element that can contain other elements. |
protected AbstractDocument.AbstractElement | createDefaultRoot() | Creates the root element to be used to represent the default document structure. |
protected Element | createLeafElement(Element parent, AttributeSet a, int p0, int p1) | Creates a document leaf element that directly represents text (doesn't have any children). |
protected void | fireChangedUpdate(DocumentEvent e) | Notifies all listeners that have registered interest for notification on this event type. |
protected void | fireUndoableEditUpdate(UndoableEditEvent e) | Notifies all listeners that have registered interest for notification on this event type. |
URL | getBase() | Returns the location to resolve relative URLs against. |
Element | getElement(String id) | Returns the element that has the given id Attribute. |
Element | getElement(Element e, Object attribute, Object value) | Returns the child element of e that contains the attribute attribute with value value, or null if one isn't found. |
HTMLDocument.Iterator | getIterator(HTML.Tag t) | Fetches an iterator for the specified HTML tag. |
HTMLEditorKit.Parser | getParser() | Returns the parser that is used when inserting HTML into the existing document. |
boolean | getPreservesUnknownTags() | Returns the behavior the parser observes when encountering unknown tags. |
HTMLEditorKit.ParserCallback | getReader(int pos) | Fetches the reader for the parser to use when loading the document with HTML. |
HTMLEditorKit.ParserCallback | getReader(int pos, int popDepth, int pushDepth, HTML.Tag insertTag) | Returns the reader for the parser to use to load the document with HTML. |
StyleSheet | getStyleSheet() | Fetches the StyleSheet with the document-specific display rules (CSS) that were specified in the HTML document itself. |
int | getTokenThreshold() | Gets the number of tokens to buffer before trying to update the document's element structure. |
protected void | insert(int offset, DefaultStyledDocument.ElementSpec[] data) | Inserts new elements in bulk. |
void | insertAfterEnd(Element elem, String htmlText) | Inserts the HTML specified as a string after the end of the given element. |
void | insertAfterStart(Element elem, String htmlText) | Inserts the HTML specified as a string at the start of the element. |
void | insertBeforeEnd(Element elem, String htmlText) | Inserts the HTML specified as a string at the end of the element. |
void | insertBeforeStart(Element elem, String htmlText) | Inserts the HTML specified as a string before the start of the given element. |
protected void | insertUpdate(AbstractDocument.DefaultDocumentEvent chng, AttributeSet attr) | Updates document structure as a result of text insertion. |
void | processHTMLFrameHyperlinkEvent(HTMLFrameHyperlinkEvent e) | Processes HyperlinkEvents that are generated by documents in an HTML frame. |
void | setBase(URL u) | Sets the location to resolve relative URLs against. |
void | setInnerHTML(Element elem, String htmlText) | Replaces the children of the given element with the contents specified as an HTML string. |
void | setOuterHTML(Element elem, String htmlText) | Replaces the given element in the parent with the contents specified as an HTML string. |
void | setParagraphAttributes(int offset, int length, AttributeSet s, boolean replace) | Sets attributes for a paragraph. |
void | setParser(HTMLEditorKit.Parser parser) | Sets the parser that is used by the methods that insert HTML into the existing document, such as setInnerHTML and setOuterHTML. |
void | setPreservesUnknownTags(boolean preservesTags) | Determines how unknown tags are handled by the parser. |
void | setTokenThreshold(int n) | Sets the number of tokens to buffer before trying to update the document's element structure. |

Methods inherited from class DefaultStyledDocument: addDocumentListener, addStyle, getBackground, getCharacterElement, getDefaultRootElement, getFont, getForeground, getLogicalStyle, getParagraphElement, getStyle, getStyleNames, removeDocumentListener, removeElement, removeStyle, removeUpdate, setCharacterAttributes, setLogicalStyle, styleChanged

Methods inherited from class AbstractDocument: addUndoableEditListener, createPosition, dump, fireInsertUpdate, fireRemoveUpdate, getAsynchronousLoadPriority, getAttributeContext, getBidiRootElement, getContent, getCurrentWriter, getDocumentFilter, getDocumentListeners, getDocumentProperties, getEndPosition, getLength, getListeners, getProperty, getRootElements, getStartPosition, getText, getText, getUndoableEditListeners, insertString, postRemoveUpdate, putProperty, readLock, readUnlock, remove, removeUndoableEditListener, render, replace, setAsynchronousLoadPriority, setDocumentFilter, setDocumentProperties, writeLock, writeUnlock

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Document: addUndoableEditListener, createPosition, getEndPosition, getLength, getProperty, getRootElements, getStartPosition, getText, getText, insertString, putProperty, remove, removeUndoableEditListener, render
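public static final String AdditionalComments

Document property key value.

public HTMLDocument()

Constructs an HTML document using the default buffer size and a default StyleSheet. This is a convenience method for the constructor HTMLDocument(Content, StyleSheet).

public HTMLDocument(StyleSheet styles)

Constructs an HTML document with the default content storage implementation and the specified style/attribute storage mechanism. This is a convenience method for the constructor HTMLDocument(Content, StyleSheet).

Parameters:
styles - the styles

public HTMLDocument(AbstractDocument.Content c, StyleSheet styles)

Constructs an HTML document with the given content storage implementation and the given style/attribute storage mechanism.

Parameters:
c - the container for the content
styles - the styles

public HTMLEditorKit.ParserCallback getReader(int pos)

Fetches the reader for the parser to use when loading the document with HTML. The default implementation provides an HTMLDocument.HTMLReader. Subclasses can reimplement this method to change how the document gets structured if desired (for example, to handle custom tags, or structurally represent character style elements).

Parameters:
pos - the starting position

public HTMLEditorKit.ParserCallback getReader(int pos, int popDepth, int pushDepth, HTML.Tag insertTag)

Returns the reader for the parser to use to load the document with HTML. The default implementation provides an HTMLDocument.HTMLReader. Subclasses can reimplement this method to change how the document gets structured if desired (for example, to handle custom tags, or structurally represent character style elements). This is a convenience method for getReader(int, int, int, HTML.Tag, TRUE).

Parameters:
pos - the starting position
popDepth - the number of ElementSpec.EndTagTypes to generate before inserting
pushDepth - the number of ElementSpec.StartTagTypes with a direction of ElementSpec.JoinNextDirection that should be generated before inserting, but after the end tags have been generated
insertTag - the first tag to start inserting into document

public URL getBase()

Returns the location to resolve relative URLs against.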
public void setBase(URL u)

Sets the location to resolve relative URLs against. This also sets the base of the StyleSheet to be u as well as the base of the document.

Parameters:
u - the desired base URL

protected void insert(int offset, DefaultStyledDocument.ElementSpec[] data) throws BadLocationException

Inserts new elements in bulk.

Overrides:
insert in class DefaultStyledDocument

Parameters:
offset - the starting offset
data - the element data

Throws:
BadLocationException - if the given position does not represent a valid location in the associated document.

protected void insertUpdate(AbstractDocument.DefaultDocumentEvent chng, AttributeSet attr)

Updates document structure as a result of text insertion.

Overrides:
insertUpdate in class DefaultStyledDocument

Parameters:
chng - a description of the document change
attr - the attributes

protected void create(DefaultStyledDocument.ElementSpec[] data)

Replaces the contents of the document with the given element specifications.

Overrides:
create in class DefaultStyledDocument

Parameters:
data - the new contents of the document

public void setParagraphAttributes(int offset, int length, AttributeSet s, boolean replace)

Sets attributes for a paragraph.

This method is thread safe, although most Swing methods are not. Please see Concurrency in Swing for more information.

Specified by:
setParagraphAttributes in interface StyledDocument

Overrides:
setParagraphAttributes in class DefaultStyledDocument

Parameters:
offset - the offset into the paragraph (must be at least 0)
length - the number of characters affected (must be at least 0)
s - the attributes
replace - whether to replace existing attributes, or merge them

public StyleSheet getStyleSheet()

Fetches the StyleSheet with the document-specific display rules (CSS) that were specified in the HTML document itself.

Returns:
the StyleSheet
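public HTMLDocument.Iterator getIterator(HTML.Tag t)

Fetches an iterator for the specified HTML tag.

Parameters:
t - the requested HTML.Tag

Returns:
the Iterator for the given HTML tag

protected Element createLeafElement(Element parent, AttributeSet a, int p0, int p1)

Creates a document leaf element that directly represents text (doesn't have any children). The default implementation returns an element of type HTMLDocument.RunElement.

Overrides:
createLeafElement in class AbstractDocument

Parameters:
parent - the parent element
a - the attributes for the element
p0 - the beginning of the range (must be at least 0)
p1 - the end of the range (must be at least p0)

protected Element createBranchElement(Element parent, AttributeSet a)

Creates a document branch element that can contain other elements. The default implementation returns an element of type HTMLDocument.BlockElement.

Overrides:
createBranchElement in class AbstractDocument

Parameters:
parent - the parent element
a - the attributes

protected AbstractDocument.AbstractElement createDefaultRoot()

Creates the root element to be used to represent the default document structure.

Overrides:
createDefaultRoot in class DefaultStyledDocument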
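public void setTokenThreshold(int n)

Sets the number of tokens to buffer before trying to update the document's element structure.

Parameters:
n - the number of tokens to buffer

public int getTokenThreshold()

Gets the number of tokens to buffer before trying to update the document's element structure. The default value is Integer.MAX_VALUE.

public void setPreservesUnknownTags(boolean preservesTags)

Determines how unknown tags are handled by the parser.

Parameters:
preservesTags - true if unknown tags should be saved in the model, otherwise tags are dropped

public boolean getPreservesUnknownTags()

Returns the behavior the parser observes when encountering unknown tags.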
public void processHTMLFrameHyperlinkEvent(HTMLFrameHyperlinkEvent e)

Processes HyperlinkEvents that are generated by documents in an HTML frame. The HyperlinkEvent type, as the parameter suggests, is HTMLFrameHyperlinkEvent. In addition to the typical information contained in a HyperlinkEvent, this event contains the element that corresponds to the frame in which the click happened (the source element) and the target name. The target name has 4 possible values: _self, _parent, _top, or a named frame.

If the target is _self, the action is to change the value of the HTML.Attribute.SRC attribute and fire a ChangedUpdate event.

If the target is _parent, then it deletes the parent element, which is a <FRAMESET> element, inserts a new <FRAME> element, sets its HTML.Attribute.SRC attribute to the destination URL, and fires a RemovedUpdate and an InsertUpdate.

If the target is _top, this method does nothing. The processing of _top is handled in the implementation of the view for a frame, namely the FrameView. Given that _top implies replacing the entire document, it made sense to handle this outside of the document that it will replace.

If the target is a named frame, then the element hierarchy is searched for an element with a name equal to the target, its HTML.Attribute.SRC attribute is updated, and a ChangedUpdate event is fired.

Parameters:
e - the event

public void setParser(HTMLEditorKit.Parser parser)

Sets the parser that is used by the methods that insert HTML into the existing document, such as setInnerHTML and setOuterHTML. HTMLEditorKit.createDefaultDocument will set the parser for you. If you create an HTMLDocument by hand, be sure to set the parser accordingly.

Parameters:
parser - the parser to be used for text insertion

public HTMLEditorKit.Parser getParser()

Returns the parser that is used when inserting HTML into the existing document.
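The following sketch illustrates the parser requirement for a document created by hand: without a parser, an insertion method is expected to fail with IllegalStateException, and after setParser the parser is available. Using javax.swing.text.html.parser.ParserDelegator as the parser is my choice for the example, not something this page prescribes; the class name ParserSetupSketch is hypothetical.

```java
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical class name, used only for this example.
class ParserSetupSketch {
    // Returns true if setInnerHTML rejects a hand-made document that has no parser.
    static boolean failsWithoutParser() throws Exception {
        HTMLDocument doc = new HTMLDocument(); // created by hand: no parser set
        try {
            doc.setInnerHTML(doc.getDefaultRootElement(), "<p>x</p>");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Returns true if a parser is available after calling setParser.
    static boolean hasParserAfterSet() {
        HTMLDocument doc = new HTMLDocument();
        doc.setParser(new ParserDelegator()); // assumption: the stock DTD-based parser
        return doc.getParser() != null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(failsWithoutParser());
        System.out.println(hasParserAfterSet());
    }
}
```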
public void setInnerHTML(Element elem, String htmlText) throws BadLocationException, IOException

Replaces the children of the given element with the contents specified as an HTML string.

This will be seen as at least two events, n inserts followed by a remove.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking setInnerHTML(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

         <body>
           |
         <div>
             \
             <ul>
               \
               <li>

Parameter elem must not be a leaf element, otherwise an IllegalArgumentException is thrown. If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the branch element whose children will be replaced
htmlText - the string to be parsed and assigned to elem

Throws:
IllegalArgumentException - if elem is a leaf
IllegalStateException - if an HTMLEditorKit.Parser has not been defined
BadLocationException - if replacement is impossible because of a structural issue
IOException - if an I/O exception occurs

public void setOuterHTML(Element elem, String htmlText) throws BadLocationException, IOException
Replaces the given element in the parent with the contents specified as an HTML string.

This will be seen as at least two events, n inserts followed by a remove.

When replacing a leaf this will attempt to make sure there is a newline present if one is needed. This may result in an additional element being inserted. For example, replacing a character element that contained a newline with <img> would create two elements, one for the image and one for the newline.

If you try to replace the element at length you will most likely end up with two elements. For example, setOuterHTML(getCharacterElement(getLength()), "blah") will result in two leaf elements at the end, one representing 'blah', and the other representing the end element.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking setOuterHTML(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

         <body>
           |
          <ul>
            \
            <li>

If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the element to replace
htmlText - the string to be parsed and inserted in place of elem

Throws:
IllegalStateException - if an HTMLEditorKit.Parser has not been set
BadLocationException - if replacement is impossible because of a structural issue
IOException - if an I/O exception occurs

public void insertAfterStart(Element elem, String htmlText) throws BadLocationException, IOException
Inserts the HTML specified as a string at the start of the element.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking insertAfterStart(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

          <body>
            |
          <div>
         /  |  \
      <ul> <p> <p>
       /
    <li>

Unlike the insertBeforeStart method, new elements become children of the specified element, not siblings.

Parameter elem must not be a leaf element, otherwise an IllegalArgumentException is thrown. If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the branch element to be the root for the new text
htmlText - the string to be parsed and assigned to elem

Throws:
IllegalArgumentException - if elem is a leaf
IllegalStateException - if an HTMLEditorKit.Parser has not been set on the document
BadLocationException - if insertion is impossible because of a structural issue
IOException - if an I/O exception occurs

public void insertBeforeEnd(Element elem, String htmlText) throws BadLocationException, IOException
Inserts the HTML specified as a string at the end of the element.

If elem's children are leaves, and the character at elem.getEndOffset() - 1 is a newline, this will insert before the newline so that there isn't text after the newline.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking insertBeforeEnd(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

          <body>
            |
          <div>
         /  |  \
       <p> <p> <ul>
                 \
                 <li>

Unlike the insertAfterEnd method, new elements become children of the specified element, not siblings.

Parameter elem must not be a leaf element, otherwise an IllegalArgumentException is thrown. If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the element to be the root for the new text
htmlText - the string to be parsed and assigned to elem

Throws:
IllegalArgumentException - if elem is a leaf
IllegalStateException - if an HTMLEditorKit.Parser has not been set on the document
BadLocationException - if insertion is impossible because of a structural issue
IOException - if an I/O exception occurs

public void insertBeforeStart(Element elem, String htmlText) throws BadLocationException, IOException
Inserts the HTML specified as a string before the start of the given element.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking insertBeforeStart(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

          <body>
          /    \
       <ul>    <div>
        /      /   \
     <li>    <p>   <p>

Unlike the insertAfterStart method, new elements become siblings of the specified element, not children.

If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the element the content is inserted before
htmlText - the string to be parsed and inserted before elem

Throws:
IllegalStateException - if an HTMLEditorKit.Parser has not been set on the document
BadLocationException - if insertion is impossible because of a structural issue
IOException - if an I/O exception occurs

public void insertAfterEnd(Element elem, String htmlText) throws BadLocationException, IOException
Inserts the HTML specified as a string after the end of the given element.

Consider the following structure (elem is the <div> element).

         <body>
           |
         <div>
         /   \
       <p>   <p>

Invoking insertAfterEnd(elem, "<ul><li>") results in the following structure (the new elements are <ul> and <li>).

          <body>
          /    \
       <div>   <ul>
       /   \      \
     <p>   <p>    <li>

Unlike the insertBeforeEnd method, new elements become siblings of the specified element, not children.

If either elem or htmlText parameter is null, no changes are made to the document.

For this to work correctly, the document must have an HTMLEditorKit.Parser set. This will be the case if the document was created from an HTMLEditorKit via the createDefaultDocument method.

Parameters:
elem - the element the content is inserted after
htmlText - the string to be parsed and inserted after elem

Throws:
IllegalStateException - if an HTMLEditorKit.Parser has not been set on the document
BadLocationException - if insertion is impossible because of a structural issue
IOException - if an I/O exception occurs

public Element getElement(String id)
Returns the element that has the given id Attribute. If the element can't be found, null is returned. Note that this method works on an Attribute, not a character tag. In the following HTML snippet: <a id="HelloThere"> the attribute is 'id' and the character tag is 'a'. This is a convenience method for getElement(RootElement, HTML.Attribute.ID, id). This is not thread-safe.

Parameters:
id - the string representing the desired Attribute

Returns:
the element with the specified Attribute, or null if it can't be found or if id is null
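public Element getElement(Element e, Object attribute, Object value)

Returns the child element of e that contains the attribute attribute with value value, or null if one isn't found. This is not thread-safe.

Parameters:
e - the root element where the search begins
attribute - the desired Attribute
value - the value for the specified Attribute

Returns:
the element with the specified Attribute and the specified value, or null if it can't be found

protected void fireChangedUpdate(DocumentEvent e)

Notifies all listeners that have registered interest for notification on this event type.

Overrides:
fireChangedUpdate in class AbstractDocument

Parameters:
e - the event

protected void fireUndoableEditUpdate(UndoableEditEvent e)

Notifies all listeners that have registered interest for notification on this event type.

Overrides:
fireUndoableEditUpdate in class AbstractDocument

Parameters:
e - the event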
© 1993, 2023, Oracle and/or its affiliates. All rights reserved.
Documentation extracted from Debian's OpenJDK Development Kit package.
Licensed under the GNU General Public License, version 2, with the Classpath Exception.
Various third party code in OpenJDK is licensed under different licenses (see Debian package).
Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.
https://docs.oracle.com/en/java/javase/21/docs/api/java.desktop/javax/swing/text/html/HTMLDocument.html