The XPath API supports XML Path Language (XPath) Version 1.0
The XPath language provides a simple, concise syntax for selecting nodes from an XML document. XPath also provides rules for converting a node in an XML document object model (DOM) tree to a boolean, double, or string value. XPath is a W3C-defined language and an official W3C recommendation; the W3C hosts the XML Path Language (XPath) Version 1.0 specification.
XPath started in life in 1999 as a supplement to the XSLT and XPointer languages, but has more recently become popular as a stand-alone language, as a single XPath expression can be used to replace many lines of DOM API code.
An XPath expression is composed of a location path and one or more optional predicates. Expressions may also include XPath variables.
The following is an example of a simple XPath expression:
/foo/bar
This example would select the <bar>
element in an XML document such as the following:
<foo> <bar/> </foo>
The expression /foo/bar
is an example of a location path. While XPath location paths resemble Unix-style file system paths, an important distinction is that XPath expressions return all nodes that match the expression. Thus, all three <bar>
elements in the following document would be selected by the /foo/bar
expression:
<foo> <bar/> <bar/> <bar/> </foo>
A special location path operator, //
, selects nodes at any depth in an XML document. The following example selects all <bar>
elements regardless of their location in a document:
//bar
A wildcard operator, *, causes all element nodes to be selected. The following example selects all children elements of a <foo>
element:
/foo/*
In addition to element nodes, XPath location paths may also address attribute nodes, text nodes, comment nodes, and processing instruction nodes. The following table gives examples of location paths for each of these node types:
Location Path | Description |
---|---|
/foo/bar/@id | Selects the attribute id of the <bar> element |
/foo/bar/text() | Selects the text nodes of the <bar> element. No distinction is made between escaped and non-escaped character data. |
/foo/bar/comment() | Selects all comment nodes contained in the <bar> element. |
/foo/bar/processing-instruction() | Selects all processing-instruction nodes contained in the <bar> element. |
Predicates allow for refining the nodes selected by an XPath location path. Predicates are of the form [expression]
. The following example selects all <foo>
elements that contain an include
attribute with the value of true
:
//foo[@include='true']
Predicates may be appended to each other to further refine an expression, such as:
//foo[@include='true'][@mode='bar']
While XPath expressions select nodes in the XML document, the XPath API allows the selected nodes to be coalesced into one of the following data types:
Boolean
Number
String
QName
types to represent return types of an XPath evaluation: XPathConstants.NODESET
XPathConstants.NODE
XPathConstants.STRING
XPathConstants.BOOLEAN
XPathConstants.NUMBER
The return type is specified by a QName
parameter in method call used to evaluate the expression, which is either a call to XPathExpression.evaluate(...)
or XPath.evaluate(...)
methods.
When a Boolean
return type is requested, Boolean.TRUE
is returned if one or more nodes were selected; otherwise, Boolean.FALSE
is returned.
The String
return type is a convenience for retrieving the character data from a text node, attribute node, comment node, or processing-instruction node. When used on an element node, the value of the child text nodes is returned.
The Number
return type attempts to coalesce the text of a node to a double
data type.
XPathExpression.evaluateExpression(...)
or XPath.evaluateExpression(...)
methods. The XPath data types are mapped to Class types as follows: Boolean
-- Boolean.class
Number
-- Number.class
String
-- String.class
Nodeset
-- XPathNodes.class
Node
-- Node.class
Of the subtypes of Number
, only Double, Integer
and Long
are supported.
XPathEvaluationResult.XPathResultType
that provide mappings between the QName and Class types above. The result of evaluating an expression using the XPathExpression.evaluateExpression(...)
or XPath.evaluateExpression(...)
methods will be of one of these types. Note the differences between the Enum and QName mappings:
NUMBER
NUMBER
supports Double, Integer
and Long
.NODESET
NODESET
is XPathNodes
instead of NodeList
in the QName mapping. XPath location paths may be relative to a particular node in the document, known as the context
. A context consists of:
It is an XML document tree represented as a hierarchy of nodes, a Node
for example, in the JDK implementation.
<widgets> <widget> <manufacturer/> <dimensions/> </widget> </widgets>
The <widget>
element can be selected with the following process:
// parse the XML as a W3C Document DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder(); Document document = builder.parse(new File("/widgets.xml")); //Get an XPath object and evaluate the expression XPath xpath = XPathFactory.newInstance().newXPath(); String expression = "/widgets/widget"; Node widgetNode = (Node) xpath.evaluate(expression, document, XPathConstants.NODE); //or using the evaluateExpression method Node widgetNode = xpath.evaluateExpression(expression, document, Node.class);
With a reference to the <widget>
element, a relative XPath expression can be written to select the <manufacturer>
child element:
XPath xpath = XPathFactory.newInstance().newXPath(); String expression = "manufacturer"; Node manufacturerNode = (Node) xpath.evaluate(expression, widgetNode, XPathConstants.NODE); //or using the evaluateExpression method Node manufacturerNode = xpath.evaluateExpression(expression, widgetNode, Node.class);
In the above example, the XML file is read into a DOM Document before being passed to the XPath API. The following code demonstrates the use of InputSource to leave it to the XPath implementation to process it:
XPath xpath = XPathFactory.newInstance().newXPath(); String expression = "/widgets/widget"; InputSource inputSource = new InputSource("widgets.xml"); NodeList nodes = (NodeList) xpath.evaluate(expression, inputSource, XPathConstants.NODESET); //or using the evaluateExpression method XPathNodes nodes = xpath.evaluateExpression(expression, inputSource, XPathNodes.class);
In the above cases, the type of the expected results are known. In case where the result type is unknown or any type, the XPathEvaluationResult
may be used to determine the return type. The following code demonstrates the usage:
XPathEvaluationResult<?> result = xpath.evaluateExpression(expression, document); switch (result.type()) { case NODESET: XPathNodes nodes = (XPathNodes)result.value(); ... break; }
The XPath 1.0 Number data type is defined as a double. However, the XPath specification also provides functions that returns Integer type. To facilitate such operations, the XPath API allows Integer and Long to be used in evaluateExpression
method such as the following code:
int count = xpath.evaluateExpression("count(/widgets/widget)", document, Integer.class);
Class | Description |
---|---|
XPath | XPath provides access to the XPath evaluation environment and expressions. |
XPathConstants | XPath constants. |
XPathEvaluationResult<T> | The XPathEvaluationResult interface represents the result of the evaluation of an XPath expression within the context of a particular node. |
XPathEvaluationResult.XPathResultType | XPathResultType represents possible return types of an XPath evaluation. |
XPathException | XPathException represents a generic XPath exception. |
XPathExpression | XPathExpression provides access to compiled XPath expressions. |
XPathExpressionException | XPathExpressionException represents an error in an XPath expression. |
XPathFactory | An XPathFactory instance can be used to create XPath objects. |
XPathFactoryConfigurationException | XPathFactoryConfigurationException represents a configuration error in a XPathFactory environment. |
XPathFunction | XPathFunction provides access to XPath functions. |
XPathFunctionException | XPathFunctionException represents an error with an XPath function. |
XPathFunctionResolver | XPathFunctionResolver provides access to the set of user defined XPathFunction s. |
XPathNodes | XPathNodes represents a set of nodes selected by a location path as specified in XML Path Language (XPath) Version 1.0, 3.3 Node-sets. |
XPathVariableResolver | XPathVariableResolver provides access to the set of user defined XPath variables. |
© 1993, 2023, Oracle and/or its affiliates. All rights reserved.
Documentation extracted from Debian's OpenJDK Development Kit package.
Licensed under the GNU General Public License, version 2, with the Classpath Exception.
Various third party code in OpenJDK is licensed under different licenses (see Debian package).
Java and OpenJDK are trademarks or registered trademarks of Oracle and/or its affiliates.
https://docs.oracle.com/en/java/javase/21/docs/api/java.xml/javax/xml/xpath/package-summary.html