org.w3c.dom.ls
Interface LSParser
An interface to an object that is able to build, or augment, a DOM tree from various input sources.
LSParser provides an API for parsing XML and building the corresponding DOM document structure. An LSParser instance can be obtained by invoking the DOMImplementationLS.createLSParser() method.
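As a minimal sketch of the lookup described above, the following obtains a synchronous LSParser and parses a string. It assumes the runtime ships an LS-capable DOM implementation that the DOMImplementationRegistry bootstrap class can find; the class name LsParseSketch and the helper names are hypothetical.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

final class LsParseSketch {
    /** Looks up an LS-capable DOM implementation (assumed to be available). */
    static DOMImplementationLS lsImplementation() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return (DOMImplementationLS) registry.getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
    }

    /** Parses the given XML text and returns the name of its root element. */
    static String rootName(String xml) {
        DOMImplementationLS impl = lsImplementation();
        // The second argument (schema type) is null: no schema is associated.
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = impl.createLSInput();
        input.setStringData(xml);
        Document document = parser.parse(input);
        return document.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) {
        System.out.println(rootName("<root><child/></root>"));
    }
}
```

A synchronous parser is requested here so that parse() returns the populated Document directly.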
As specified in [DOM Level 3 Core], when a document is first made available via the LSParser:
- there will never be two adjacent nodes of type TEXT_NODE, and there will never be empty text nodes.
- it is expected that the value and nodeValue attributes of an Attr node initially return the XML 1.0 normalized value. However, if the parameters "validate-if-schema" and "datatype-normalization" are set to true, depending on the attribute normalization used, the attribute values may differ from the ones obtained by the XML 1.0 attribute normalization. If the parameter "datatype-normalization" is set to false, the XML 1.0 attribute normalization is guaranteed to occur, and if the attributes list does not contain namespace declarations, the attributes attribute on an Element node represents the property [attributes] defined in [XML Information Set].
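The attribute normalization rule can be illustrated with a small sketch. It assumes the default configuration, in which "datatype-normalization" is false, and relies on the XML 1.0 rule that a literal tab inside an attribute value is normalized to a space. The class and method names are hypothetical.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

final class AttrNormalizationSketch {
    /** Parses xml and returns the value of the named attribute on the root element. */
    static String rootAttribute(String xml, String attributeName) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = impl.createLSInput();
        input.setStringData(xml);
        Document document = parser.parse(input);
        return document.getDocumentElement().getAttribute(attributeName);
    }

    public static void main(String[] args) {
        // The attribute value contains a literal tab character between the words.
        String value = rootAttribute("<r a=\"one\ttwo\"/>", "a");
        System.out.println("[" + value + "]");
    }
}
```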
Asynchronous LSParser objects are expected to also implement the events::EventTarget interface so that event listeners can be registered on asynchronous LSParser objects.
Events supported by asynchronous LSParser objects are:
- load: the LSParser has finished loading the document. See also the definition of the LSLoadEvent interface.
- progress: the LSParser signals progress as data is parsed. See also the definition of the LSProgressEvent interface.
Note: All events defined in this specification use the namespace URI "http://www.w3.org/2002/DOMLS".
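While parsing an input source, errors are reported to the application through the error handler (LSParser.domConfig's "error-handler" parameter). This specification in no way tries to define all possible errors that can occur while parsing XML, or any other markup, but some common error cases are defined. The types (DOMError.type) of errors and warnings defined by this specification are:
"check-character-normalization-failure" [error]
- Raised if the parameter "check-character-normalization" is set to true and a string is encountered that fails normalization checking.
"doctype-not-allowed" [fatal]
- Raised if the configuration parameter "disallow-doctype" is set to true and a doctype is encountered.
"no-input-specified" [fatal]
- Raised when loading a document and no input is specified in the LSInput object.
"pi-base-uri-not-preserved" [warning]
- Raised if a processing instruction is encountered in a location where the base URI of the processing instruction cannot be preserved. One example of a case where this warning will be raised is if the configuration parameter "entities" is set to false and the following XML file is parsed:
<!DOCTYPE root [ <!ENTITY e SYSTEM 'subdir/myentity.ent' ]>
<root> &e; </root>
And subdir/myentity.ent contains:
<one> <two/> </one> <?pi 3.14159?>
<more/>
"unbound-prefix-in-entity" [warning]
- An implementation-dependent warning that may be raised if the configuration parameter "namespaces" is set to true and an unbound namespace prefix is encountered in an entity's replacement text.
"unknown-character-denormalization" [fatal]
- Raised if the configuration parameter "ignore-unknown-character-denormalizations" is set to false and a character is encountered for which the processor cannot determine the normalization properties.
"unsupported-encoding" [fatal]
- Raised if an unsupported encoding is encountered.
"unsupported-media-type" [fatal]
- Raised if the configuration parameter "supported-media-types-only" is set to true and an unsupported media type is encountered.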
In addition to raising the defined errors and warnings, implementations
are expected to raise implementation specific errors and warnings for any
other error and warning cases such as IO errors (file not found,
permission denied,...), XML well-formedness errors, and so on.
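The reporting path described above can be sketched as follows: a DOMErrorHandler is registered through the "error-handler" parameter and collects whatever the parser reports, and the LSException raised for a failed parse is recorded as well. This is only an illustration; the class name, the helper name, and the message format are hypothetical, and which error types a given implementation actually reports is implementation specific.

```java
import java.util.ArrayList;
import java.util.List;
import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSException;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

final class LsErrorHandlerSketch {
    /** Parses xml and returns every problem reported, or an empty list on success. */
    static List<String> problemsFor(String xml) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        List<String> problems = new ArrayList<>();
        DOMErrorHandler handler = error -> {
            problems.add(error.getType() + ": " + error.getMessage());
            // Ask the parser to continue only after warnings.
            return error.getSeverity() == DOMError.SEVERITY_WARNING;
        };
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.getDomConfig().setParameter("error-handler", handler);
        LSInput input = impl.createLSInput();
        input.setStringData(xml);
        try {
            parser.parse(input);
        } catch (LSException e) {
            // PARSE_ERR: the document could not be loaded.
            problems.add("LSException: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        problemsFor("<a><b></a>").forEach(System.out::println);
    }
}
```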
See also the
Document Object Model (DOM) Level 3 Load
and Save Specification.
static short | ACTION_APPEND_AS_CHILDREN - Append the result of the parse operation as children of the context node.
static short | ACTION_INSERT_AFTER - Insert the result of the parse operation as the immediately following sibling of the context node.
static short | ACTION_INSERT_BEFORE - Insert the result of the parse operation as the immediately preceding sibling of the context node.
static short | ACTION_REPLACE - Replace the context node with the result of the parse operation.
static short | ACTION_REPLACE_CHILDREN - Replace all the children of the context node with the result of the parse operation.
void | abort() - Abort the loading of the document that is currently being loaded by the LSParser.
boolean | getAsync() - true if the LSParser is asynchronous, false if it is synchronous.
boolean | getBusy() - true if the LSParser is currently busy loading a document, otherwise false.
DOMConfiguration | getDomConfig() - The DOMConfiguration object used when parsing an input source.
LSParserFilter | getFilter() - When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.
Document | parse(LSInput input) - Parse an XML document from a resource identified by an LSInput.
Document | parseURI(String uri) - Parse an XML document from a location identified by a URI reference [IETF RFC 2396].
Node | parseWithContext(LSInput input, Node contextArg, short action) - Parse an XML fragment from a resource identified by an LSInput and insert the content into an existing document at the position specified with the context and action arguments.
void | setFilter(LSParserFilter filter) - When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.
ACTION_APPEND_AS_CHILDREN
public static final short ACTION_APPEND_AS_CHILDREN
Append the result of the parse operation as children of the context node. For this action to work, the context node must be an Element or a DocumentFragment.
ACTION_INSERT_AFTER
public static final short ACTION_INSERT_AFTER
Insert the result of the parse operation as the immediately following sibling of the context node. For this action to work, the context node's parent must be an Element or a DocumentFragment.
ACTION_INSERT_BEFORE
public static final short ACTION_INSERT_BEFORE
Insert the result of the parse operation as the immediately preceding sibling of the context node. For this action to work, the context node's parent must be an Element or a DocumentFragment.
ACTION_REPLACE
public static final short ACTION_REPLACE
Replace the context node with the result of the parse operation. For this action to work, the context node must have a parent, and the parent must be an Element or a DocumentFragment.
ACTION_REPLACE_CHILDREN
public static final short ACTION_REPLACE_CHILDREN
Replace all the children of the context node with the result of the parse operation. For this action to work, the context node must be an Element, a Document, or a DocumentFragment.
abort
public void abort()
Abort the loading of the document that is currently being loaded by the LSParser. If the LSParser is currently not busy, a call to this method does nothing.
getAsync
public boolean getAsync()
true if the LSParser is asynchronous, false if it is synchronous.
getBusy
public boolean getBusy()
true if the LSParser is currently busy loading a document, otherwise false.
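A short sketch of the state accessors above, assuming a parser created in synchronous mode: an idle synchronous parser should report false for both getAsync() and getBusy(), and abort() on an idle parser should have no effect. The class and method names are hypothetical.

```java
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

final class LsParserStateSketch {
    /** Returns {async, busy} for a freshly created synchronous parser. */
    static boolean[] idleState() {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        // Nothing is being loaded, so this call is expected to be a no-op.
        parser.abort();
        return new boolean[] { parser.getAsync(), parser.getBusy() };
    }

    public static void main(String[] args) {
        boolean[] state = idleState();
        System.out.println("async=" + state[0] + ", busy=" + state[1]);
    }
}
```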
getDomConfig
public DOMConfiguration getDomConfig()
The DOMConfiguration object used when parsing an input source. This DOMConfiguration is specific to the parse operation. No parameter values from this DOMConfiguration object are passed automatically to the DOMConfiguration object on the Document that is created, or used, by the parse operation. The DOM application is responsible for passing any needed parameter values from this DOMConfiguration object to the DOMConfiguration object referenced by the Document object.
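In addition to the parameters recognized on the DOMConfiguration interface defined in [DOM Level 3 Core], the DOMConfiguration objects for LSParser add or modify the following parameters:
"charset-overrides-xml-encoding"
true - [optional] (default) If a higher level protocol such as HTTP [IETF RFC 2616] provides an indication of the character encoding of the input stream being processed, that will override any encoding specified in the XML declaration or the Text declaration (see also section 4.3.3, "Character Encoding in Entities", in [XML 1.0]). Explicitly setting an encoding in the LSInput overrides any encoding from the protocol.
false - [required] The parser ignores any character set encoding information from higher-level protocols.
"disallow-doctype"
true - [optional] Throw a fatal "doctype-not-allowed" error if a doctype node is found while parsing the document. This is useful when dealing with things like SOAP envelopes where doctype nodes are not allowed.
false - [required] (default) Allow doctype nodes in the document.
"ignore-unknown-character-denormalizations"
true - [required] (default) If, while verifying full normalization when [XML 1.1] is supported, a processor encounters characters for which it cannot determine the normalization properties, then the processor will ignore any possible denormalizations caused by these characters. This parameter is ignored for [XML 1.0].
false - [optional] Report a fatal "unknown-character-denormalization" error if a character is encountered for which the processor cannot determine the normalization properties.
"infoset"
See the definition of DOMConfiguration for a description of this parameter. Unlike in [DOM Level 3 Core], this parameter will default to true for LSParser.
"namespaces"
true - [required] (default) Perform the namespace processing as defined in [XML Namespaces] and [XML Namespaces 1.1].
false - [optional] Do not perform the namespace processing.
"resource-resolver"
[required] A reference to an LSResourceResolver object, or null. If the value of this parameter is not null when an external resource (such as an external XML entity or an XML schema location) is encountered, the implementation will request that the LSResourceResolver referenced in this parameter resolves the resource.
"supported-media-types-only"
true - [optional] Check that the media type of the parsed resource is a supported media type. If an unsupported media type is encountered, a fatal error of type "unsupported-media-type" will be raised. The media types defined in [IETF RFC 3023] must always be accepted.
false - [required] (default) Accept any media type.
"validate"
See the definition of DOMConfiguration for a description of this parameter. Unlike in [DOM Level 3 Core], the processing of the internal subset is always accomplished, even if this parameter is set to false.
"validate-if-schema"
See the definition of DOMConfiguration for a description of this parameter. Unlike in [DOM Level 3 Core], the processing of the internal subset is always accomplished, even if this parameter is set to false.
"well-formed"
See the definition of DOMConfiguration for a description of this parameter. Unlike in [DOM Level 3 Core], this parameter cannot be set to false.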
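Because several of these parameter values are optional, a cautious caller may want to probe with canSetParameter before setting them. The sketch below does this for "disallow-doctype" and reports one of three outcomes; the class name, method name, and the returned strings are hypothetical, and whether the true value is supported depends on the implementation in use.

```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSException;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

final class LsDisallowDoctypeSketch {
    /** Returns "ok", "rejected", or "unsupported" for the given setting and input. */
    static String tryParse(boolean disallowDoctype, String xml) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        DOMConfiguration config = parser.getDomConfig();
        Boolean value = Boolean.valueOf(disallowDoctype);
        // The true value is optional, so check before setting it.
        if (!config.canSetParameter("disallow-doctype", value)) {
            return "unsupported";
        }
        config.setParameter("disallow-doctype", value);
        LSInput input = impl.createLSInput();
        input.setStringData(xml);
        try {
            parser.parse(input);
            return "ok";
        } catch (LSException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r [<!ELEMENT r EMPTY>]><r/>";
        System.out.println("allowed:    " + tryParse(false, xml));
        System.out.println("disallowed: " + tryParse(true, xml));
    }
}
```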
getFilter
public LSParserFilter getFilter()
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, if "validate" is set to true, the validation is done before invoking the filter.
parse
public Document parse(LSInput input)
throws DOMException,
LSException
Parse an XML document from a resource identified by an LSInput.
Parameters:
input - The LSInput from which the source of the document is to be read.
Returns:
If the LSParser is a synchronous LSParser, the newly created and populated Document is returned. If the LSParser is asynchronous, null is returned since the document object may not yet be constructed when this method returns.
Throws:
DOMException - INVALID_STATE_ERR: Raised if the LSParser's LSParser.busy attribute is true.
LSException - PARSE_ERR: Raised if the LSParser was unable to load the XML document. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
parseURI
public Document parseURI(String uri)
throws DOMException,
LSException
Parse an XML document from a location identified by a URI reference [IETF RFC 2396]. If the URI contains a fragment identifier (see section 4.1 in [IETF RFC 2396]), the behavior is not defined by this specification; future versions of this specification may define the behavior.
Parameters:
uri - The location of the XML document to be read.
Returns:
If the LSParser is a synchronous LSParser, the newly created and populated Document is returned, or null if an error occurred. If the LSParser is asynchronous, null is returned since the document object may not yet be constructed when this method returns.
Throws:
DOMException - INVALID_STATE_ERR: Raised if the LSParser.busy attribute is true.
LSException - PARSE_ERR: Raised if the LSParser was unable to load the XML document. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
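To illustrate parseURI, the following sketch writes a document to a temporary file and parses it through its file: URI. The temporary-file approach, the class name, and the helper name are assumptions made for the example; any URI the implementation can resolve would do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

final class LsParseUriSketch {
    /** Writes xml to a temporary file, parses it by URI, and returns the root name. */
    static String rootNameViaUri(String xml) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        try {
            Path file = Files.createTempFile("ls-parser-sketch", ".xml");
            try {
                Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
                LSParser parser =
                        impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
                // The URI carries no fragment identifier, whose handling is undefined.
                Document document = parser.parseURI(file.toUri().toString());
                return document.getDocumentElement().getNodeName();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNameViaUri("<catalog><item/></catalog>"));
    }
}
```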
parseWithContext
public Node parseWithContext(LSInput input,
Node contextArg,
short action)
throws DOMException,
LSException
Parse an XML fragment from a resource identified by an LSInput and insert the content into an existing document at the position specified with the context and action arguments. When parsing the input stream, the context node (or its parent, depending on where the result will be inserted) is used for resolving unbound namespace prefixes. The context node's ownerDocument node (or the node itself if the node is of type DOCUMENT_NODE) is used to resolve default attributes and entity references.
As the new data is inserted into the document, at least one mutation event is fired per new immediate child or sibling of the context node.
If the context node is a Document node and the action is ACTION_REPLACE_CHILDREN, then the document that is passed as the context node will be changed such that its xmlEncoding, documentURI, xmlVersion, inputEncoding, xmlStandalone, and all other such attributes are set to what they would be set to if the input source was parsed using LSParser.parse().
This method is always synchronous, even if the LSParser is asynchronous (LSParser.async is true).
If an error occurs while parsing, the caller is notified through the DOMErrorHandler instance associated with the "error-handler" parameter of the DOMConfiguration.
When calling parseWithContext, the values of the following configuration parameters will be ignored and their default values will always be used instead: "validate", "validate-if-schema", and "element-content-whitespace". Other parameters will be treated normally, and the parser is expected to call the LSParserFilter just as if a whole document was parsed.
Parameters:
input - The LSInput from which the source document is to be read. The source document must be an XML fragment, i.e. anything except a complete XML document (except in the case where the context node is of type DOCUMENT_NODE, and the action is ACTION_REPLACE_CHILDREN), a DOCTYPE (internal subset), entity declaration(s), notation declaration(s), or XML or text declaration(s).
contextArg - The node that is used as the context for the data that is being parsed. This node must be a Document node, a DocumentFragment node, or a node of a type that is allowed as a child of an Element node, e.g. it cannot be an Attr node.
action - This parameter describes which action should be taken between the new set of nodes being inserted and the existing children of the context node. The set of possible actions is defined by the ACTION_* constants (ACTION_TYPES) above.
Returns:
The node that is the result of the parse operation. If the result is more than one top-level node, the first one is returned.
Throws:
DOMException - HIERARCHY_REQUEST_ERR: Raised if the content cannot replace, be inserted before, after, or as a child of the context node (see also Node.insertBefore or Node.replaceChild in [DOM Level 3 Core]).
NOT_SUPPORTED_ERR: Raised if the LSParser doesn't support this method, or if the context node is of type Document and the DOM implementation doesn't support the replacement of the DocumentType child or Element child.
NO_MODIFICATION_ALLOWED_ERR: Raised if the context node is a read-only node and the content is being appended to its child list, or if the parent node of the context node is a read-only node and the content is being inserted in its child list.
INVALID_STATE_ERR: Raised if the LSParser.busy attribute is true.
LSException - PARSE_ERR: Raised if the LSParser was unable to load the XML fragment. DOM applications should attach a DOMErrorHandler using the parameter "error-handler" if they wish to get details on the error.
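Since NOT_SUPPORTED_ERR is a documented outcome of this method, callers may want to treat it as a normal case. The sketch below appends a fragment to the root element with ACTION_APPEND_AS_CHILDREN and falls back to a sentinel value when the parser does not support the operation. The class name, the helper name, and the -1 sentinel are hypothetical choices for the example.

```java
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

final class LsParseWithContextSketch {
    /**
     * Appends the parsed fragment to the root of the parsed document and returns
     * the root's child count, or -1 if parseWithContext is not supported.
     */
    static int appendFragment(String documentXml, String fragmentXml) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        LSInput documentInput = impl.createLSInput();
        documentInput.setStringData(documentXml);
        Document document = parser.parse(documentInput);
        Element root = document.getDocumentElement();

        LSInput fragmentInput = impl.createLSInput();
        fragmentInput.setStringData(fragmentXml);
        try {
            // The context node is an Element, as ACTION_APPEND_AS_CHILDREN requires.
            parser.parseWithContext(fragmentInput, root, LSParser.ACTION_APPEND_AS_CHILDREN);
        } catch (DOMException e) {
            if (e.code == DOMException.NOT_SUPPORTED_ERR) {
                return -1; // this parser does not implement parseWithContext
            }
            throw e;
        }
        return root.getChildNodes().getLength();
    }

    public static void main(String[] args) {
        System.out.println(appendFragment("<a/>", "<b/><c/>"));
    }
}
```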
setFilter
public void setFilter(LSParserFilter filter)
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by the DOMConfiguration parameters have been applied. For example, if "validate" is set to true, the validation is done before invoking the filter.
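As an illustration of element removal through a filter, the sketch below installs an LSParserFilter that rejects every non-root element with a given name, so that the element and its subtree never reach the constructed tree. The class names and the counting helper are hypothetical, and exact filter behaviour (for example, whether the root element is offered to the filter) may vary between implementations.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

final class LsFilterSketch {
    /** Rejects elements with the given name as soon as their start tag is seen. */
    static final class RejectByName implements LSParserFilter {
        private final String rejectedName;

        RejectByName(String rejectedName) {
            this.rejectedName = rejectedName;
        }

        @Override
        public short startElement(Element element) {
            return rejectedName.equals(element.getNodeName()) ? FILTER_REJECT : FILTER_ACCEPT;
        }

        @Override
        public short acceptNode(Node node) {
            return FILTER_ACCEPT;
        }

        @Override
        public int getWhatToShow() {
            return NodeFilter.SHOW_ELEMENT;
        }
    }

    /** Parses xml with the filter installed and counts elements named countedName. */
    static int countAfterFiltering(String xml, String rejectedName, String countedName) {
        DOMImplementationLS impl;
        try {
            impl = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new RejectByName(rejectedName));
        LSInput input = impl.createLSInput();
        input.setStringData(xml);
        Document document = parser.parse(input);
        return document.getElementsByTagName(countedName).getLength();
    }

    public static void main(String[] args) {
        String xml = "<r><keep/><secret><x/></secret></r>";
        System.out.println(countAfterFiltering(xml, "secret", "secret"));
        System.out.println(countAfterFiltering(xml, "secret", "keep"));
    }
}
```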
* Copyright (c) 2004 World Wide Web Consortium,
*
* (Massachusetts Institute of Technology, European Research Consortium for
* Informatics and Mathematics, Keio University). All Rights Reserved. This
* work is distributed under the W3C(r) Software License [1] in the hope that
* it will be useful, but WITHOUT ANY WARRANTY; without even the implied
* warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231