SGML

API Reference

Callback interface for receiving general markup events according to the SAX 1.0 specification.

Consuming SAX events from SGML is a matter of setting up an event source and then implementing event handlers. A basic application typically implements the startElement(), endElement(), and characters() callbacks and supplies empty method bodies for the startDocument(), endDocument(), and processingInstruction() callbacks. The application then needs to register its DocumentHandler implementation with an event source, as shown in Parser.
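
As a minimal sketch of such a handler, the following object implements the three content callbacks and leaves the remaining ones empty. The replay() function is a hypothetical stand-in for an event source, included only to make the sketch runnable; it isn't part of the SGML API, and the empty string passed for attrs is an assumption about how an empty attribute map is serialized.

```javascript
// Collects an indented outline of the document from SAX events.
const outline = [];
let depth = 0;

const outlineHandler = {
  startDocument() {},                    // nothing to set up
  endDocument() {},                      // nothing to tear down
  processingInstruction(name, data) {},  // ignored in this sketch

  startElement(name, atts, attrs) {
    outline.push("  ".repeat(depth) + "<" + name + ">");
    depth++;
  },
  endElement(name) {
    depth--;
  },
  characters(text) {
    const trimmed = text.trim();
    if (trimmed) outline.push("  ".repeat(depth) + trimmed);
  }
};

// Hypothetical stand-in for an event source: fires the events a parser
// might report for <doc><p>Hello</p></doc>.
function replay(handler) {
  handler.startDocument();
  handler.startElement("doc", {}, "");
  handler.startElement("p", {}, "");
  handler.characters("Hello");
  handler.endElement("p");
  handler.endElement("doc");
  handler.endDocument();
}

replay(outlineHandler);
console.log(outline.join("\n"));
```

In a real application, the handler would be registered with an actual event source instead of being driven by replay().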

Note that the SAX 1.0 DocumentHandler interface was deliberately chosen over the SAX 2.0 ContentHandler interface for SGML, because ContentHandler was introduced specifically to forward XML namespace prefixes, which aren't used with SGML. Instead, when parsing XML using SGML, normalized namespace prefixes are forwarded as part of an element or attribute name. Likewise, the start and end of prefix mappings (prefix scopes) aren't reported via the specialized startPrefixMapping and endPrefixMapping events; instead, namespace mappings are forwarded as part of regular startElement events (with the attributes hash map containing keys beginning with xmlns:).
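
To illustrate, a handler that needs prefix mappings could pick them out of the attribute map itself. The sketch below assumes that prefixed names take the usual prefix:local form; prefixMappings() and splitName() are hypothetical helpers, not part of the SGML API, and the sample attribute values are made up. Default namespace declarations (a bare xmlns key) aren't covered by the text above and are ignored here.

```javascript
// Returns the prefix-to-URI mappings declared on one start-element tag,
// given the atts map passed to startElement().
function prefixMappings(atts) {
  const mappings = {};
  for (const [key, value] of Object.entries(atts)) {
    if (key.startsWith("xmlns:")) {
      mappings[key.slice("xmlns:".length)] = value;
    }
  }
  return mappings;
}

// Splits an element or attribute name into prefix and local part,
// assuming a single colon separates the two.
function splitName(name) {
  const i = name.indexOf(":");
  return i < 0
    ? { prefix: "", local: name }
    : { prefix: name.slice(0, i), local: name.slice(i + 1) };
}

// Sample attribute map, made up for illustration.
const sampleAtts = { "xmlns:svg": "http://www.w3.org/2000/svg", "id": "logo" };
console.log(prefixMappings(sampleAtts));
console.log(splitName("svg:rect"));
```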

Moreover, the SAX 2.0 Attributes interface isn't supported in startElement events; a plain JavaScript hash map representation, closer to the original SAX 1.0 AttributeList interface, is used instead. This, too, is a design choice, as the SAX 2.0 Attributes interface (via its Attributes2 extension) exposes whether a given attribute value is specified explicitly in content (i.e. as opposed to being implied by a DTD or other schema language). Exposing this distinction via SAX is deemed undesirable for SGML; instead, attributes delivered via SAX are expected to be treated in exactly the same way, whether they are specified or implied using SGML authoring features.

See:
http://www.saxproject.org/apidoc/org/xml/sax/DocumentHandler.html

Name Description
characters Called on character data appearing as content to the top-most element.
endDocument Called once as the last event for the document being parsed.
endElement Called when an end-element tag is specified in input or an omitted end-element tag is implied.
processingInstruction Called on a processing instruction in content.
startDocument Called once for every document being parsed, before every other call for that document.
startElement Called on a start-element tag in the content stream.

Member Details

characters(text)

Called on character data appearing as content to the top-most element.

May be invoked multiple times on consecutive lines of text.

Parameters

Name Type Description
text string the character data
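
Because characters() may be invoked several times for one run of text, a handler that needs an element's complete text has to buffer it. The following is a minimal sketch of that pattern; the event sequence at the end is written by hand for illustration, and how a real event source splits text into calls is not specified here.

```javascript
// Buffers character data per open element until the matching
// endElement() arrives.
const buffers = [];      // one text buffer per open element
const paragraphs = [];   // completed text of each <p> element

const textCollector = {
  startElement(name, atts, attrs) {
    buffers.push("");
  },
  characters(text) {
    // Append to the buffer of the innermost open element, if any.
    if (buffers.length > 0) buffers[buffers.length - 1] += text;
  },
  endElement(name) {
    const text = buffers.pop();
    if (name === "p") paragraphs.push(text);
  }
};

// Hand-written event sequence: one <p> whose text arrives in two calls.
textCollector.startElement("p", {}, "");
textCollector.characters("first line\n");
textCollector.characters("second line");
textCollector.endElement("p");
console.log(paragraphs);
```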

endDocument()

Called once as the last event for the document being parsed.

endElement(name)

Called when an end-element tag is specified in input or an omitted end-element tag is implied.

If the element being parsed has no content (is empty), this is called immediately after the corresponding startElement() call.

Parameters

Name Type Description
name string name of the ended element
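
Since endElement() is reported for explicit, implied, and empty-element end tags alike, a handler can keep a stack of open elements without knowing how each element was closed in the source. The sketch below assumes every startElement() is eventually paired with an endElement(); the element names and the hand-written event sequence are illustrative only.

```javascript
// Tracks the stack of open elements and records each element's path.
const openElements = [];
const paths = [];

const pathTracker = {
  startElement(name, atts, attrs) {
    openElements.push(name);
    paths.push("/" + openElements.join("/"));
  },
  endElement(name) {
    const top = openElements.pop();
    if (top !== name) {
      throw new Error("unbalanced events: expected " + top + ", got " + name);
    }
  }
};

// Hand-written events for a document with an empty element: note that
// endElement("br") directly follows startElement("br").
pathTracker.startElement("doc", {}, "");
pathTracker.startElement("br", {}, "");
pathTracker.endElement("br");
pathTracker.startElement("p", {}, "");
pathTracker.endElement("p");
pathTracker.endElement("doc");
console.log(paths);
```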

processingInstruction(name, data)

Called on a processing instruction in content.

Not called for processing instructions in declaration sets (DTDs, LPDs); instead, those processing instructions are received as part of a startDtd event via DtdHandler.

Parameters

Name Type Description
name string part of the processing instruction code text up to (not including) an initial whitespace character (or the complete code text of the processing instruction, if it doesn't contain whitespace)
data string part of the processing instruction code text following after initial whitespace, if any
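
The relationship between a processing instruction's code text and the two parameters can be approximated as follows. splitProcessingInstruction() is a hypothetical helper written for illustration, not the parser's own code; it assumes that the whole run of whitespace after the name is skipped and that data is an empty string when there is no whitespace, neither of which is confirmed by the description above.

```javascript
// Splits processing instruction code text into name and data, following
// the parameter descriptions above (approximation, see assumptions).
function splitProcessingInstruction(code) {
  // Group 1: leading non-whitespace; then a whitespace run; group 2: the rest.
  const match = /^(\S*)\s*([\s\S]*)$/.exec(code);
  return { name: match[1], data: match[2] };
}

console.log(splitProcessingInstruction('xml-stylesheet href="a.css"'));
console.log(splitProcessingInstruction("page-break"));
```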

startDocument()

Called once for every document being parsed, before every other call for that document.

Note that this is called exactly once, since multiple input files aren't supported.

startElement(name, atts, attrs): string|undefined

Called on a start-element tag in the content stream.

On empty-element tags, endElement will be invoked immediately after the call has returned.

Parameters

Name Type Description
name string name of the started element
atts Object.<string, string> name/value map of specified or implied attributes
attrs string string serialization of atts

Returns

string | undefined

the return value can be used to pass back custom info to the caller
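
As a sketch of how the parameters and return value might be used, the handler below reads an attribute through ordinary property access and returns a string for some elements and undefined for others. What the caller does with the returned value isn't specified here, and the attrs strings passed in the sample calls are assumptions about the serialization format.

```javascript
// Collects link targets and hands each one back to the caller.
const links = [];

const linkCollector = {
  startElement(name, atts, attrs) {
    // atts is a plain object, so "in" and property access work as usual.
    if (name === "a" && "href" in atts) {
      links.push(atts.href);
      return atts.href;   // custom info passed back to the caller
    }
    return undefined;     // nothing to report for other elements
  }
};

// Sample calls with made-up values.
const returned = linkCollector.startElement(
  "a", { href: "https://example.org/" }, 'href="https://example.org/"');
const nothing = linkCollector.startElement("p", {}, "");
console.log(returned, nothing, links);
```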