SGML

Tag Inference Examples

tag-minimization/test2

basic omission of the start- and end-tags for element a, as allowed by the O O tag omission indicators; as shown in the result, the SGML processor generates the missing tags

tag-minimization/test3

this is a (historical) example of a corner case of tag inference from "The implementation of the Amsterdam SGML parser", fig. 16, included in this selection of examples to avoid false assumptions about tag inference; note that the DTD expects the b element in the content models of both a and d; the example doesn't validate because the c element is missing from the content of a

tag-minimization/test4

variant of test3, partially fixed by inserting <c></c> after </b>; the SGML processor still won't accept it, since the c element must have a d element as child content

tag-minimization/test5

variant of test3, fixed further by placing <c> and </c> around the b element; however, the SGML processor rightfully balks because d doesn't allow end-tag omission

tag-minimization/test5a

variant of test5 where the declaration of d allows end-tag omission in addition to start-tag omission; now both <d> and </d> can be inferred

tag-minimization/test6a

variant of test5a where e takes the role of the former d, and d is a wrapper around e; shows that SGML will infer more than a single start-tag

tag-minimization/test7a

variant of test6a where an additional element f may be placed before b in e; this doesn't affect the inference of b and its container elements

tag-minimization/test10

another example from the above-mentioned paper; SGML handles this one just fine

tag-minimization/test10a

also from the above-mentioned paper; this is rejected by the SGML processor because start-tag inference isn't considered for optional elements

tag-minimization/startend-test1 tag-minimization/startend-test2 tag-minimization/startend-test3

these examples show that tag inference won't insert a start- and end-tag for a missing element if there's nothing in the file that could trigger tag inference (such as a child element tag, as in the examples above, or parsed character data, as in the examples below); note that the SP SGML processing software is actually able to parse test1 and test3, but outputs error messages while doing so

tag-minimization/omittag-pcdata1 tag-minimization/omittag-pcdata2 tag-minimization/omittag-pcdata3

examples showing that parsed character data can also trigger tag inference; moreover, these examples show the special processing of whitespace followed by non-whitespace character data in content: the handling of whitespace (including newline and tab characters) depends on whether PCDATA is accepted at the current context position without tag inference; if it is, leading whitespace is considered part of the parsed character data; otherwise, validation and model group state transitions (including actions such as closing finished elements or opening required ones) are driven only by the portion starting at the first non-whitespace character, and any leading whitespace is reported as a separate characters event and ignored for validation

tag-minimization/rank-implied-element0 tag-minimization/rank-implied-element1 tag-minimization/rank-implied-element2 tag-minimization/rank-implied-element3

examples of rank-based tag inference where Sn elements should be generated around sequences of Hn/Pn elements; also show variants where elements are declared with a numerical suffix in their names, which is not equivalent to declaring an element with a rank

tag-minimization/test2

<!DOCTYPE test [
	<!ELEMENT test - - (a, b)>
	<!ELEMENT a O O (a1,a2)>
	<!ELEMENT a1 - - (#PCDATA)>
	<!ELEMENT a2 - - (#PCDATA)>
	<!ELEMENT b - - (#PCDATA)>
]>
<test><a1>bla</a1><a2>fasel</a2><b>fasel</b></test>

Result

<!DOCTYPE test [
	<!ELEMENT test - - (a, b)>
	<!ELEMENT a O O (a1,a2)>
	<!ELEMENT a1 - - (#PCDATA)>
	<!ELEMENT a2 - - (#PCDATA)>
	<!ELEMENT b - - (#PCDATA)>
]>
<test><a><a1>bla</a1><a2>fasel</a2></a><b>fasel</b></test>

tag-minimization/test3

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O - (b?) >
]>
<a><b>element b</b></a>

Result


"test3.sgm": line 7: fatal: 'A': unexpected end of content model

tag-minimization/test4

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O - (b?) >
]>
<a><b>element b</b><c></c></a>

Result


"test4.sgm": line 7: fatal: 'C': element 'C' requires content

tag-minimization/test5

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O - (b?) >
]>
<a><c><b>element b</b></c></a>

Result


"test5.sgm": line 7: fatal: 'D': end-tag omission not allowed at <C>

tag-minimization/test5a

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (b?) >
]>
<a><c><b>element b</b></c></a>

Result

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (b?) >
]>
<a><c><d><b>element b</b></d></c></a>
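
The difference between test5 and test5a comes down to whether the open d element may be closed implicitly when </c> arrives. The following Python sketch illustrates that check under simplifying assumptions: implied_end_tags and its parameters are hypothetical names chosen for illustration, the open elements are kept in a plain list, and content model completeness is not checked at all.

```python
def implied_end_tags(open_elements, closing, end_omissible):
    """Return the elements whose end-tags must be implied before `closing`.

    open_elements: currently open elements, outermost first (assumed layout).
    end_omissible: names declared with O as their end-tag omission indicator.
    Raises ValueError if an element in between may not omit its end-tag.
    """
    if closing not in open_elements:
        raise ValueError(f"no open element {closing!r}")
    implied = []
    # Walk from the innermost open element outwards until `closing` is found.
    for name in reversed(open_elements):
        if name == closing:
            break
        if name not in end_omissible:
            raise ValueError(f"end-tag omission not allowed for {name!r}")
        implied.append(name)
    return implied


# test5a: d is declared "O O", so </d> is implied before </c>
print(implied_end_tags(["a", "c", "d"], "c", {"d"}))  # ['d']

# test5: d is declared "O -", so the same input is rejected
try:
    implied_end_tags(["a", "c", "d"], "c", set())
except ValueError as err:
    print("rejected:", err)
```

A real processor additionally has to verify that the content model of each implicitly closed element is satisfied; the sketch only covers the omission indicator.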

tag-minimization/test6a

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (e) >
	<!ELEMENT e O O (b?) >
]>
<a><c><b>element b</b></c></a>

Result

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (e) >
	<!ELEMENT e O O (b?) >
]>
<a><c><d><e><b>element b</b></e></d></c></a>
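
How a chain of start-tags such as <d><e> might be derived can be sketched in Python as follows. This is only an illustration under stated assumptions, not the processor's actual algorithm: content models are reduced to plain sequences of (child, is_optional) tokens, and implied_start_tags, models and start_omissible are hypothetical names.

```python
def implied_start_tags(parent, pos, incoming, models, start_omissible):
    """Return the elements to open so that `incoming` is accepted, or None.

    models maps an element name to a list of (child, is_optional) tokens;
    only plain sequences are modelled. `pos` is the index of the next
    token expected in `parent`.
    """
    chain = []
    while True:
        required = None
        for name, optional in models[parent][pos:]:
            if name == incoming:
                return chain        # accepted, possibly after skipping optional tokens
            if not optional:
                required = name     # the first required token stops the skipping
                break
        # Only a required element with an omissible start-tag may be implied;
        # the `in chain` test guards against recursive models.
        if required is None or required not in start_omissible or required in chain:
            return None
        chain.append(required)
        parent, pos = required, 0


# Simplified content models of test6a (b has CDATA content, hence no tokens)
TEST6A = {
    "a": [("b", True), ("c", False)],
    "b": [],
    "c": [("d", False)],
    "d": [("e", False)],
    "e": [("b", True)],
}
print(implied_start_tags("c", 0, "b", TEST6A, {"a", "c", "d", "e"}))  # ['d', 'e']

# Simplified content models of test10a: b is optional in a, so it is never implied
TEST10A = {"a": [("b", True), ("c", False)], "b": [("d", False)], "c": [], "d": []}
print(implied_start_tags("a", 0, "d", TEST10A, {"a", "b", "c", "d"}))  # None
```

The second call mirrors the rejection seen in test10a: the optional b is skipped rather than opened, and the required c cannot contain d.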

tag-minimization/test7a

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (e) >
	<!ELEMENT e O O (f?, b?) >
	<!ELEMENT f O O CDATA >
]>
<a><c><b>element b</b></c></a>

Result

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b - - CDATA >
	<!ELEMENT c O - (d) >
	<!ELEMENT d O O (e) >
	<!ELEMENT e O O (f?, b?) >
	<!ELEMENT f O O CDATA >
]>
<a><c><d><e><b>element b</b></e></d></c></a>

tag-minimization/test10

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b O - (d) >
	<!ELEMENT (c,d) O - CDATA >
]>
<c></c></a>

Result

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b O - (d) >
	<!ELEMENT (c,d) O - CDATA >
]>
<a><c></c></a>

tag-minimization/test10a

<!DOCTYPE a [
	<!ELEMENT a O - (b?, c) >
	<!ELEMENT b O - (d) >
	<!ELEMENT (c,d) O - CDATA >
]>
<d></d></a>

Result


"test10a.sgm": line 6: fatal: element 'D' not accepted here

tag-minimization/startend-test1

<!DOCTYPE test [
	<!ELEMENT test - - (a, b, c)>
	<!ELEMENT a - - (#PCDATA)>
	<!ELEMENT b O O (#PCDATA)>
	<!ELEMENT c - - (#PCDATA)>
]>
<test><a></a><c></c></test>

Result


"startend-test1.sgm": line 7: fatal: element 'C' not accepted here

tag-minimization/startend-test2

<!DOCTYPE test [
	<!ELEMENT test - - ((a, b, c)*)>
	<!ELEMENT a O O (#PCDATA)>
	<!ELEMENT b - - (#PCDATA)>
	<!ELEMENT c - - (#PCDATA)>
]>
<test><a></a><b></b><c></c><b></b><c></c></test>

Result


"startend-test2.sgm": line 7: fatal: element 'B' not accepted here

tag-minimization/startend-test3

<!DOCTYPE test [
	<!ELEMENT test - - ((a, b, c)*)>
	<!ELEMENT a - - (#PCDATA)>
	<!ELEMENT b O O (#PCDATA)>
	<!ELEMENT c - - (#PCDATA)>
]>
<test><a></a><b></b><c></c><a></a><c></c></test>

Result


"startend-test3.sgm": line 7: fatal: element 'C' not accepted here

tag-minimization/omittag-pcdata1

<!DOCTYPE test [
	<!ELEMENT test - - (title, text)>
	<!ELEMENT title O O (#PCDATA)>
	<!ELEMENT text - - (#PCDATA)>
]>
<test>
<title>Bla bla</title>
<text>fasel</text>
</test>

Result

<!DOCTYPE test [
	<!ELEMENT test - - (title, text)>
	<!ELEMENT title O O (#PCDATA)>
	<!ELEMENT text - - (#PCDATA)>
]>
<test>
<title>Bla bla</title>
<text>fasel</text>
</test>

tag-minimization/omittag-pcdata2

<!DOCTYPE test [
	<!ELEMENT test - - (title, text)>
	<!ELEMENT title O O (#PCDATA)>
	<!ELEMENT text - - (#PCDATA)>
]>
<test>Bla bla
<text>fasel</text>
</test>

Result

<!DOCTYPE test [
	<!ELEMENT test - - (title, text)>
	<!ELEMENT title O O (#PCDATA)>
	<!ELEMENT text - - (#PCDATA)>
]>
<test><title>Bla bla
</title><text>fasel</text>
</test>

tag-minimization/omittag-pcdata3

<!DOCTYPE test [
	<!ELEMENT test - - (b, c)>
	<!ELEMENT b - O (a, #PCDATA)>
	<!ELEMENT a - - (#PCDATA)>
	<!ELEMENT c - - (#PCDATA)>
]>
<!-- tests that content model ending with PCDATA (always optional) gets
     closed when inferring omitted tags -->
<test>
<b><a>bla</a><c>fasel</c>
</test>

Result

<!DOCTYPE test [
	<!ELEMENT test - - (b, c)>
	<!ELEMENT b - O (a, #PCDATA)>
	<!ELEMENT a - - (#PCDATA)>
	<!ELEMENT c - - (#PCDATA)>
]>
<test>
<b><a>bla</a></b><c>fasel</c>
</test>
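
The whitespace rule illustrated by the omittag-pcdata examples can be sketched in Python as below. The function name split_leading_whitespace and the pcdata_accepted flag are assumptions made for illustration; in a real processor the flag would come from the current content model state.

```python
WHITESPACE = " \t\r\n"


def split_leading_whitespace(text, pcdata_accepted):
    """Split `text` into (ignored_whitespace, data_for_validation).

    If PCDATA is accepted at the current position without tag inference,
    all of `text` (leading whitespace included) is character data.
    Otherwise only the part starting at the first non-whitespace character
    drives validation and may trigger tag inference.
    """
    if pcdata_accepted:
        return "", text
    data = text.lstrip(WHITESPACE)
    return text[: len(text) - len(data)], data


# omittag-pcdata1: the newline after <test> does not trigger inference of <title>
print(split_leading_whitespace("\n", False))          # ('\n', '')

# omittag-pcdata2: "Bla bla" directly after <test> triggers inference of <title>
print(split_leading_whitespace("Bla bla\n", False))   # ('', 'Bla bla\n')

# inside an element accepting PCDATA, leading whitespace is ordinary data
print(split_leading_whitespace("\n  text", True))     # ('', '\n  text')
```

An empty second component means there is nothing to validate, so no start-tag is implied for that run of whitespace.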

tag-minimization/rank-implied-element0

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S 1 O O (H1,P1,S2*)>
	<!ELEMENT S 2 O O (H2,P2)>
	<!ELEMENT (H|P) 1 - - (#PCDATA)>
	<!ELEMENT (H|P) 2 - - (#PCDATA)>
]>
<TEST>
<H1>h2.1</H1>
<P1>p2.1</P1>
<H2>h2.2</H2>
<P2>p2.2</P2>
</TEST>

Result

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S 1 O O (H1,P1,S2*)>
	<!ELEMENT S 2 O O (H2,P2)>
	<!ELEMENT (H|P) 1 - - (#PCDATA)>
	<!ELEMENT (H|P) 2 - - (#PCDATA)>
]>
<TEST>
<S1><H1>h2.1</H1>
<P1>p2.1</P1>
<S2><H2>h2.2</H2>
<P2>p2.2</P2>
</S2></S1></TEST>

tag-minimization/rank-implied-element1

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S1 O O (H1,P1,S2*)>
	<!ELEMENT S2 O O (H2,P2)>
	<!ELEMENT (H|P) 1 - - (#PCDATA)>
	<!ELEMENT (H|P) 2 - - (#PCDATA)>
]>
<TEST>
<H1>h2.1</H1>
<P1>p2.1</P1>
<H2>h2.2</H2>
<P2>p2.2</P2>
</TEST>

Result


"rank-implied-element1.sgm": line 11: fatal: element 'H2' not accepted here

tag-minimization/rank-implied-element2

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S 1 O O (H1,P1,S2*)>
	<!ELEMENT S 2 O O (H2,P2)>
	<!ELEMENT (H1|P1|H2|P2) - - (#PCDATA)>
]>
<TEST>
<H1>h2.1</H1>
<P1>p2.1</P1>
<H2>h2.2</H2>
<P2>p2.2</P2>
</TEST>

Result


"rank-implied-element2.sgm": line 10: fatal: element 'H2' not accepted here

tag-minimization/rank-implied-element3

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S 1 O O (H1,P1,S2*)>
	<!ELEMENT S 2 O O (H2,P2)>
	<!ELEMENT (H|P) 1 - - (#PCDATA)>
	<!ELEMENT (H|P) 2 - - (#PCDATA)>
]>
<TEST>
<H1>h1.1</H1>
<P1>p1.1</P1>
<H2>h2.1.1</H2>
<P2>p2.1.2</P2>
<H1>h1.2</H1>
<P1>p1.2</P1>
<H2>h2.2.2</H2>
<P2>h2.2.3</P2>
</TEST>

Result

<!DOCTYPE TEST [
	<!ELEMENT TEST - - (S1+)>
	<!ELEMENT S 1 O O (H1,P1,S2*)>
	<!ELEMENT S 2 O O (H2,P2)>
	<!ELEMENT (H|P) 1 - - (#PCDATA)>
	<!ELEMENT (H|P) 2 - - (#PCDATA)>
]>
<TEST>
<S1><H1>h1.1</H1>
<P1>p1.1</P1>
<S2><H2>h2.1.1</H2>
<P2>p2.1.2</P2>
</S2></S1><S1><H1>h1.2</H1>
<P1>p1.2</P1>
<S2><H2>h2.2.2</H2>
<P2>h2.2.3</P2>
</S2></S1></TEST>