The Full WHATWG HTML RD2001 DTD, like former versions, is a transcription into an SGML DTD of the prose of WHATWG's HTML Review Draft specification published January 29th, 2020. The Full DTD covers all elements of HTML, SVG, and MathML as well as the ARIA attributes. Its construction is described in the reference for the W3C HTML 5 DTD; this document describes only the modifications made for the current version.
The Minimal WHATWG HTML RD2001 DTD,
also like former versions, is a compact DTD containing
only essential parsing rules for HTML.
As only HTML's special rules for HTML void elements and
enumerated attributes are included (others being admitted
freely), the Minimal WHATWG HTML DTD's
usefulness for validation purposes is limited. Instead, the
purpose of the Minimal HTML DTD is to provide a
minimal bundled declaration set for content parsing and
production tasks for modern and idiomatic HTML in sgmljs.net
and other SGML software with support for resolving
declaration sets via catalog resolution (in sgmljs.net,
the Minimal HTML DTD is resolved and accessed through the
about:legacy-compat system identifier).
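As a minimal sketch of what this looks like in a document, the following instance references the Minimal DTD through the about:legacy-compat system identifier mentioned above. The document content is purely illustrative, and it is an assumption that the processing software resolves this system identifier via its catalog as described.

```sgml
<!DOCTYPE html SYSTEM "about:legacy-compat">
<html>
  <head>
    <title>Minimal DTD example</title>
  </head>
  <body>
    <!-- br is a void element: it is written without an end tag -->
    <p>First line<br>second line</p>
  </body>
</html>
```

The example deliberately writes out all other end tags, so that it relies only on the void element rule named in the text.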
This DTD is based on HTML Review Draft 20-01, which was published as a W3C recommendation on January 28, 2021. It is the first (and, so far, only) W3C recommendation based on a WHATWG HTML review draft, and the first W3C HTML recommendation since 2017.
Apart from a larger set of small changes, to be expected for the first revision in years and explained below, Review Draft 200129, accepted as a W3C HTML recommendation, is also the first W3C HTML specification published under the Memorandum of Understanding between WHATWG and W3C, which prevents W3C from directly editing specification text. As such, HTML Review Draft 200129 sees notable change in two long-standing issues, where upstream (WHATWG) HTML specification text was accepted even though it had been explicitly rejected in previous W3C versions, despite lack of material change:
main elements are allowed, reflected by the
aside content models now not forbidding
hgroup was included in W3C HTML for the first time; note that
in WHATWG HTML,
hgroup, originally introduced to hide headings
of multiple ranks from the so-called HTML 5 outlining algorithm and thus
prevent inference of undesired sections, had been deprecated
for many years, even though its content model specification
hadn't changed (which was the W3C editors' reason
for not including it);
hgroup's content model is only changed
in the upcoming Review Draft 2023
The changes are detailed in the following sections.
slot elements (complementing the
element already part of previous HTML specifications).
slot elements were added
flow content category and parameter entity;
slot were also added to the
phrasing category and
parameter entity, resp., while
hgroup was added to the
heading parameter entities.
menu element has been re-introduced with changed content
rules and semantics; it is now listed under grouping elements,
and has been removed as a legacy element. Note the
element that used to be part of the original
model is no longer used at all but remains present as a
legacy element since it admits end-element tag omission.
style element has been removed from the flow content
category, reflecting the final abandonment of the scoped CSS concept
in HTML specs.
object, and the legacy
keygen element, have
been made members of the interactive content category.
Changed content models or inclusion or exclusion constraints
, and canvas elements.
Note that heading elements as content of
legend elements were
valid before Review Draft 200129, and are valid again in current WHATWG
specifications, hence their disallowance in Review Draft 200129
can be considered an erratum. To use the declaration of e.g. Review Draft
230116 instead, you can place the following markup declarations into
the internal subset:
<!ENTITY % html.legend.element "IGNORE">
<!ELEMENT legend - - (#PCDATA|%phrasing;|%heading;)* -(main)>
rtc elements (removed from the specification but
allowing tag omission) as legacy elements.
address element now appears under sectioning, whereas it
was formerly listed under grouping content.
Added event handler attributes
Removed event handler attributes
Added global attributes
slot. Also added
nonce as global attribute where it used to be declared for specific
elements in previous DTDs.
Added body event handler attribute
Note that the
spellcheck, and the
translate global attributes can have the
empty string as value even though the HTML spec advises
against specifying the attribute in these cases in the first place.
This is not reflected in the SGML DTD.
The same is true of the Fetch API destination (
Section 4.2.4) and the CORS settings (
crossorigin) attributes (defined by
the Fetch Spec) and the referrer policy (
attribute (defined by the Referrer Policy spec). These
two specifications have no versioning (not even equivalent to a
Public Review Draft), nor other formal alignment with the HTML
specification, and also contain wildly non-normative language,
and thus, while their snapshot values at the time of publication
can be conditionally included via parameter entities, aren't
included in the HTML DTD by default.
rev attribute on the
longdesc attribute on the
typemustmatch attribute on the
hreflang attribute on the
autofocus attribute has been formally made applicable to all
HTML elements in WHATWG HTML (section 6.6.7), whereas it was defined only
in the context of form controls in previous revisions; this is
reflected by promoting
autofocus to a global attribute.
border attribute on the
charset attribute on the
usemap attribute on
object element; note the
usemap attribute is removed again in the next review draft
(see object-usemap) along with content model changes.
color attributes on the
ping attribute (as a CDATA attribute) on the
decoding attribute to the
loading attribute to the
playsinline attribute to the
rel attribute to the
nomodule attribute to the
The enumerated values for the
(section 188.8.131.52) are now represented in the DTD.
height attributes (on the
canvas elements and the
input element) to have
NUMBER declared value.
sandbox on the
allows multiple space-separated values hence has been remodelled
as having declared value
The enumerated values for the
(section 4.10.3) and the
type attribute on the
element are now represented in the DTD.
In previous DTDs, the ARIA
role attribute wasn't actually declared
(only attributes for ARIA states and properties were). This has been
fixed. Note unlike
tabindex attribute is, and has always
been, declared as part of HTML. Note this was fixed in the W3C HTML
5.2 DTD as well.
Moreover, the integration of ARIA has been changed such that
declared attribute defaults for ARIA state and property attributes
are customized to become
#IMPLIED, i.e. have no material default
value specified. This is in line with what's done with HTML attribute
defaults where applicable, and due to the expectation that an SGML
processor adds default values for attributes where those are declared,
which is however in conflict with HTML's and ARIA's expectation that
an attribute taking on its default value should be left unspecified.
While this change isn't a fix per se, it has been applied to the
previous HTML DTD (W3C HTML 5.2, but no prior versions) as well.
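The difference between a material default and #IMPLIED can be sketched with two alternative attribute list declarations. This is a hypothetical illustration only: the div element and the aria-busy attribute are picked as examples and the declarations are not copied from the DTD; the two declarations are alternatives and are not meant to appear together.

```sgml
<!-- Alternative 1 (hypothetical): a material default value is declared.
     An SGML processor reports aria-busy="false" for every div,
     even where the attribute is not specified in the document. -->
<!ATTLIST div aria-busy (true|false) false>

<!-- Alternative 2 (the approach described above): no material default.
     An unspecified attribute stays unspecified in the parser output. -->
<!ATTLIST div aria-busy (true|false) #IMPLIED>
```

With the second form, markup produced from parsed content does not gain attributes the author never wrote, which matches HTML's and ARIA's expectation that an attribute at its default value is left unspecified.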
In previous versions, exclusion exceptions for the
had been placed on
legend elements when they should only
apply to sectioning elements with explicit exclusion of
main such as
main itself doesn't exclude
descendants in its content model. Note this fix has been applied to
the previous HTML DTD (W3C HTML 5.2, but no prior versions) as well.
The HTML Review Draft specification states that
User agents that implement SVG must implement the SVG 2 specification, and not any earlier revisions.
The SVG working group at W3C hasn't published a formal specification of SVG 2 as a language in the form of a DTD or RelaxNG grammar, as was done for previous versions. Moreover, the SVG 2 specification is at candidate recommendation stage at this time, and has been since 2018. This reflects uncertainty about whether proposed recommendation or recommendation status can eventually be reached: browser vendors have voiced interest in supporting very few conservative SVG 2 additions (such as those streamlining SVG/CSS integration), but have not committed to the new SVG 2 features as a whole, and the continued existence of the SVG working group per its charter, and even of W3C as its hosting organization, isn't guaranteed.
In keeping with previous HTML 5.x DTDs, the (extremely modular) SVG 1.1 DTD is further extended for SVG 2, but only with those features that are also accepted and implemented for the SVG subset recognized by W3C's nu validator (the SVG RelaxNG grammar used internally by the nu validator is also derived from the SVG 1.1 DTD we're customizing here), up to changes made until May 25th, 2021. Specifically, the following customizations are applied:
feDropShadow as element and filter primitive
(in line with section 184.108.40.206's listing of
mapped camel-case element names for SVG; note
was defined as part of SVG Filter Effect Module Level 1, hence
as part of SVG 1.* rather than SVG 2)
additional enumerated values for the
operator attribute on
feComposite elements, the
mode attribute on
elements, and declaration of the
symbol elements (note the nu validator only adds
note the SVG
desc element remains unchanged (isn't changed
to allow any child content)
Moreover, the HTML specification makes the specific requirements that
the content model for the SVG
title element inside HTML documents is
phrasing content (this further constrains the requirements given in
SVG 2) (section 4.8.17)
svg element falls into the embedded content, phrasing content,
flow content [and palpable content] categories for the purposes of
the content models in this specification (section 4.8.17)
when the SVG
foreignObject element contains elements from the HTML
namespace, such elements must all be flow content
HTML defines the
nonce attribute applying to SVG and other foreign
elements (section 2.6.6)
which have been applied as well.
Finally, generic XML attributes in need of declaration within an SGML
id, including their no-namespace
HTML variants if applicable) are declared (see section 220.127.116.11).
Note XLink attributes are declared by the SVG 1.1 DTD (see also section 18.104.22.168).
Customization of MathML 3 DTD for embedding into HTML includes the following specific requirements:
When the MathML annotation-xml element contains elements from the HTML namespace, such elements must all be flow content (section 4.8.16)
When the MathML token elements (mi, mo, mn, ms, and mtext) are descendants of HTML elements, they may contain phrasing content elements from the HTML namespace (section 4.8.16)
Finally, like with SVG, generic XML attributes in need of declaring
no-namespace HTML variants for
xml:space are declared.