DOCTYPE Declarations

The first thing in your HTML files should be a DOCTYPE declaration.  In the past, when I used one at all, I simply copied it without really thinking about what was in it.  Today, I decided to take a closer look.  Judging by the W3C Recommended list of Doctype declarations page, simply copying a standard DOCTYPE is the right thing to do when you are writing an HTML page.  The page also states that exact spelling and case are important.

There is a wide range of standard DOCTYPE declarations available.  For HTML 4.01, there are three: Strict, Transitional, and Frameset.

If you leave out your DOCTYPE declaration, browsers do not assume Strict; they typically render the page in quirks mode, which emulates the behavior of older browsers.  The HTML 4.01 Strict Document Type disallows deprecated elements and attributes.  It “…excludes the presentation attributes and elements the World Wide Web Consortium (W3C) expects to phase out as support for style sheets matures.”  It might be a good idea to stick to Strict elements and attributes if you would like to eventually transition to HTML 5, which is as yet not a standard.  You can determine which elements are deprecated by checking the Index of Elements page.  Likewise, deprecated attributes can be identified on the Index of Attributes page.  Look for a D in the Depr. column.  The Strict DOCTYPE declaration is as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
     "http://www.w3.org/TR/html4/strict.dtd">

If you need to accommodate deprecated elements and attributes, you can use the HTML 4.01 Transitional Document Type.  This one allows you to use deprecated elements and attributes but does not allow frames.  The Transitional DOCTYPE declaration is as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
     "http://www.w3.org/TR/html4/loose.dtd">

If you are determined to use frames, then you can use the Frameset declaration as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
     "http://www.w3.org/TR/html4/frameset.dtd">

I noticed that frames and deprecated elements and attributes worked on my test pages even when I declared the pages as Strict.  That is because browsers do not validate a page against its DTD; they use the DOCTYPE mainly to choose a rendering mode.  If you really want to know whether your pages conform to your chosen type, you can validate your HTML files using the W3C Markup Validation Service.  One note about using this service: in addition to requiring the DOCTYPE declaration, it will also complain if you fail to declare your character set.  You can do that by including the following in your HTML file:

<HEAD>
<META http-equiv="content-type"
     content="text/html;charset=utf-8">
</HEAD>
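
Before sending a page to the validator, it can be handy to check locally that both pieces are present.  The following is a minimal Python sketch using the standard library's html.parser module.  The class name DoctypeSniffer and the sample page are made up for illustration, and the check only reports what the page declares; it does not validate anything against a DTD.

```python
from html.parser import HTMLParser


class DoctypeSniffer(HTMLParser):
    """Records the DOCTYPE and the declared charset found in a page."""

    def __init__(self):
        super().__init__()
        self.doctype = None  # text between "<!" and ">", whitespace collapsed
        self.charset = None  # charset named in a META content-type, if any

    def handle_decl(self, decl):
        # Called for "<!...>" declarations; keep the first DOCTYPE seen.
        if self.doctype is None and decl.upper().startswith("DOCTYPE"):
            # Collapse the line break and indentation into single spaces.
            self.doctype = " ".join(decl.split())

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names.
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "content-type":
            return
        # content looks like "text/html;charset=utf-8"
        for part in (attrs.get("content") or "").split(";"):
            name, _, value = part.strip().partition("=")
            if name.strip().lower() == "charset":
                self.charset = value.strip()


# Hypothetical test page combining the Strict DOCTYPE and the META tag above.
page = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
     "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<META http-equiv="content-type"
     content="text/html;charset=utf-8">
<TITLE>Test page</TITLE>
</HEAD>
<BODY><P>Hello</P></BODY>
</HTML>"""

sniffer = DoctypeSniffer()
sniffer.feed(page)
sniffer.close()
print(sniffer.doctype)
print(sniffer.charset)
```

If either attribute is still None after feeding the page, the validator is likely to complain about the missing piece.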

Now let’s tease out what the various parts of the DOCTYPE declaration really mean.  The general syntax for DOCTYPE is

<!DOCTYPE root-element PUBLIC "FPI" ["URI"] [
     <!-- internal subset declarations -->
]>

or

<!DOCTYPE root-element SYSTEM "URI" [
     <!-- internal subset declarations -->
]>

If you use HTML as the root-element and don’t declare anything else, you get the HTML 5 DOCTYPE, <!DOCTYPE HTML>, which names no DTD at all; HTML 5 is as yet not a standard.  Some other standard root-elements include math (MathML) and svg (SVG).  When DOCTYPE declarations are used in XML files, you can declare a custom root-element.  Like <HTML>, this custom root-element will surround the contents of the document.  For instance, you may have the following:

<?xml version="1.0" standalone="no" ?>
<!DOCTYPE memo SYSTEM "memo.dtd">
<memo>
</memo>

PUBLIC or SYSTEM declares how the definition of the document type is located.  PUBLIC means the definition is a publicly known object, named by an FPI that may be followed by a URI.  The standard HTML DOCTYPE declarations are all PUBLIC; their definitions reside where anyone can get to them.  SYSTEM means the definition is a system resource, such as a local file or URL, identified by a URI alone.  Neither keyword is a default: if you give neither, the declaration points to no external definition at all.
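
To see the difference between the two keywords from a parser's point of view, here is a small Python sketch using xml.dom.minidom from the standard library.  The helper name describe_doctype is made up for illustration.  The second sample is only a tiny well-formed stub carrying the HTML 4.01 Strict DOCTYPE; real HTML 4.01 pages are generally not well-formed XML and would not parse this way.  As far as I know, minidom does not fetch the external DTD by default, so neither memo.dtd nor the W3C URL needs to be reachable.

```python
from xml.dom import minidom

# The memo example from above: SYSTEM, so there is a URI but no FPI.
memo_xml = """<?xml version="1.0" standalone="no" ?>
<!DOCTYPE memo SYSTEM "memo.dtd">
<memo>
</memo>"""

# A well-formed stub using a PUBLIC declaration: FPI followed by a URI.
public_xml = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
     "http://www.w3.org/TR/html4/strict.dtd">
<HTML></HTML>"""


def describe_doctype(source):
    """Return (root-element, public identifier, system identifier)."""
    doctype = minidom.parseString(source).doctype
    return doctype.name, doctype.publicId, doctype.systemId


print(describe_doctype(memo_xml))
print(describe_doctype(public_xml))
```

The public identifier comes back as None for the SYSTEM declaration, while the PUBLIC declaration yields both the FPI and the URI.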

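The FPI (Formal Public Identifier) used with a PUBLIC declaration is a string which identifies the owner and the particular document identified by the FPI.  Its fields are separated by double slashes.  In the case of "-//W3C//DTD HTML 4.01//EN", the owner is "-//W3C", where the leading "-" marks an owner that is not formally registered.  The document is "DTD HTML 4.01//EN", which breaks down into the class of the text (DTD), its description (HTML 4.01), and its language (EN).  FPI is not used with SYSTEM.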
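
The pieces of an FPI can be pulled apart with a few lines of Python.  This is only a sketch: the names FPI and parse_fpi are made up for illustration, and it assumes the four-field "registration//owner//class description//language" shape used by the W3C identifiers in this post, so other FPI forms are rejected.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    registration: str  # "-" for an unregistered owner, "+" for a registered one
    owner: str         # e.g. "W3C"
    text_class: str    # e.g. "DTD"
    description: str   # e.g. "HTML 4.01"
    language: str      # e.g. "EN"


def parse_fpi(fpi: str) -> FPI:
    """Split a four-field FPI into its parts."""
    parts = fpi.split("//")
    if len(parts) != 4:
        raise ValueError(f"expected four '//'-separated fields, got: {fpi!r}")
    registration, owner, class_and_description, language = parts
    # The first word of the third field is the class; the rest describes it.
    text_class, _, description = class_and_description.partition(" ")
    return FPI(registration, owner, text_class, description, language)


print(parse_fpi("-//W3C//DTD HTML 4.01//EN"))
print(parse_fpi("-//W3C//DTD HTML 4.01 Transitional//EN"))
```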

The URI identifies where a Document Type Definition (DTD) file defining the document type is located.  Describing the contents of this file is beyond the scope of this post.  Information about DTDs can be found in the Wikipedia article, Document Type Definition.

DOCTYPE declarations also allow DTD statements to be placed directly into the declaration.  The following are examples of this:

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE foo [
     <!ELEMENT foo (#PCDATA)>
]>
<foo>Hello World.</foo>

and

<?xml version="1.0" standalone="no" ?>
<!DOCTYPE document SYSTEM "subjects.dtd" [
     <!ATTLIST assessment assessment_type (exam | assignment | prac) #IMPLIED>
     <!ELEMENT results (#PCDATA)>
]>
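
Declarations placed inside the square brackets are known as the internal subset, and a parser can hand them back to you.  Here is a minimal Python sketch using xml.dom.minidom on the first example above.  The parser used here is non-validating, so it reads the declarations but does not check the document against them.

```python
from xml.dom import minidom

# The first internal-subset example from above, unchanged.
source = """<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE foo [
     <!ELEMENT foo (#PCDATA)>
]>
<foo>Hello World.</foo>"""

doc = minidom.parseString(source)
doctype = doc.doctype

print(doctype.name)                        # the declared root-element
print(doctype.internalSubset.strip())      # the text between "[" and "]"
print(doc.documentElement.firstChild.data) # the content of <foo>
```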

References:
http://www.w3.org/QA/2002/04/valid-dtd-list.html
http://www.w3.org/TR/html401/struct/global.html#idx-document_type_declaration-3
http://www.w3.org/TR/html401/index/elements.html
http://www.w3.org/TR/html401/index/attributes.html
http://validator.w3.org/
http://en.wikipedia.org/wiki/Doctype
http://xmlwriter.net/xml_guide/doctype_declaration.shtml
http://msdn.microsoft.com/en-us/library/ie/ms535242(v=vs.85).aspx
http://en.wikipedia.org/wiki/Document_Type_Definition
