What doctype should I use for XHTML documents? The short answer is that I use the following doctype in all my web pages:
I also recommend using an HTML and CSS validator that ignores the doctype in my HTML documents and lets me choose what I want to validate. That way, I can use this doctype to trigger the correct browser rendering while including non-standard elements and attributes in my markup so I can take advantage of browser innovations in both HTML markup and CSS properties.
Here is the long answer...
This doctype specifies the type of document as "html" with a version of "XHTML 1.0 Transitional". It also specifies the location of the Document Type Definition (DTD) file that should be used for validating the document. In this doctype the location is the W3C website at "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd".
Doctype is related to the MIME content type, but has a completely different purpose. The MIME content type
alone defines the type of document. It is set by the web server using the HTTP protocol when a document is downloaded. In spite of its name, a doctype has nothing to do with the identifying the type of a document.
A browser uses the MIME content type to determine what "driver" it will use to parse and display the document. In other words, if you use a MIME content type of "text/html", a browser will load its HTML parser and rendering engine. If you use a MIME content type of "application/xhtml+xml", it will load its XML parser and rendering engine. In other words, a browser will not even try to parse and render a document until it knows what type of document it is, and that information comes
exclusively from the MIME type.
So what is a doctype and what is it used for? A doctype is an element embedded inside an the head of an HTML or XHTML document. The doctype specifies the version of an HTML document and what Document Type Document (DTD) should be used to validate the document.
The doctype comes from the SGML roots of the HTML and XHTML specifications. SGML is the language used to define the HTML, XML, and XHTML languages. SGML is a powerful and complex language for defining markup languages.
SGML parsers look for a doctype element inside a document so they can load a DTD file to validate and parse the document. When an SGML parser encounters a doctype, it can use the DTD to determine the rules it should use for parsing and validating the document.
In the doctype I listed above, an SGML parser would go to
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd and retrieve the DTD file and use it to parse and validate the document. You can specify any DTD file, and it can be located anywhere as long as it can be downloaded. The DTD can be on your local hard drive, on your own website, on someone else's website, etc.
Validator programs typically read the doctype, retrieve the DTD and use the DTD to validate your document. The W3C has a free validator at
http://validator.w3.org/ It reads the doctype of your document and validates it based on that doctype.
You can even download the DTD files supplied by the W3C and modify them to create your own rules for validating HTML and XHTML documents. In your doctypes, you can specify your own custom file as the DTD that should be used for validation. This is a perfectly appropriate technique. This is the original purpose of the doctype. If you are a programmer, DTDs are not hard to understand. Any good book on XML will show you how to create a DTD. All you need is a validator program, such as XML spy. You can specify any DTD in your doctype and XML spy will retrieve it from the location specified in the doctype and it will validate your document using it.
For example, there are many HTML and XHTML elements and attributes that the W3C has deprecated. Most of these are deprecated for good reasons and I don't recommend using them. But a few elements and attributes are fully supported by browsers and should not have been deprecated or were never included in the W3C specifications for political reasons. There is nothing wrong with including these elements and attributes in your documents. By using custom DTDs, you can use these elements and attributes in your documents and still validate your documents. Of course, be sure to test non-standard elements and attributes in all browsers to be sure they work the way you want.
In other words, a valid document is valid when it validates against a DTD � any DTD specified by the doctype. It doesn't have to validate against the W3C's DTDs. This is a very important point that the W3C doesn't advertise because they want you to use their standard because they want to be the only standard!
Don't get me wrong. I like the W3C and I like standards, but the W3C is not the only standard. HTML and XML are based on DTDs that can be modified beyond the W3C standard. The Internet succeeded because of the right balance of basic, open standards, and the freedom for anyone to extend these standards.
In fact the Internet is not as exciting as it used to be because the W3C is not in the business of innovating. Its business is standardizing, which by its nature opposes change. Browser vendors did the innovating and the W3C followed behind trying to set a standard upon which they all could agree � the process was political and was full of compromises. Now that Microsoft has conquered the browser market and killed Netscape, innovation is stagnant. Mozilla is not a standard setter, but a standard follower. Once we realize that standards are only a starting point, we can start genuinely innovating again.
What makes things complicated with the doctype is that web browsers are not SGML parsers. They don't follow SGML rules. They don't look up the DTD file at the location specified in the doctype. They don't use the DTD to validate documents. Instead, they use their own rules for parsing HTML -- and each browser vendor has different rules. Until CSS entered the picture, browsers completely ignored the doctype!
Unfortunately, during the early days of the Internet browser vendors rushed out new versions faster than the W3C could define specifications. Different browsers implemented early draft versions of CSS specifications differently. Once CSS became standardized, browser vendors needed a way to know which rules to use when rendering a document: the old HTML rules, the new CSS rules, or rules that were backward compatible to some proprietary vendor standard.
The browser vendors decided to use the doctype for this purpose. This is called doctype sniffing. This was a mistake. It is a kludge of the worst kind, but we are stuck with it.
There are dozens of different doctype signatures and all the different browser vendors use different signatures to trigger different CSS behaviors. You can read about these signatures at
http://en.wikipedia.org/wiki/Comparison_of_layout_engines_(HTML) After years of research and experimentation, I have found only one doctype that reliably triggers compatible behavior among all the browsers. It is the doctype I listed previously. All the design patterns in my book use this doctype.
If you want to use HTML instead of XHTML (which I don't recommend), you can use either of the following HTML doctypes:
Further, if you plan on using elements and attributes that are not supported by XHTML 1.0 Transitional, you can use an HTML validator that ignores the doctype in your documents and lets you choose what you want to validate. This way, you can choose the doctype that triggers the right CSS behavior in browsers and allows you to include non-standard elements and attributes in your markup. This will free you to take advantage of browser innovations in both HTML markup and CSS properties.
Lastly, XML has an untapped power for document markup: it allows you to use your own custom elements and attributes to extend the semantic and structural meaning of your markup. The real power of CSS is that it allows you to style your custom markup as you please. This approach doesn't yet work consistently in all web browsers, but I can't wait until it does!
Chapter 2 of my book, Pro CSS and HTML Design Patterns, on pages 38-40 also discusses doctypes in a simpler and more practical manner. You can also see an example at
http://www.cssdesignpatterns.com/Chapter%2002%20-%20HTML/DOCTYPE/example.html [ October 13, 2007: Message edited by: Mike Bowers ]