In an interview I was asked a question that I'd never thought about, which was "We already have HTML which fulfills all the requirements of writing a web page, so what's
In addition to Johannes' answer, HTML is far too loose in its interpretation and tolerance, whereas XHTML's strict formalisation removes that looseness.
Tolerance leads to variance, which leads to browser incompatibilities, which leads to the dark side.
I am sure you must have encountered this article from W3. There is a lot to learn from that article. In short, XHTML abides by the XML rules while keeping the HTML set of tags. The most important differences:
* XHTML elements must be properly nested
* XHTML elements must always be closed
* XHTML elements must be in lowercase
* XHTML documents must have one root element
XHTML is an attempt to encourage the development of "well-formed" HTML.
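To make the rules above concrete, here is a minimal sketch that uses Python's standard XML parser as a stand-in for an XHTML well-formedness check. The helper name is_well_formed is made up for this illustration. Note that a generic XML parser only enforces nesting, closing, and the single root; the lowercase rule comes from the XHTML definition itself, so the parser only catches a case mismatch between an opening and a closing tag.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for bad nesting, unclosed elements, multiple roots, etc.
        return False


# Properly nested and closed: accepted.
print(is_well_formed("<p><b>bold</b></p>"))
# Improperly nested: rejected.
print(is_well_formed("<p><b>bold</p></b>"))
# Unclosed <br>: rejected, while the self-closed form <br/> is fine.
print(is_well_formed("<p>line<br></p>"))
print(is_well_formed("<p>line<br/></p>"))
# Two top-level elements, so no single root: rejected.
print(is_well_formed("<p>one</p><p>two</p>"))
# XML is case-sensitive, so <P> is not closed by </p>: rejected.
print(is_well_formed("<P>text</p>"))
```

An HTML parser would accept every one of these inputs and silently repair them, which is exactly the tolerance the rules are meant to remove.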
HTML has evolved over more than 10 years. Its implementation, and the implementation of the browsers that parse and render it, are not exactly consistent. This is why cross-browser compatibility is a major headache.
HTML is based on SGML (Standard Generalized Markup Language). XML is also derived from SGML, so they are cousins of a sort. XHTML marries the two, providing (in theory) the benefits of XML to HTML. This includes a well-defined schema that can be reliably validated, queried, and transformed.
XHTML forces you to write cleaner code, which is easier to maintain, renders more consistently, and is easier to hook into the DOM. Comparing XHTML to HTML is like comparing a strongly typed programming language to a loosely typed one.
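As a rough illustration of the "queried" part, the sketch below runs an ordinary XML query over an XHTML page using only Python's standard library. The sample document and the helper name extract_links are invented for this example; the namespace URI is the standard XHTML one.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace; the "x" prefix is an arbitrary local alias.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# A made-up sample page, used only for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Links</title></head>
<body><p><a href="/a">A</a> and <a href="/b">B</a></p></body>
</html>"""


def extract_links(xhtml: str) -> list:
    """Collect the href of every <a> element, in document order."""
    root = ET.fromstring(xhtml)
    return [a.get("href") for a in root.iterfind(".//x:a", XHTML_NS)]


print(extract_links(SAMPLE))
```

Because the input is guaranteed to be well-formed XML, no HTML-specific error recovery is needed before querying it.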
As mentioned above, XHTML allows you to play with SVG and MathML. I'd like to add RDFa to that list. RDFa allows you to add semantics to your content that is not covered by microformats. I've personally been doing a lot with Dublin Core and Friend-of-a-Friend.
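The mixing of vocabularies relies on XML namespaces: each embedded language keeps its own namespace, so a consumer can tell the pieces apart. Below is a small sketch, assuming a made-up page with an inline SVG circle; the helper name namespaces_used is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

# A made-up XHTML page that embeds an SVG fragment.
MIXED = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>A circle:</p>
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">
<circle cx="10" cy="10" r="5"/>
</svg>
</body>
</html>"""


def namespaces_used(markup: str) -> set:
    """Return the set of namespace URIs that elements in the document belong to."""
    found = set()
    for element in ET.fromstring(markup).iter():
        # ElementTree spells qualified names as "{namespace-uri}localname".
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found


print(namespaces_used(MIXED))
```

The same mechanism is what lets MathML or RDFa vocabularies such as Dublin Core sit inside one document without name clashes.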
XML is a data interchange format. That makes it a good fit for building websites: after all, we are dealing with information, and that information needs to be crawled and understood by computers (such as search engines).
I see a bunch of up-voted answers here that are making incorrect assumptions about how browsers work. So let me give my 2 cents on the matter.
First of all, why does XHTML exist?
From the horse's mouth:
a two-day workshop was organised to discuss whether a new version of HTML in XML was needed. The opinion at the workshop was a clear 'Yes': with an XML-based HTML other XML languages could include bits of XHTML, and XHTML documents could include bits of other markup languages. We could also take advantage of the redesign to clean up some of the more untidy parts of HTML, and add some new needed functionality, like better forms.
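In short, XHTML was created for two reasons: to let HTML interoperate with other XML languages, and to clean up the untidy parts of HTML while adding needed functionality such as better forms.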
Making things easier to validate was not a design goal, and also not something that was necessary because HTML4 validators exist and are comprehensive.
Is XHTML easier to parse for browsers?
Yes and no. XML is easier to parse than HTML tag soup, but unless you serve your XHTML page with an application/xhtml+xml or application/xml MIME type, browsers parse it using the HTML parsing engine. However, if you do use an XML MIME type, IE (through version 8) chokes on your content. This behavior is explained on the IE blog. There is no difference in how browsers treat XHTML and HTML if you are serving it with a MIME type of text/html!
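The dispatch on MIME type can be sketched as follows. This is only an analogy built from Python's standard library, not how any browser engine is actually implemented: html.parser stands in for the forgiving HTML parser, xml.etree.ElementTree for the strict XML parser, and the names parse_like_a_browser, TagCollector, and XML_TYPES are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# MIME types that (in this sketch) select the strict XML parser.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


class TagCollector(HTMLParser):
    """Forgiving parser: records start tags and never rejects the input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_like_a_browser(body: str, content_type: str) -> list:
    """Pick a parser from the Content-Type header, ignoring the markup itself."""
    mime = content_type.split(";")[0].strip().lower()
    if mime in XML_TYPES:
        # Strict path: raises ET.ParseError on any well-formedness error.
        return [element.tag for element in ET.fromstring(body).iter()]
    # Tolerant path: the same bytes are treated as tag soup.
    collector = TagCollector()
    collector.feed(body)
    collector.close()
    return collector.tags


soup = "<p>one<br><b>two"
print(parse_like_a_browser(soup, "text/html; charset=utf-8"))
try:
    parse_like_a_browser(soup, "application/xhtml+xml")
except ET.ParseError as error:
    print("XML parser refused the page:", error)
```

The point of the sketch is that the choice is made from the header alone: identical markup gets the tolerant treatment under text/html and a hard failure under an XML MIME type.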
Yes they do! You lie!
Indeed they do, but only because of the doctype. Browsers use doctypes at the top of HTML documents to determine whether they should use standards mode or quirks mode (= bugs mode). All valid XHTML documents happen to include a doctype that triggers standards mode. However, in HTML you can get the same result by including "<!doctype html>" at the top of your page.
So are you saying XHTML has no purpose?
Not at all. XHTML has many advantages:
So, I should use it then?
As always, the answer is "it depends".
What about HTML5? Does it compete with XHTML?
No, it doesn't. HTML5 has two serializations, one as HTML and one as XML. The benefit is that both now have strict parsing rules, so you will get predictable behavior in all browsers regardless of the approach you use. However, HTML5 parsed as HTML has the benefit of graceful error handling. That's why I prefer that approach. As always, YMMV.