HTML doesn't have a rigorous set of encoding rules the way XML does. For example, in HTML, it is perfectly legal and quite common to code a <p> tag by itself with no corresponding closing tag.
XHTML is a form of HTML that is constrained to follow the rules of XML, and thus can be easily parsed by an XML parser. A <p> by itself would be invalid - you'd have to code <p></p> or <p/>.
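If you want to see the difference for yourself, here's a minimal sketch using Python's standard-library XML parser. It only checks well-formedness, not validity against the XHTML schema, and it is not how Facelets itself does things. The helper name is_well_formed and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a plain XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for things like an unclosed or mismatched tag
        return False


# HTML-style: <p> is opened but never closed
lone_p = "<body><p>Hello</body>"
# XHTML-style: every element is explicitly closed
closed_p = "<body><p>Hello</p></body>"
# XHTML-style: empty element using the self-closing form
empty_p = "<body><p/></body>"

for sample in (lone_p, closed_p, empty_p):
    print(sample, "->", is_well_formed(sample))
```

A browser's HTML parser would happily accept all three, which is exactly the looseness an XML-based tool can't deal with.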
Because XHTML can be processed by an XML parser, the Facelets framework can use an off-the-shelf XML processor to do its work instead of having to puzzle out the ambiguous structures that are common to HTML.
I think that, for your specific question, the advantage of the XHTML pages is that they can be assembled into display panes using the Facelets templating (tiling) facility.