Handling entity references by XML parsers

Dan Drillich
Ranch Hand

Joined: Jul 09, 2001
Posts: 1180
Good Day,

My beloved book "XML in a Nutshell" from O'Reilly says (on page 18) that XML defines five entity references -

&lt; - the less-than sign
&amp; - the ampersand
&gt; - the greater-than sign
&quot; - the straight, double quotation marks
&apos; - the apostrophe, or single quote

It says that entity references such as &amp; and &lt; are considered markup, and that when an application parses an XML document, the parser replaces this markup with the actual characters the entity references refer to. It also says that, in addition to these five predefined entity references, you can define others in the document type definition (DTD).
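
To make the replacement step concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample document and the element name "note" are made up for illustration; the point is only that the application receives the characters, not the markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document using all five predefined entity references.
doc = "<note>&lt; &amp; &gt; &quot; &apos;</note>"

root = ET.fromstring(doc)

# The parser has already replaced each reference with the character it names.
print(root.text)  # < & > " '
```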

So my question is - does it mean that all other entity references in the XML document are left intact by the parsers?

Regards,
Dan

William Butler Yeats: All life is a preparation for something that probably will never happen. Unless you make it happen.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18675
No. If a parser encounters a reference to an undeclared entity, it will throw an exception rather than pass the reference through.
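
A small sketch of that behaviour, assuming Python's standard xml.etree.ElementTree (which is built on expat) rather than any particular Java parser. The document and the variable names are hypothetical; a Java parser would surface the same problem as an exception of its own type.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: &eacute; is an HTML entity name that XML does not predefine,
# and nothing in this document declares it.
doc = "<word>caf&eacute;</word>"

try:
    ET.fromstring(doc)
    outcome = "parsed"
except ET.ParseError as err:
    # The parser stops at the reference instead of leaving it intact.
    outcome = f"rejected: {err}"

print(outcome)
```

The reference is neither expanded nor kept as literal text; parsing simply fails.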
Dan Drillich
Ranch Hand

Joined: Jul 09, 2001
Posts: 1180
Thank you Paul.

Right, but what about all the "standard" HTML escape sequences, such as é - &eacute;, ö - &ouml;, ò - &ograve;, ñ - &ntilde;, etc.?

Regards,
Dan
Dan Drillich
Ranch Hand

Joined: Jul 09, 2001
Posts: 1180
Paul,

I guess you are absolutely right! I put one of these entities in an otherwise valid XML file and tried to open it with Firefox and IE. Neither would open it. Firefox even said -

XML Parsing Error: undefined entity


Regards,
Dan
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18675
Yup. HTML is not an XML dialect. (Although XHTML is... you will notice that an XHTML document contains a DTD reference at the top, and that DTD is where entities such as &eacute; are declared.)
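
For anyone who needs one of these names in a plain XML document, a sketch of two options follows, again assuming Python's standard xml.etree.ElementTree. The first document declares the entity itself in an internal DTD subset; the second avoids named entities and uses a numeric character reference, which needs no declaration. Both documents are made-up examples.

```python
import xml.etree.ElementTree as ET

# Option 1 (hypothetical document): declare eacute in an internal DTD subset.
declared = """<!DOCTYPE word [
  <!ENTITY eacute "&#233;">
]>
<word>caf&eacute;</word>"""

# Option 2 (hypothetical document): use the numeric character reference directly.
numeric = "<word>caf&#233;</word>"

declared_text = ET.fromstring(declared).text
numeric_text = ET.fromstring(numeric).text

print(declared_text)  # café
print(numeric_text)   # café
```

The internal subset is used here because, as far as I know, ElementTree does not fetch external DTDs, so a reference to the XHTML DTD alone would not be enough for this particular parser.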