Regarding special characters

Nits Kulkarni
Greenhorn

Joined: Mar 23, 2006
Posts: 8
Hi All,
sample.xml
--------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="#!-- #TEMPLATES:/os_files/document.xsl --#"?>
<body>
<head type="title">INTRODUCTORY NOTE</head>
<p>Bacon’s literary executor, Dr. Rowley, published "The New Atlantis" in 1627, the year after the author’s death.
</p>
</body>

I am using DocumentBuilder to parse the XML file above. I need to read the special characters "’s" as they are, but when I read the attribute value these characters are truncated. Can you please tell me how to read everything under the <p></p> tag as it is?

Thanks in advance
Nits
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18986

Attribute value? I don't see any instances of "'s" in an attribute there. There's only one, and it's in a text node, not in an attribute.

My guess is that you are using SAX to parse this XML, and you are incorrectly assuming that the characters() method returns the text node all in one piece. But the parser is allowed to break the text node into more than one piece and call the characters() method once for each piece. Is that a correct guess?
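If SAX is in use, the usual way to cope with characters() being called several times for one text node is to append each piece to a buffer and only read the buffer in endElement(). Below is a minimal sketch of that idea, not a definitive implementation. The class name ParagraphTextHandler and the helper parseParagraph are made up for illustration, and it assumes the text of interest sits directly inside a <p> element with no namespaces involved.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of the most recent <p> element.
class ParagraphTextHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean inParagraph = false;
    private String lastParagraph = null;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("p".equals(qName)) {
            inParagraph = true;
            buffer.setLength(0); // start a fresh buffer for this paragraph
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks,
        // so append each chunk instead of overwriting the previous one.
        if (inParagraph) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(qName)) {
            inParagraph = false;
            lastParagraph = buffer.toString(); // the text is only complete here
        }
    }

    String getLastParagraph() {
        return lastParagraph;
    }

    // Convenience helper (an assumption for this sketch): parse an XML string
    // and return the text of its last <p> element, or null if there is none.
    static String parseParagraph(String xml) {
        try {
            ParagraphTextHandler handler = new ParagraphTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getLastParagraph();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Calling ParagraphTextHandler.parseParagraph(xml) on the sample document should then give the whole paragraph in one string, however the parser happened to split it.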
Nits Kulkarni
Greenhorn

Joined: Mar 23, 2006
Posts: 8
Hello,

Sorry, the special characters I mentioned here have been truncated by JavaRanch.
I am using a DOM parser, and anything I get in between the <p></p> tags should be read without resolving any special characters.
Let me try to give you the special character here, it is: "’"
Hope this is not truncated again.

Regards
Nitin
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18986

Originally posted by Nits Kulkarni:
And anything I get in between the <p></p> tags should be read without resolving any special characters.
Let me try to give you the special character here, it is: "’"
Hope this is not truncated again.
Yes, I know which character you were talking about. An XML attribute is where you put name="value" inside the start tag of an element. It's misleading when you start talking about text nodes as "attributes".

Now, what do you mean by "resolve" there? Does your XML document contain the seven characters "&#8217;" and you don't want the parser to interpret that as a curly quote character? It would help if you could provide a clear question because this is the second guess I have made at your problem.
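To make the question concrete, here is a rough sketch of what a DOM parser does with a character reference inside <p>. It assumes the document really does contain "&#8217;", which is only a guess at this point. The class and method names (ParagraphReader, readFirstParagraph, escapeNonAscii) are invented for the example. The parser always resolves the reference, so getTextContent() returns the curly quote itself; if the reference form is what you need, it has to be rebuilt from the parsed text afterwards.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for this sketch.
class ParagraphReader {

    // Returns the full text under the first <p> element, with character
    // references such as &#8217; already resolved by the parser.
    static String readFirstParagraph(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Node p = doc.getElementsByTagName("p").item(0);
            if (p == null) {
                throw new IllegalArgumentException("No <p> element found");
            }
            return p.getTextContent();
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Rewrites every non-ASCII character as a decimal character reference.
    // It does not escape & or <, so it only illustrates the reference form
    // and is not a general XML serializer.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```

With the sample paragraph, readFirstParagraph(xml) would return the text containing the character U+2019, and escapeNonAscii(...) on that result would turn it back into "&#8217;". If the file contains the raw curly quote rather than the reference, the thing to check instead is that the encoding named in the XML declaration matches the bytes actually in the file, since a parser assumes UTF-8 when no encoding is declared.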
 