Hi All,

sample.xml
--------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="#!-- #TEMPLATES:/os_files/document.xsl --#"?>
<body>
<head type="title">INTRODUCTORY NOTE</head>
<p>Bacon’s literary executor, Dr. Rowley, published "The New Atlantis" in 1627, the year after the author’s death. </p>
</body>
I am using DocumentBuilder to parse the XML file above. I need to read the special characters "’s" as they are, but when I read the attribute value these characters are truncated. Can you please tell me how to read everything under the <p> </p> tag as it is?
Attribute value? I don't see any instances of "'s" in an attribute there. There's only one, and it's in a text node, not in an attribute.
My guess is that you are using SAX to parse this XML, and you are incorrectly assuming that the characters() method returns the text node all in one piece. But the parser is allowed to break the text node into more than one piece and call the characters() method once for each piece. Is that a correct guess?
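If that guess is right, the usual fix is to collect the pieces in a buffer and only read the text when the element ends. Here is a minimal sketch of that idea. The class name ParagraphTextHandler and the helper readParagraph are made up for illustration, and it assumes the text of interest sits in a <p> element as in the sample above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates every characters() chunk seen inside <p>.
final class ParagraphTextHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean insideParagraph = false;
    private String paragraphText;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("p".equals(qName)) {
            insideParagraph = true;
            buffer.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append, never assign.
        if (insideParagraph) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(qName)) {
            insideParagraph = false;
            paragraphText = buffer.toString(); // the text is complete only here
        }
    }

    String getParagraphText() {
        return paragraphText;
    }

    // Hypothetical convenience method: parses an XML string, returns the text of the last <p>.
    static String readParagraph(String xml) {
        try {
            ParagraphTextHandler handler = new ParagraphTextHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.getParagraphText();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

The important part is that characters() only appends. Reading the buffer anywhere before endElement() risks seeing just the first piece of the text.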
Sorry, the special characters I mentioned here have been truncated by JavaRanch. I am using a DOM parser, and anything I get between the <p></p> tags should be read without resolving any special characters. Let me try to give you the special character here; it is: "’". I hope this is not truncated again.
Originally posted by Nits Kulkarni: And anything I get between the <p></p> tags should be read without resolving any special characters. Let me try to give you the special character here; it is: "’". I hope this is not truncated again.
Yes, I know which character you were talking about. An XML attribute is where you put name="value" inside the start tag of an element. It's misleading when you start talking about text nodes as "attributes".
Now, what do you mean by "resolve" there? Does your XML document contain the seven characters "’" and you don't want the parser to interpret that as a curly quote character? It would help if you could provide a clear question because this is the second guess I have made at your problem.
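To make the question concrete, here is a small sketch of what a DOM parser does with such text. It assumes the seven characters are the numeric character reference &#8217; (that is a guess, since the forum software swallowed the original), and the class name ParagraphReader and method firstParagraphText are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

// Hypothetical helper: returns the text content of the first <p> element.
final class ParagraphReader {
    static String firstParagraphText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node p = doc.getElementsByTagName("p").item(0);
            // getTextContent() returns the text with character references already resolved.
            return p == null ? null : p.getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input: the reference &#8217; comes back as the single character U+2019.
        String resolved = firstParagraphText("<body><p>Bacon&#8217;s executor</p></body>");
        System.out.println(resolved.indexOf('\u2019')); // index of the curly quote

        // With the ampersand escaped in the source, the seven characters survive as text.
        String literal = firstParagraphText("<body><p>Bacon&amp;#8217;s executor</p></body>");
        System.out.println(literal);
    }
}
```

A conforming parser always resolves a character reference in a text node, so the DOM never holds the seven characters unless the ampersand was escaped as &amp; in the file. If the reference form is needed in the output, it has to be produced again when the document is written out.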
I’ve looked at a lot of different solutions, and in my humble opinion Aspose is the way to go. Here’s the link: http://aspose.com