
characters method of ContentHandler

 
Poonam Kadu
Ranch Hand
Posts: 49
Hi everybody,
How does the characters method of ContentHandler work?
I am parsing a simple XML document using a SAX parser:

<?xml version='1.0' encoding='utf-8'?>
<bookstore>
<book>
<title> THE GREATNESS GUIDE </title>
<author> Robin Sharma</author>
</book>
<book>
<title>ONE NIGHT AT THE CALL CENTER </title>
<author>Shyam Bhagwat </author>
</book>
</bookstore>



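import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MySAXHandler2 extends DefaultHandler
{
    // Called for every opening tag; prints the tag name.
    public void startElement(String uri, String localName, String qName, Attributes attr) throws SAXException
    {
        System.out.println("<" + qName + ">");
    }

    // Called for every closing tag; prints the tag name.
    public void endElement(String uri, String localName, String qName) throws SAXException
    {
        System.out.println("</" + qName + ">");
    }

    // Called for character data; this version prints the whole array, ignoring start and length.
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        System.out.println(" *" + new String(ch)); // new String(ch, start, length)); line *****
    }
}

The output for this is: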
---------- interpreter ----------
<bookstore>
*
<bookstore>
<book>
<title> THE GREATNESS GUIDE </title>
<author> Robin Sharma</author>
</book>
<book>
<title>ONE NIGHT AT THE CALL CENTER </title>
<author>Shyam Bhagwat </author>
</book>
</bookstore>
Output completed (0 sec consumed)

Now if I simply replace new String(ch) in line ***** with new String(ch, start, length), I get the proper output:

---------- interpreter ----------
<bookstore>
*

<book>
*

<title>
* THE GREATNESS GUIDE
</title>
*

<author>
* Robin Sharma
</author>
*

</book>
*

<book>
*

<title>
*ONE NIGHT AT THE CALL CENTER
</title>
*

<author>
*Shyam Bhagwat
</author>
*

</book>
*

</bookstore>

Output completed (0 sec consumed) - Normal Termination

Why is that?

Cheers,
Poonam.
 
Ulf Dittmer
Rancher
Posts: 42967
I'm assuming you're asking about why you're getting those empty lines. The answer is that white space is significant in XML, and line breaks (which you have in your XML file) constitute white space.
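
To illustrate the white space point, here is a minimal sketch (not from the original post) of one common way to deal with it: buffer the text inside characters(), then trim it and skip empty results in endElement(). The class name TextPerElementDemo and the helper collectText are hypothetical names chosen for this example; only the standard JAXP/SAX classes are real API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports "element=text" for elements with non-blank text.
class TextPerElementDemo {

    public static List<String> collectText(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attr) {
                buffer.setLength(0); // discard white space seen before this tag
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Line breaks between tags arrive here too, as ordinary character data.
                buffer.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String text = buffer.toString().trim();
                if (!text.isEmpty()) {
                    result.add(qName + "=" + text);
                }
                buffer.setLength(0); // discard text once it has been reported
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<bookstore>\n<book>\n<title> THE GREATNESS GUIDE </title>\n"
                + "<author> Robin Sharma</author>\n</book>\n</bookstore>";
        for (String line : collectText(xml)) {
            System.out.println(line);
        }
    }
}
```

Trimming is an assumption about what you want; if leading or trailing spaces inside an element matter to you, keep the buffered text as it is and only skip the white-space-only chunks.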
 
Poonam Kadu
Ranch Hand
Posts: 49
Hi Ulf,
Actually, the difference between the two outputs is this: in the first output, once the <bookstore> element is encountered, the startElement method is invoked. After this, the characters method is called, which prints the entire XML document, and the output stops there without showing any further method calls.

In the second output, by contrast, all the methods (startElement, endElement, characters) are called in sequence.
The only change between the two runs is the characters method. In the first version it looked like this:

public void characters(char[] ch, int start, int length) throws SAXException
{
    System.out.println(new String(ch));
}

and in the second version it looks like this:

public void characters(char[] ch, int start, int length) throws SAXException
{
    System.out.println(new String(ch, start, length));
}

How is ch initialized, and what is the significance of start and length?

Cheers,
Poonam.
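
On the question of ch, start and length: the SAX ContentHandler documentation says the parser hands characters() a char array and that the application must not read outside the range given by start and length. The array is supplied by the parser, and with the parser bundled in the JDK it is typically a larger internal read buffer, which would explain why new String(ch) printed far more than the current piece of text. Only the length characters beginning at index start belong to the current call. A parser may also deliver one run of text in several calls, so the sketch below joins the pieces. CharactersDemo and collectChunks are hypothetical names used only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: collects the valid slice of every characters() call.
class CharactersDemo {

    public static List<String> collectChunks(String xml) throws Exception {
        List<String> chunks = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Copy only ch[start] .. ch[start + length - 1]; the rest of the
                // array belongs to the parser and may hold unrelated data.
                chunks.add(new String(ch, start, length));
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book><title>THE GREATNESS GUIDE</title></book>";
        // Join the chunks, since the text may arrive in more than one call.
        System.out.println(String.join("", collectChunks(xml)));
    }
}
```

Copying the slice right away matters because the parser is free to reuse the array after characters() returns.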


 