character method of ContentHandler

 
Ranch Hand
Posts: 49
Hi everybody,
How does the characters method of ContentHandler work?
I am parsing a simple XML document using the SAX parser:

<?xml version='1.0' encoding='utf-8'?>
<bookstore>
<book>
<title> THE GREATNESS GUIDE </title>
<author> Robin Sharma</author>
</book>
<book>
<title>ONE NIGHT AT THE CALL CENTER </title>
<author>Shyam Bhagwat </author>
</book>
</bookstore>



import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MySAXHandler2 extends DefaultHandler
{
    // Print the opening tag of each element.
    public void startElement(String uri, String localName, String qName, Attributes attr) throws SAXException
    {
        System.out.println("<" + qName + ">");
    }

    // Print the closing tag of each element.
    public void endElement(String uri, String localName, String qName) throws SAXException
    {
        System.out.println("</" + qName + ">");
    }

    // Print the character data, prefixed with " *".
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        System.out.println(" *" + new String(ch)); // new String(ch, start, length)); line *****
    }
}

The output for this is:
---------- interpreter ----------
<bookstore>
*
<bookstore>
<book>
<title> THE GREATNESS GUIDE </title>
<author> Robin Sharma</author>
</book>
<book>
<title>ONE NIGHT AT THE CALL CENTER </title>
<author>Shyam Bhagwat </author>
</book>
</bookstore>
Output completed (0 sec consumed)

Now, if I simply replace new String(ch) in line ***** with new String(ch, start, length), I get the proper output:

---------- interpreter ----------
<bookstore>
*

<book>
*

<title>
* THE GREATNESS GUIDE
</title>
*

<author>
* Robin Sharma
</author>
*

</book>
*

<book>
*

<title>
*ONE NIGHT AT THE CALL CENTER
</title>
*

<author>
*Shyam Bhagwat
</author>
*

</book>
*

</bookstore>

Output completed (0 sec consumed) - Normal Termination

Why is that?

Cheers,
Poonam.
 
Rancher
Posts: 43045
I'm assuming you're asking about why you're getting those empty lines. The answer is that white space is significant in XML, and line breaks (which you have in your XML file) constitute white space.
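
To illustrate the white space point, here is a minimal sketch of a handler that ignores text consisting only of white space, such as the line breaks between elements. The class name WhitespaceDemo, the TextCollector handler, and the collectText helper are made up for this example and are not part of the original code; it also assumes each short piece of text arrives in a single characters call, which SAX does not guarantee.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceDemo {

    // Hypothetical handler: records only text that is not pure white space.
    static class TextCollector extends DefaultHandler {
        final List<String> texts = new ArrayList<>();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Copy only the slice the parser handed over for this call.
            String text = new String(ch, start, length);
            // Line breaks between elements show up here as white-space-only text.
            if (!text.trim().isEmpty()) {
                texts.add(text.trim());
            }
        }
    }

    // Parses the given XML string and returns the non-blank text chunks.
    static List<String> collectText(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.texts;
    }

    public static void main(String[] args) {
        String xml = "<bookstore>\n<book>\n<title> THE GREATNESS GUIDE </title>\n"
                + "<author> Robin Sharma</author>\n</book>\n</bookstore>";
        System.out.println(collectText(xml));
    }
}
```

Whether trimming is appropriate depends on the document: in mixed content, leading and trailing spaces can be meaningful, so treat this as one possible policy rather than the rule.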
 
Poonam Kadu
Ranch Hand
Posts: 49
Hi Ulf,
Actually, the difference between the two outputs is this: in the first output, once the <bookstore> element is encountered, startElement is invoked; then the characters method is called, which prints the entire XML document, and execution stops without any further methods being called.

In the second output, by contrast, all the methods (startElement, endElement, characters) are called in sequence.
The only change is that in the first version the characters method looked like this:

public void characters(char[] ch, int start, int length) throws SAXException
{
    System.out.println(new String(ch));
}

and in the second version it looks like this:

public void characters(char[] ch, int start, int length) throws SAXException
{
    System.out.println(new String(ch, start, length));
}

How is ch initialized, and what is the significance of start and length?

Cheers,
Poonam.
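
On the ch, start, and length question: according to the SAX ContentHandler documentation, ch is an array supplied by the parser, and the application must not read from it outside the range given by start and length. The documentation also allows a parser to deliver contiguous text in several chunks. With many parsers ch is an internal read buffer that holds far more than the current text, which would explain why new String(ch) printed the whole document. The sketch below prints the three values for each call and appends only the valid slice to a StringBuilder. The names CharactersSliceDemo, TitleCollector, and titleText are invented for this example, and the buffer size printed will depend on the parser in use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CharactersSliceDemo {

    // Hypothetical handler: collects the text found inside <title> elements.
    static class TitleCollector extends DefaultHandler {
        final StringBuilder titleText = new StringBuilder();
        private boolean inTitle;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attr) {
            inTitle = "title".equals(qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            inTitle = false;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // ch is owned by the parser; only ch[start] .. ch[start + length - 1] is valid here.
            System.out.println("ch.length=" + ch.length + " start=" + start + " length=" + length);
            if (inTitle) {
                // Appending (rather than assigning) copes with text split across several calls.
                titleText.append(ch, start, length);
            }
        }
    }

    // Parses the given XML string and returns the accumulated <title> text.
    static String titleText(String xml) {
        TitleCollector handler = new TitleCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.titleText.toString();
    }

    public static void main(String[] args) {
        String xml = "<bookstore>\n<book>\n<title> THE GREATNESS GUIDE </title>\n</book>\n</bookstore>";
        System.out.println("[" + titleText(xml) + "]");
    }
}
```

Copying the slice inside the callback matters because the parser is free to reuse or overwrite the array once characters returns.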

