This is the FAQ page for the XML and Related Technologies forum. Contributions are welcome. Also see XmlLinks.


Q: The characters() method in my SAX parser doesn't return all the text (or is called more than once). What gives?

Here's what the javadocs of that method say: "SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks". William Brogden explains:

The characters() method may be called any number of times within a single element because the SAX parser only handles one bufferload of input characters at a time. It is up to the programmer to assemble the text properly.

I normally have a StringBuffer or StringBuilder reference that gets a new instance when the appropriate startElement() is hit and gets additions from each call to the characters() method. When endElement() occurs I use toString() to get the assembled characters and then work on the logic.
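The approach described above might look roughly like the following. This is only a minimal sketch: the class name TitleCollector, the element name "title" and the sample XML are made up for illustration, and it assumes the default (non-namespace-aware) JAXP parser, so element names arrive in qName.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text of every <title> element.
public class TitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder text; // non-null only while inside a <title> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            text = new StringBuilder(); // fresh buffer for each element of interest
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append rather than assign.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString()); // the text is complete only at this point
            text = null;
        }
    }

    // Parses the given XML string and returns the text of all <title> elements.
    public static List<String> collect(String xml) throws Exception {
        TitleCollector handler = new TitleCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><title>SAX &amp; DOM</title><title>StAX</title></books>";
        System.out.println(collect(xml));
    }
}
```

Text containing an entity reference such as &amp; is a typical case where a parser may deliver the content in more than one characters() call; because the handler appends every chunk, the assembled string is the same however the parser splits it.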

JavaDoc:org.xml.sax.ContentHandler


Java Code Examples


  • HowToValidateXmlAgainstSchema
  • HowToValidateXmlAgainstAnySchema (or DTD or Relax-NG)
  • HowToPrettyPrintXmlWithXsl
  • HowToPrettyPrintXmlWithJava
  • DocumentToFile
  • DocumentToString
  • DocumentToByteArray
  • StringToDocument
  • ByteArrayToDocument
  • GetElementValueByNameUsingDom
  • GetNodeValue




  • Articles and introductions

    General


  • Introduction to XML
  • A Technical Introduction to XML


  • Specifically about Java


  • JAXP is the Java standard for XML processing; it is part of the JRE.
  • Introduction to XML processing in Java
  • Introduction to DOM and SAX Parsing
  • Introduction to XML and XML processing in Java
  • Using JAXP to process XML
  • Load, Save and Filter XML Documents Using the DOM Level 3 API
  • Unofficial JAXP FAQ
  • JAXP trail in the Oracle Java Tutorial
  • An Introduction to StAX (Streaming API for XML) (new in JAXP 1.4 and Java 6)
  • XQJ - a standard API for XQuery processing in Java



  • Software

  • XML Hammer "is a free and open-source tool that simplifies elementary XML actions like checking for well-formedness, validation, transformation and XPath searches using any JAXP implementation".
  • Xerces is a powerful XML parser that is now part of the JRE.
  • Crimson is a (now obsolete) XML parser that supports DOM, SAX and JAXP 1.1. It was used in the JRE before the switch to Xerces, and is a useful example for studying the inner workings of an XML parser.
  • dom4j, JDOM and XOM are alternative Java APIs for tree-based XML processing, comparable to DOM.
  • Xalan and Saxon are XSLT processors.
  • Apache FOP is an XSL-FO processor that can output numerous formats, including PDF, PS, PCL, AFP, Print, AWT and PNG, and to a lesser extent, RTF and TXT.
  • Apache Santuario implements XML Signature and XML Encryption.
  • JAXB is a Java <--> XML binding library.
  • Apache Commons Digester is an XML --> Java mapping library.
  • NekoHTML, HtmlCleaner and TagSoup are libraries that clean up HTML and transform it to XML (thus allowing DOM and SAX to work with them).
  • A list of open source XML Diff and Patch tools



  • Certifications

    The IBM XML exams 141 and 142 were retired on 12/31/2012. Online certifications are available at http://www.brainbench.com/ and http://www.xmlmaster.org/en/.

    These exam questions may help you gauge your XML knowledge, even if the associated exam is no longer available:

  • XML Design questions (by Ajith Kallambella)
  • Core XML (by Mapraputa Is)
  • DTD (by Sanjay Mishra and Dan Chisham)
  • DOM/SAX (by Kris VidhyaSagar)
  • XML 141 mock exam (by Shashank Tanksali)
  • IBM's XML Architecture prep guide


