How to Implement an XMLReader

Clifton Craig

Ranch Hand

Posts: 103

posted 17 years ago

Hello all,

I'm getting a little more savvy with XML and the Java TrAX API and now I'm implementing an XMLReader that presents objects as XML so that I can drop it into the TrAX API and start running transforms on our business objects. I need some advice on what to implement or extend. I want to do something like this:

My problem is that I don't want to implement a bunch of stream-reading methods in either my XMLReader or my CustomInputSource. I don't even want to implement a custom InputSource if I don't have to. However, it seems like the TrAX API is geared more towards processing streams and readers. How do I cleanly introduce an arbitrary object as the source of the conversion? Any ideas?

Holla at me... http://codeforfun.wordpress.com

William Brogden

Author and all-around good cowpoke

Posts: 13078

posted 17 years ago

I really don't see the problem with creating a custom extension of org.xml.sax.InputSource. Seems to me that all you really have to fiddle with are the getByteStream and getCharacterStream implementations.
Another alternative would be to create a custom java.io.Reader - I actually did this for a reader that aggregated XML fragments from a variety of sources. You could have a Reader that turned a collection of business objects into a character stream.
What form are your business objects in now?
Bill
[ June 06, 2006: Message edited by: William Brogden ]
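
The Reader route suggested above could be sketched roughly as follows. This is only an illustration: the Customer class and its fields are hypothetical stand-ins (the thread never shows the real business objects), and an identity transform stands in for a real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ReaderApproachSketch {

    // Hypothetical business object, invented for this example.
    static final class Customer {
        final String id;
        final String name;
        Customer(String id, String name) { this.id = id; this.name = name; }
    }

    // Flatten the objects into XML text, escaping the characters XML reserves.
    static String toXml(List<Customer> customers) {
        StringBuilder sb = new StringBuilder("<customers>");
        for (Customer c : customers) {
            sb.append("<customer id=\"").append(escape(c.id)).append("\">")
              .append(escape(c.name))
              .append("</customer>");
        }
        return sb.append("</customers>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Hand the text to TrAX through an ordinary Reader. The no-argument
    // newTransformer() is an identity transform; a stylesheet would go there.
    static String transform(List<Customer> customers) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(toXml(customers))),
                           new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(Arrays.asList(new Customer("42", "Ada & Co"))));
    }
}
```

The cost of this route is that the whole object graph is serialized to text and then parsed again, and that cost is why generating SAX events directly is attractive.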

Clifton Craig

Ranch Hand

Posts: 103

posted 17 years ago

Thanx William,

That's food for thought. My objects are in-memory POJOs right now. I want to pass them into my converter and have it generate SAX events that drive a transformation process. Initially I'll just connect the default transform handler and have it serialize to XML, which I'll validate in my unit test. But after I validate the conversion I want to be able to transform the XML with XSLT. I know I could probably do this more easily with a DOM source, but I have more experience with SAX and I'm trying to keep memory use down, though I don't think this scenario will use a lot of memory at first. I'll consider implementing either an InputSource or a Reader. I like your idea of implementing a custom Reader, but I'm not sure I want to serialize or flatten my object graph for the conversion. I'm really just looking to traverse the graph and fire SAX events as I go along. I'm trying to stay away from stream processing as much as possible.
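
A minimal sketch of that first step, under some assumptions: the Customer class is hypothetical, and the "default transform handler" is taken to mean the identity TransformerHandler returned by SAXTransformerFactory.newTransformerHandler() with no arguments. The traversal fires SAX events straight into the handler, which serializes them to a StreamResult that a unit test can inspect.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

public class PojoToSaxSketch {

    // Hypothetical business object, invented for this example.
    static final class Customer {
        final String id;
        final String name;
        Customer(String id, String name) { this.id = id; this.name = name; }
    }

    // Walk the object graph and fire one SAX event per structural step.
    static void emit(List<Customer> customers, ContentHandler h) throws SAXException {
        h.startDocument();
        h.startElement("", "customers", "customers", new AttributesImpl());
        for (Customer c : customers) {
            AttributesImpl atts = new AttributesImpl();
            atts.addAttribute("", "id", "id", "CDATA", c.id);
            h.startElement("", "customer", "customer", atts);
            char[] text = c.name.toCharArray();
            h.characters(text, 0, text.length);
            h.endElement("", "customer", "customer");
        }
        h.endElement("", "customers", "customers");
        h.endDocument();
    }

    // Identity handler: whatever events arrive are serialized unchanged.
    static String serialize(List<Customer> customers) throws Exception {
        SAXTransformerFactory stf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = stf.newTransformerHandler();
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));
        emit(customers, handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(Arrays.asList(new Customer("42", "Ada & Co"))));
    }
}
```

Because emit() only depends on ContentHandler, the same traversal can later feed a stylesheet-backed handler instead of the identity one.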

Paul Clapham

Marshal

Posts: 28226

posted 17 years ago

Don't know if you've seen this:

http://www.cafeconleche.org/books/xmljava/chapters/ch08s05.html

I don't think you have to implement InputSource; you just need an object that can pass the relevant SAX events into a TransformerHandler (which is how I did it) or an XMLReader (which is how ERH did it). But then there's SAXSource, which pairs an InputSource with an XMLReader and presumably has the transformer pull its SAX events from that XMLReader. This is pretty convoluted stuff. I always have to go back and look at chapter 8 of the ERH book until my brain gets washed properly.
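
One way the SAXSource variant might look, as a sketch under stated assumptions rather than anything posted in the thread. Customer, ObjectInputSource and CustomerReader are hypothetical names. The reader extends org.xml.sax.helpers.XMLFilterImpl purely to inherit the XMLReader plumbing, so only parse(InputSource) has to be written, and the InputSource subclass carries the objects without overriding any stream methods.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class ObjectReaderSketch {

    // Hypothetical business object, invented for this example.
    static final class Customer {
        final String id;
        final String name;
        Customer(String id, String name) { this.id = id; this.name = name; }
    }

    // An InputSource that carries objects instead of a stream.
    static final class ObjectInputSource extends InputSource {
        final List<Customer> customers;
        ObjectInputSource(List<Customer> customers) { this.customers = customers; }
    }

    // An XMLReader that "parses" the objects by firing SAX events.
    static final class CustomerReader extends XMLFilterImpl {
        @Override
        public void parse(InputSource input) throws SAXException, IOException {
            if (!(input instanceof ObjectInputSource)) {
                throw new SAXException("CustomerReader needs an ObjectInputSource");
            }
            // The inherited ContentHandler methods forward to whatever
            // handler the transformer registered via setContentHandler().
            startDocument();
            startElement("", "customers", "customers", new AttributesImpl());
            for (Customer c : ((ObjectInputSource) input).customers) {
                AttributesImpl atts = new AttributesImpl();
                atts.addAttribute("", "id", "id", "CDATA", c.id);
                startElement("", "customer", "customer", atts);
                char[] text = c.name.toCharArray();
                characters(text, 0, text.length);
                endElement("", "customer", "customer");
            }
            endElement("", "customers", "customers");
            endDocument();
        }
    }

    // Identity transform here; a stylesheet-backed Transformer takes the same Source.
    static String transform(List<Customer> customers) throws Exception {
        SAXSource source = new SAXSource(new CustomerReader(),
                                         new ObjectInputSource(customers));
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(Arrays.asList(new Customer("42", "Ada & Co"))));
    }
}
```

XMLFilterImpl routes parse(String) through parse(InputSource), so in this sketch a call with a system ID ends up in the same override and is rejected with a SAXException.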

William Brogden

Author and all-around good cowpoke

Posts: 13078

posted 17 years ago

I agree with Paul that this is pretty brain-bending stuff, but it seems to me that turning an object into a series of SAX events is conceptually exactly the same logic as turning it into an XML text fragment. The SAX event route would be faster since it avoids lots of String conversions; for instance, your characters() events could just point to existing Strings.

I think if this was my problem I would write the object->XML text stream Reader approach first to clarify my thinking on the XML representation of the object. If performance was unsatisfactory, I would have the skeleton for turning it into object->SAXevents.
Bill

Clifton Craig

Ranch Hand

Posts: 103

posted 17 years ago

Thanx guys,

That article is exactly what I had in mind. My confusion comes from not being sure how the XML reader/filter will be used. I want to be able to easily throw it into existing code that deals with normal XML transforms, so from that angle I guess I need to represent an object as the InputSource of the transform. Also, since the existing code already makes use of XMLFilters, I'm thinking I need to represent my object-to-XML converter as a filter. If I can throw in my object-to-XML converter as a filter and the object to convert as an input source, then I need to make few if any changes to the existing code. In the article, however, the filter expects only the parse(String) method to be called with a SQL query. I wasn't sure if I could get away with that because I don't know which parse() method will be called, or exactly how to rope it in so that only a certain method will be invoked. I know I sound confusing right now.

Here's my whole problem. When I have an XSLT and I want to apply it to the results of the object conversion, how do I set up the filters? I'm thinking it will be something like this:

The converter is parent to the stylesheet. I think that makes sense. But here, I need an InputSource to pass to the stylesheet for parsing. I don't know what the XMLFilter created for the XSLT will do with the InputSource, so I'm lost as to which methods I need to re-implement/override on InputSource. I'm guessing it will just be handed to the parent converter, which will get it in its parse method, but I'm not sure. I'm also not sure what methods the filter chain will invoke on the parent converter. Can I get away with implementing just the one parse method that takes the InputSource? The above snip is how my existing code will work, only it's set to have the filters and sources injected rather than using hard references. I need to know what position to inject my converter (as the parent or child of the XSLT), how to present an object as the InputSource for the transform, and what will be expected in any interfaces/subclasses I implement. I hope that's a little more clear.
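
A hedged sketch of one such chain, not the poster's actual code: NameConverter, the inline stylesheet and the element names are all invented for illustration. The converter sits upstream as the parent of the XMLFilter that SAXTransformerFactory.newXMLFilter(Source) builds from the stylesheet. By the XMLFilter contract, the filter's parse(InputSource) is expected to delegate to its parent's parse(InputSource), so that is the one method the converter overrides. Here the converter receives its objects through its constructor and ignores the InputSource, which is a different design choice from carrying them inside a custom InputSource.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class FilterChainSketch {

    // Illustrative stylesheet: turns <customers><customer>..</customer></customers>
    // into <names><name>..</name></names>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/customers'><names>"
      + "<xsl:for-each select='customer'><name><xsl:value-of select='.'/></name></xsl:for-each>"
      + "</names></xsl:template></xsl:stylesheet>";

    // Hypothetical converter: the head of the chain, producing events from objects.
    static final class NameConverter extends XMLFilterImpl {
        private final List<String> names;
        NameConverter(List<String> names) { this.names = names; }

        @Override
        public void parse(InputSource ignored) throws SAXException, IOException {
            AttributesImpl none = new AttributesImpl();
            startDocument();
            startElement("", "customers", "customers", none);
            for (String name : names) {
                startElement("", "customer", "customer", none);
                characters(name.toCharArray(), 0, name.length());
                endElement("", "customer", "customer");
            }
            endElement("", "customers", "customers");
            endDocument();
        }
    }

    static String run(List<String> names) throws Exception {
        SAXTransformerFactory stf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        XMLFilter xslt = stf.newXMLFilter(new StreamSource(new StringReader(STYLESHEET)));

        // Set the parent first, then the content handler: the JDK's built-in
        // filter wires itself to whatever parent exists when the handler is set.
        xslt.setParent(new NameConverter(names));

        TransformerHandler serializer = stf.newTransformerHandler(); // identity
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));
        xslt.setContentHandler(serializer);

        xslt.parse(new InputSource()); // placeholder; the converter ignores it
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(Arrays.asList("Ada", "Bob")));
    }
}
```

More stylesheets can be chained the same way, with each new filter taking the previous one as its parent and the converter staying at the head.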

Paul Clapham

Marshal

Posts: 28226

posted 17 years ago

My problem, a couple of years ago, was also to output SAX events to be immediately transformed by an XSL transformation. (It transformed to HTML which was sent to the browser.) So this is what I came up with (local details and exception handling removed):

Now that I look at it again, it still looks inside out. Why do I even need that XMLReader anyway? But it works.

Note: when I say that "I came up with" that code, I have to say it was based heavily on somebody else's examples. I think it was examples that came with early versions of Saxon but I can't really remember.

Clifton Craig

Ranch Hand

Posts: 103

posted 17 years ago

Originally posted by Paul Clapham:
    SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
    TransformerHandler handler = factory.newTransformerHandler(filename);
    SAXParserFactory spf = SAXParserFactory.newInstance();
    XMLReader reader = spf.newSAXParser().getXMLReader();
    reader.setContentHandler(handler);
    reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
    reader.setFeature("http://xml.org/sax/features/namespaces", true);
    reader.setFeature("http://xml.org/sax/features/namespace-prefixes", false);
    handler.startDocument();
    // and so on

Paul,

From your example it looks to me like you don't need an XMLReader. I can't tell for sure, but it looks removable. Your code calls the handler directly after setting up the reader, and unless it talks to the reader after invoking SAX events on the handler, the reader adds nothing but distraction to the code. In fact, whatever object includes this code should implement XMLReader (or XMLFilter) directly so that it can be dropped into a transform chain via the TrAX API. That's just my take from what you've supplied.
