
Converting XML and Builder pattern

Alan Sandhurst
Greenhorn

Joined: Apr 30, 2004
Posts: 21
Would be interested in anyone's ideas on how to solve this design conundrum. (BTW, XSLT is not an option in my case.)

I have the following scenario: I need to parse an XML file that represents a course syllabus. I've pasted the XML below. My requirement is to parse each element in the list (syllabus, course, etc.) and render it into a different XML format, one whose element names and structure both differ; for example, some values will be set as attributes of elements.

For each element, I convert to the new structure, and then persist this. Using the retrieved ID for the newly inserted item, I must track parent child relationships, i.e. a course can't be inserted without reference to an existing syllabus that it belongs to.

My initial approach, to promote code re-use, was to have an abstract superclass, call it XMLParser. It has a processNode method that takes each node and extracts the data from it and then generates the new format of XML. For each element type I instantiate a particular parser, e.g. SyllabusParser, CourseParser, etc. They rely on the superclass for common functionality, and flesh out their specific implementations where required.

However, this approach is getting quite complex, and it's very difficult to unit test.

I'm considering moving to a different approach, using composition instead of inheritance. But my brain is fried and I can't step back from the code to get a clear view. Some notions I have about how I'd proceed:

For each element type in the tree, instantiate an implementation of an interface that knows how to process it.

Use a type of builder or assembler to build the XML fragments for each element type.

Hope that's enough to give an idea of what I'm trying to achieve. Patterns I think are relevant are Factory, Builder, Strategy.
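A rough sketch of the first notion, one handler per element type looked up by name. The ElementHandler and HandlerRegistry names, the "title" attribute, and the "topic:"/"unit:" output strings are all made up for illustration; the real conversion would produce the target XML instead of a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical registry: maps an element name to the handler that converts it.
final class HandlerRegistry {
    private final Map<String, ElementHandler> handlers = new HashMap<>();

    void register(String elementName, ElementHandler handler) {
        handlers.put(elementName, handler);
    }

    // Walks the tree depth-first; every element with a registered handler is converted.
    void walk(Element element, List<String> out) {
        ElementHandler handler = handlers.get(element.getTagName());
        if (handler != null) {
            out.add(handler.convert(element));
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                walk((Element) child, out);
            }
        }
    }

    // Parses an XML string with DOM and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid XML", e);
        }
    }

    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.register("syllabus", e -> "topic:" + e.getAttribute("title"));
        registry.register("course", e -> "unit:" + e.getAttribute("title"));
        List<String> out = new ArrayList<>();
        registry.walk(parse("<syllabus title=\"Maths\"><course title=\"Algebra\"/></syllabus>"), out);
        System.out.println(out); // prints [topic:Maths, unit:Algebra]
    }
}

// One implementation per element type; each can be unit tested on its own.
interface ElementHandler {
    String convert(Element source);
}
```

Because each handler only sees one Element, it can be tested without the rest of the hierarchy, which is the part that was hard with the abstract superclass.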

XML format:

Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
Below are the technical requirements.

(1) Process an XML document that contains lists of courses and related syllabi and create a new XML document.

(2) A course cannot be inserted into the new document unless there is an existing referenced syllabus.

(3-N) ...

{ Here you need to specify the conversion requirements in regards to the new XML format. In other words what data will go where and into what element or attribute. }

...

This is it. Everything else that you have mentioned is design, and you are swimming in all types of terms and ideas. The requirements are not that difficult. The main requirement is that you cannot include courses without a syllabus. So, all you really need to do is read through the XML file and create a series of Collections containing data objects. With this in place, your code for producing the new XML document should be fairly simple.


1. Read existing XML document

2. Create data objects

3. Write new XML document (using requirements (3-N) above as a guide)

Note that a good understanding of the JAXP API and one of the SAX, StAX, or DOM APIs is required.
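As a minimal sketch of steps 1 and 2 using DOM: the source XML is not shown in this thread, so the "syllabus" and "course" element names, the "title" attribute, and the Syllabus/Course/SyllabusReader classes below are assumptions, not the real format.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class SyllabusReader {
    // Reads the XML with DOM and builds plain data objects (steps 1 and 2).
    static List<Syllabus> read(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<Syllabus> result = new ArrayList<>();
        NodeList syllabi = doc.getElementsByTagName("syllabus");
        for (int i = 0; i < syllabi.getLength(); i++) {
            Element s = (Element) syllabi.item(i);
            Syllabus syllabus = new Syllabus(s.getAttribute("title"));
            // Courses are only collected from inside a syllabus element,
            // so a course never exists without its parent.
            NodeList courses = s.getElementsByTagName("course");
            for (int j = 0; j < courses.getLength(); j++) {
                Element c = (Element) courses.item(j);
                syllabus.courses.add(new Course(c.getAttribute("title")));
            }
            result.add(syllabus);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<syllabi><syllabus title=\"Maths\">"
                + "<course title=\"Algebra\"/><course title=\"Calculus\"/>"
                + "</syllabus></syllabi>";
        for (Syllabus s : read(xml)) {
            System.out.println(s.title + " has " + s.courses.size() + " courses");
        }
    }
}

// Hypothetical data objects; the fields are assumptions.
final class Syllabus {
    final String title;
    final List<Course> courses = new ArrayList<>();
    Syllabus(String title) { this.title = title; }
}

final class Course {
    final String title;
    Course(String title) { this.title = title; }
}
```

Step 3 then works only from the List of Syllabus objects and never touches the source document.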

Good luck!
Alan Sandhurst
Greenhorn

Joined: Apr 30, 2004
Posts: 21
Thanks for the helpful feedback. Hmm..I think you're right. I've been so immersed in this code that I can't see the simple requirements I have.

I still have one issue, this is why I was considering a Builder approach. When I convert from one XML format to another, a lot of my element types are broadly similar, so there will be a basic template I can use for these. But for certain elements, the translation will be similar but different. I'm trying to find an elegant way to achieve this while avoiding duplication.

My current approach is using a helper class with static methods that just returns a pre-populated boilerplate Document, to which I append elements I need. These varying elements are handled by subclasses in my hierarchy.

But a) I don't like having lots of static methods, and b) I don't like this hierarchy, it is quite brittle- any time I want to change something I have low confidence that I won't break something else.

This is why I'm considering a Builder approach to generate the transformed XML, and using interfaces for each of my new element types, if that makes sense.

I'm fairly familiar with JAXP, and I'm using DOM in this project as it's already being used in existing code.

Basically, I want to refactor away from a hierarchy-based approach to one that uses interfaces, i.e. composition instead of inheritance. My inheritance only makes sense in terms of code reuse.
I also want to find a clean design for building XML elements that share some commonality but also have quite a bit of variation.
Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
Don't think that you need to create the new XML document at the same time as reading the old XML document. This is not a requirement. And it will make for a very complex design and code. This approach is not required and should be avoided.

Read the old XML and create data objects.

Then, using the data objects, create the new XML document. This is much easier and cleaner, and it will enable you to create a clean, understandable application that others will be able to read easily in the future.

All of your ideas about refactoring and inheritance and composition are not requirements. These are design concepts that are clouding your ability to write code for what is required.

Think about reading an XML document and creating data objects for syllabus, course, etc.

Store these objects in an ArrayList or another appropriate Collection class.

Now, think about creating an XML document by reading these Collections.

Read through the Collections and create an XML document, this is all that is needed.

You could even create the XML document with a StringBuffer and an Iterator: append, append, append.
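A minimal sketch of that append approach. It uses StringBuilder, the unsynchronized equivalent of StringBuffer, and the target element names "topic" and "unit" are made up, since the target format is not given here. The one thing not to skip when building XML by hand is escaping the values.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class TargetXmlWriter {
    // Escapes the five XML special characters so a value is safe inside an attribute.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            switch (ch) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(ch);
            }
        }
        return out.toString();
    }

    // Writes one syllabus and its courses; "topic" and "unit" are hypothetical names.
    static String write(String syllabusTitle, List<String> courseTitles) {
        StringBuilder xml = new StringBuilder();
        xml.append("<topic name=\"").append(escape(syllabusTitle)).append("\">");
        Iterator<String> it = courseTitles.iterator();
        while (it.hasNext()) {
            xml.append("<unit name=\"").append(escape(it.next())).append("\"/>");
        }
        xml.append("</topic>");
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.println(write("Maths", Arrays.asList("Algebra", "Stats & Probability")));
    }
}
```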

Try to make the application as simple and easy as possible. This is the best way. Simple and easy. Easy and simple.

What you have to do is not complex enough for an object-oriented design pattern. Don't try to make it more complex than what is required.

Good luck!
Alan Sandhurst
Greenhorn

Joined: Apr 30, 2004
Posts: 21
Thanks again for the feedback- I've taken some suggestions on board, particularly the separation of reading and writing, and it's helping a lot.

However, I maybe didn't make clear that for each data object I extract, I have to build an XML fragment for it and persist that; I can't build one XML document for everything due to the legacy code I'm working with.

Currently I have a simple parser that extracts data types from the XML document and validates their place in the hierarchy.

For building the XML to persist the data types, I instantiate a concrete implementation of a ContentNode interface, and start creating the XML for insertion. This is where I'm still having trouble finding a clean solution.

It's easy to build up individual fragments, but each fragment is slightly different, both in its structure and in what I need to do with it. For example, for a syllabus, I need to delete existing syllabi before I insert. A syllabus also has no parent, so a parentId element is not required. I also need to subscribe a user to a syllabus after it is created.

For some types, I need to add an external mapping id, for others I don't.
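One way this kind of variation could be handled is a small fluent builder over DOM, where the common parts are always written and the optional parts are added only when asked for. This is just a sketch: the FragmentBuilder class and the "content", "title" and "externalId" element names are assumptions; only parentId comes from the description above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical builder: the common structure goes in the constructor,
// each optional element gets its own method.
final class FragmentBuilder {
    private final Document doc = newDocument();
    private final Element root;

    FragmentBuilder(String type, String title) {
        root = doc.createElement("content");
        root.setAttribute("type", type);
        doc.appendChild(root);
        addChild("title", title);
    }

    // Only called for types that have a parent (a syllabus never calls it).
    FragmentBuilder parentId(long id) {
        addChild("parentId", Long.toString(id));
        return this;
    }

    // Only called for types that need an external mapping id.
    FragmentBuilder externalId(String id) {
        addChild("externalId", id);
        return this;
    }

    Document build() {
        return doc;
    }

    private void addChild(String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        root.appendChild(child);
    }

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document syllabus = new FragmentBuilder("syllabus", "Maths").build();
        Document course = new FragmentBuilder("course", "Algebra").parentId(42).externalId("EXT-1").build();
        System.out.println(syllabus.getElementsByTagName("parentId").getLength());
        System.out.println(course.getElementsByTagName("parentId").item(0).getTextContent());
    }
}
```

The builder is an instance rather than a set of static methods, so it can be created fresh in a unit test and its output inspected through the DOM.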

To avoid copy and paste everywhere, I'm using a helper class with static methods to retrieve the common structures I need. The problem with this is that it's not very amenable to change, and this system is going to keep changing; there are new requirements coming in all the time.

I agree that simple and easy is good, but I think there is an argument for some kind of OO solution that will make it easier to modify.


Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
Alan Sandhurst wrote:
However, I maybe didn't make clear that for each data object I extract, I have to build an XML fragment for it and persist that; I can't build one XML document for everything due to the legacy code I'm working with.


I see. There would be little difference between producing one XML document from your Java data objects or producing many XML fragments from your Java data objects. You will have small code modules that produce the fragments instead of a single module that produces an XML document.

Since each fragment is different and has different business rules, you should have a distinct code module that is specific to the fragment. There is no need to try to combine the logic and mix everything up. This will create ugly, overly complex code that will be hard to manage and maintain. Some of the logic might be repeated; don't worry about this.
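For illustration only, here is roughly what two such modules might look like. FakeStore stands in for the legacy persistence code, and its method names, the fragment layout, and the module class names are all invented. Note the two insert lines are nearly duplicated, and that is fine.

```java
import java.util.ArrayList;
import java.util.List;

final class FragmentModulesDemo {
    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        long syllabusId = SyllabusModule.persist(store, "Maths", "alan");
        CourseModule.persist(store, "Algebra", syllabusId);
        for (String entry : store.log) {
            System.out.println(entry);
        }
    }
}

// Hypothetical stand-in for the legacy persistence layer; it only records calls.
final class FakeStore {
    final List<String> log = new ArrayList<>();
    private long nextId = 1;

    void deleteAll(String type) { log.add("delete " + type); }
    long insert(String xml) { log.add("insert " + xml); return nextId++; }
    void subscribe(long id, String user) { log.add("subscribe " + user + " to " + id); }
}

// Syllabus rules: delete existing syllabi, insert without a parent, subscribe the user.
final class SyllabusModule {
    static long persist(FakeStore store, String title, String user) {
        store.deleteAll("syllabus");
        // XML escaping of the title is omitted here for brevity.
        long id = store.insert("<content type=\"syllabus\"><title>" + title + "</title></content>");
        store.subscribe(id, user);
        return id;
    }
}

// Course rules: must reference the id of an already inserted syllabus.
final class CourseModule {
    static long persist(FakeStore store, String title, long syllabusId) {
        if (syllabusId <= 0) {
            throw new IllegalArgumentException("a course needs an existing syllabus");
        }
        // XML escaping of the title is omitted here for brevity.
        return store.insert("<content type=\"course\"><title>" + title + "</title>"
                + "<parentId>" + syllabusId + "</parentId></content>");
    }
}
```

Each module reads top to bottom as the business rules for that one fragment, so a new requirement for syllabi changes SyllabusModule and nothing else.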

Alan Sandhurst wrote:
I agree that simple and easy is good, but I think there is an argument for some kind of OO solution that will make it easier to modify.


My statement about simple and easy did not refer to any other solution than an object-oriented one. To be extra clear, "a simple and easy object-oriented solution is good." Simple and easy to understand and maintain is more valuable than code reuse.

First write the code that handles the business requirements. Once you have it working, read the requirements again and write some more code that might make it easier to maintain. Then read the requirements again and write some more code, if needed, to streamline execution paths and prepare it for possible future changes.

Alan Sandhurst
Greenhorn

Joined: Apr 30, 2004
Posts: 21
James Clark wrote:
To be extra clear, "a simple and easy object-oriented solution is good." Simple and easy to understand and maintain is more valuable than code reuse.


Excellent, that's the nub of what I was stuck on. I've been overly worried about a small amount of duplication, the elimination of which has been tying me in knots.

I appreciate the very helpful advice. I'll be noting this exchange for future reference.

Many thanks
 