Trouble parsing XML

James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

I'm really having trouble parsing an XML file. Each root-level entry gets turned into an object, and that object is added to an ArrayList. Any child elements are also turned into objects and added to an ArrayList inside their parent entry's object. So I end up with an ArrayList of category objects, each of which has its own ArrayList of sub-category objects.

I think I've got pretty close with the code so far, but I'm having difficulty creating the first category object. The XML parser I've implemented adds the object once it hits the final end tag. However, all the child entries come before that end tag, so it blows up. I just can't think of another way of doing it.



Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

Do you really have to do this yourself? There are frameworks out there that can do this: Apache Commons Digester and JAXB, for example.


Junilu - [How to Ask Questions] [How to Answer Questions]
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 36453
    
Sounds too difficult a question for this forum: moving to our XML forum.
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

I'd rather not do it myself if there's a framework available. I didn't even know there was such a thing, so thanks for putting me on to it. I'm developing for Android, and most of the examples I've seen are similar to what I've shown.

I've done a bit of googling: JAXB is apparently too large to use in an Android app, and Apache Commons Digester doesn't look like it plays well either. Hmmm. I found Simple XML. Would I need to build a program to get the data in, in the first place, and then serialize it, versus hand-coding the XML data to my own specification?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12675
    
I must be missing something - why not just parse to a DOM with a standard parser and work with those objects?


Bill

Java Resources at www.wbrogden.com
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

I hadn't considered parsing to a DOM, as I'm still trying to figure everything out. All the examples I've seen so far for Android and XML have used XmlPullParser, with people constructing their own if-else logic, so I thought this was the standard way to go. My problem seems to be that a simple XmlPullParser with if-else logic is frying my brain because of the number of child nodes I'm dealing with: too many nested if-else statements. All the examples, by contrast, seem to be simple parent-child elements.

I'm going to either try an XML framework like Simple XML or go the DOM route. I'm giving up on my previous attempts... even if I do get them to work, one small change to the XML file would make it a nightmare.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7029
    

James Gibbs wrote:I think I've got pretty close with the code so far...

James, please DontWriteLongLines; it makes your thread very hard to read.
I'd break them up myself, but there are tons of them, so I suggest you do it yourself (Use the 'Edit' button).

Thanks

Winston

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Ok done
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

I've gone down the DOM route to try to simplify things. This time I create a map for each category/sub-category. This works. But I now need a way to add an extra field to the map for each sub-category saying which category its parent is, and I can't seem to extract this. Here's what I've come up with so far:



That last line however doesn't work.

I'm really at the end of my tether trying to figure this out. I'm tempted just to fudge it, ignore the XML hierarchy and, for each node, write in the XML itself who its parent is:

Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

James Gibbs wrote:
I'm really at the end of my tether trying to figure this out. I'm tempted just to fudge it, ignore the XML hierarchy and, for each node, write in the XML itself who its parent is:


Yuck. That's ugly. I would not do it that way.

If you're going to do this semi-manually using a DOM, then the processing is going to be recursive, because a BrowseNode can contain a list of child BrowseNodes, and so on. The first thing is to get your object structure right.

BrowseNode has this set of attributes: id, name, List<BrowseNode> children.

Then break the problem down into component parts. The trickiest part is recursing into the next level when you get to a Children tag. It's tricky but not that tricky:



This is just off the top of my head but that's the general algorithm of the recursive processing you have to do. I've left out some things for you to figure out yourself.
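
A minimal sketch of the recursion described above, using only the standard DOM classes. It is not the code from this post: the BrowseNode and Children element names come from this thread, while BrowseNodeId, Name and the BrowseNodeParser class are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class BrowseNodeParser {

    /** Plain holder matching the structure described above: id, name, children. */
    public static class BrowseNode {
        public String id;
        public String name;
        public final List<BrowseNode> children = new ArrayList<>();
    }

    /** Builds one BrowseNode from a BrowseNode element, recursing into Children. */
    public static BrowseNode toBrowseNode(Element element) {
        BrowseNode result = new BrowseNode();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes, comments, etc.
            }
            Element child = (Element) n;
            switch (child.getTagName()) {
                case "BrowseNodeId": // assumed element name
                    result.id = child.getTextContent().trim();
                    break;
                case "Name": // assumed element name
                    result.name = child.getTextContent().trim();
                    break;
                case "Children":
                    // Recursive step: every BrowseNode under Children is parsed the same way.
                    for (Node c = child.getFirstChild(); c != null; c = c.getNextSibling()) {
                        if (c.getNodeType() == Node.ELEMENT_NODE
                                && "BrowseNode".equals(((Element) c).getTagName())) {
                            result.children.add(toBrowseNode((Element) c));
                        }
                    }
                    break;
                default:
                    break; // ignore elements this sketch does not care about
            }
        }
        return result;
    }

    /** Parses the XML and converts the first BrowseNode element in document order. */
    public static BrowseNode parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element top = (Element) doc.getElementsByTagName("BrowseNode").item(0);
        return toBrowseNode(top);
    }
}
```

Because each level only looks at its own direct children, the parent of any sub-category is simply the BrowseNode whose children list it was added to, so nothing extra needs to be written into the XML.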

Personally, I would just use the Simple XML framework for Android. It looks like it would do the job handily.
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Thanks Junilu, your example has given me a lot to go on. I did it the ugly way in the end but think I'll re-write it using your pointers.

I took another look at simple xml, and it does look great. The problem is it looks like I'd have to build the objects first, then deserialize it to xml. Which would mean I'd have to build a separate program to do a one off manual input run for each catagory. Seems like a lot of overhead when I can just write the xml out manually once?
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

James Gibbs wrote:The problem is it looks like I'd have to build the objects first, then deserialize it to xml. Which would mean I'd have to build a separate program to do a one off manual input run for each catagory. Seems like a lot of overhead when I can just write the xml out manually once?

Just to make sure we're on the same page here, deserializing means taking XML and creating an object graph/hierarchy from it, whereas serializing is taking an object graph/hierarchy and generating XML. I thought your goal was to deserialize some XML.

I'm not clear on what you mean by having to "build a separate program to do one off manual input run for each catagory[sic]" (correct spelling is category, BTW). It seems to me that after you've defined your objects and annotated them appropriately for Simple XML framework, all you would need to do is something like their deserialization example:


Am I missing something?
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12675
    
I have not really been following this but the following usage of getFirstChild():



is absolutely deadly. Sometimes (frequently) that first child will be a Text type Node that you don't even think about when working on the logic.

Bill
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Thanks for the input. I didn't use that implementation in the end. By "text type node", do you mean that the text of the first child node might not be what you were expecting?
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Junilu Lacar wrote:
James Gibbs wrote:The problem is it looks like I'd have to build the objects first, then deserialize it to xml. Which would mean I'd have to build a separate program to do a one off manual input run for each catagory. Seems like a lot of overhead when I can just write the xml out manually once?

Just to make sure we're on the same page here, deserializing means taking XML and creating an object graph/hierarchy from it, whereas serializing is taking an object graph/hierarchy and generating XML. I thought your goal was to deserialize some XML.

I'm not clear on what you mean by having to "build a separate program to do one off manual input run for each catagory[sic]" (correct spelling is category, BTW). It seems to me that after you've defined your objects and annotated them appropriately for Simple XML framework, all you would need to do is something like their deserialization example:


Am I missing something?


Sorry if I'm not making much sense, and thanks for trying to decode my current confused logic! My original aim was to take an XML file (which I had already typed by hand) and then deserialize it to my custom objects. From what I could see, Simple XML didn't let you deserialize an XML file of custom objects unless it had first been serialized using Simple XML itself.

My needs have now changed and I'll be calling the Amazon Product API for the XML files. I've taken another look at Simple XML: am I right in thinking I could use the templating feature to build a template which understands the schema of the XML Amazon returns?
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

James Gibbs wrote:
My original aim was to take an XML file (which I had already typed by hand) and then deserialize it to my custom objects. From what I could see, Simple XML didn't let you deserialize an XML file of custom objects unless it had first been serialized using Simple XML itself.

No, you DO NOT need XML that was first serialized by the framework in order to use the deserialization feature. XML is XML, it doesn't matter whether you created it manually or programmatically, the deserialization feature will work as long as you have the proper annotations in your objects.


My needs have now changed and I'll be calling the Amazon Product API for the XML files. I've taken another look at Simple XML: am I right in thinking I could use the templating feature to build a template which understands the schema of the XML Amazon returns?

I doubt that the templating feature is what you need. The templating feature is useful for when you have XML that contains placeholder tokens of the form "${token.name}" and you want to replace these tokens with actual values which you have in memory.

Coincidentally, this is something I'm actually doing right now in my current project at work. We need to send a request to a web service and we get input from the user for certain fields. Certain other fields are calculated, such as the "releaseId", so what we do is we have a RequestTemplate that we deserialize from XML. The XML that we deserialize to the RequestTemplate has placeholder tokens for the calculated fields, e.g.

We first deserialize the template XML. Then we replace any tokens with calculated values. Then we set the other attributes with values that the user entered. Finally, we serialize the RequestTemplate to get the actual XML that we need to send to the web service.

We don't use Simple XML but the idea is the same: you deserialize the template XML, then you look for tokens and replace them with values that you have in memory. Simple XML makes this easy to do because all you have to do is create a filter and give it a Map of tokens and the corresponding values to substitute in their place. When you deserialize, the objects will already have the appropriate fields populated with the values substituted from the filter map.

I can't imagine that the Amazon XML has placeholder tokens in it but if it does, then you can probably use the templating feature. Otherwise, just figure out how to annotate your objects in a way that Simple XML can match it with incoming XML.
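
The token substitution described above can be sketched with plain JDK regular expressions. This is only an illustration of the idea, not Simple XML's filter mechanism and not the project code mentioned in this post; the TokenTemplate class, the release.id token and the sample request XML are all made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenTemplate {

    // Matches placeholders of the form ${token.name}
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${key} that has an entry in values; unknown tokens are left untouched. */
    public static String fill(String template, Map<String, String> values) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String replacement = values.containsKey(key) ? values.get(key) : m.group();
            // quoteReplacement stops '$' and '\' in the value being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("release.id", "42"); // hypothetical calculated value
        String template = "<request><releaseId>${release.id}</releaseId></request>";
        System.out.println(fill(template, values));
    }
}
```

Values that can contain characters such as `<` or `&` would need XML escaping before being substituted into the template text.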
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

James Gibbs wrote:
... build a template which understands the schema of the xml amazon returns?

I think you need to understand how to structure and annotate your objects for Simple XML. Post some of the Amazon XML that you're trying to deserialize and I'll try to help you define the appropriate object structure and annotations.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12675
    
James Gibbs wrote:
Thanks for the input. I didn't use that implementation in the end. By "text type node", do you mean that the text of the first child node might not be what you were expecting?


No, I mean that the first child can be a Node of type TEXT_NODE - consider the following XML fragment:

The first child of A is NOT b but a Node of type TEXT_NODE containing three characters: CR, LF and a space.
Following Element b there is another Node of type TEXT_NODE containing two characters: CR and LF.

Incidentally you should get familiar with the table in the JavaDocs for org.w3c.dom.Node
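
The whitespace-node behaviour described above can be checked with the JDK's own DOM parser. This is a small sketch, assuming a fragment shaped like the A/b example (the original fragment is not shown here); FirstChildDemo and firstElementChild are illustrative names, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class FirstChildDemo {

    /** Returns the first child that is an Element, or null if there is none. */
    public static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Parses the XML string and returns its document element. */
    public static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element a = parseRoot("<A>\n <b/>\n</A>");
        // The raw first child is the whitespace between <A> and <b/>, not the element b.
        System.out.println(a.getFirstChild().getNodeType() == Node.TEXT_NODE);
        // Skipping non-element nodes is what actually finds b.
        System.out.println(firstElementChild(a).getTagName());
    }
}
```

Checking getNodeType() before casting, as firstElementChild does, avoids both the wrong-node surprise and a ClassCastException when a Text node turns up where an Element was expected.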

Bill


James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Junilu Lacar wrote:
as long as you have the proper annotations in your objects.


This was completely new to me; I had no idea you could use Java annotations like this.

Junilu Lacar wrote:
I doubt that the templating feature is what you need. The templating feature is useful for when you have XML that contains placeholder tokens of the form "${token.name}" and you want to replace these tokens with actual values which you have in memory.

I can't imagine that the Amazon XML has placeholder tokens in it but if it does, then you can probably use the templating feature. Otherwise, just figure out how to annotate your objects in a way that Simple XML can match it with incoming XML.


Ah, I see, so I had completely the wrong idea of the templating/placeholder feature. It makes sense the way you describe it, thanks; it might come in handy one day.

I've finished building a class to pull the list of BrowseNodes (the list of sub-category elements) from Amazon. The XML it returns looks like this:



I'm only interested in the BrowseNodes under the Children element.
Junilu Lacar
Bartender

Joined: Feb 26, 2001
Posts: 4419
    

FWIW, Simple XML was a life saver for me today. We were having problems with JAXB on our enterprise WAS 6.1 infrastructure --was getting a VerifyError--and I couldn't find help on how to fix it anywhere. I switched over to Simple Framework for XML and it works like a charm. I'm pretty sure it's some kind of JAR version problem on our customized enterprise WAS 6.1 install because it (the JAXB code) works fine on a standard development WAS 6.1 installation. I didn't have time or resources to track the problem down and it was really easy to switch from the JAXB annotations to the Simple annotations. "Simple is better" takes on a whole new meaning now.
James Gibbs
Greenhorn

Joined: Feb 21, 2013
Posts: 29

Sounds like it's quite a nice tool to have. I was able to plug the Amazon XML into my existing XML parsing code with minimal changes and have it work, so I didn't bother with Simple XML. But if I ever have to touch XML on Android again, unless it's very basic, it'll be Simple XML all the way.
 