Stax - resolve duplicate balise name in XML

mou mouse
Greenhorn

Joined: Sep 01, 2013
Posts: 5
Hi all,

I'd like to use StAX with the "peek" and "pop" functions to resolve duplicate balise names in XML. Could you please tell me where I can find a sample using these methods?

thanks
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
I have no idea what a "balise name" is. How about showing a SHORT XML example document with an example of what you are trying to detect? Also, what do you want to do once it is detected?

Bill
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
I googled it and "balise" looks like the French word for "tag" (as in "the <a> tag of HTML"). But anyway, I agree with you, an example would be very useful.
mou mouse
Greenhorn

Joined: Sep 01, 2013
Posts: 5
Thanks all for replying.

For example, I would like to parse an XML document like this:
<advisory>
<name>name1</name>
<testDescription>nom de test</testDescription>
<threat>
<name>thread1</name>
<reference>ref1</reference>
</threat>
</advisory>
Note that the <name> tag appears under both the <advisory> and <threat> parents, so I need to detect which parent it belongs to in order to put the values in their respective objects. When I googled, I found that I can use a Stack (peek, pop, ...) with StAX to solve this problem, but I can't find an example.
If you know of an example or tutorial, please send me the link.

thanks.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
This is not a good problem for event-oriented parsing, i.e. StAX or SAX.

By far the easiest way to look at XML document structure is to parse the document to a DOM and use the org.w3c.dom package classes and methods.

In saying "by far" I mean 10 or 100 times easier. The reason being that DOM knows about the parent/child relationships.
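To illustrate the point about DOM knowing parent/child relationships, here is a minimal sketch (not from the original post) that parses a document and reports the parent of every <name> element. The class and method names (DomNameLookup, namesWithParents) are made up for this illustration; only the standard javax.xml.parsers and org.w3c.dom classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomNameLookup {

    /** Returns one "parent -> text" entry per <name> element, in document order. */
    static List<String> namesWithParents(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // getElementsByTagName finds every <name>, whatever its depth.
        NodeList names = doc.getElementsByTagName("name");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            Element name = (Element) names.item(i);
            // The DOM node already knows its parent, so no bookkeeping is needed.
            String parent = name.getParentNode().getNodeName();
            result.add(parent + " -> " + name.getTextContent());
        }
        return result;
    }
}
```

Called on the advisory document posted earlier in the thread, namesWithParents should return two entries, one whose parent is "advisory" and one whose parent is "threat".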

Only in the event of really really huge input documents would you require event oriented parsing.

Harold's free online book will have examples.

Bill
mou mouse
Greenhorn

Joined: Sep 01, 2013
Posts: 5
Thanks William, but I have to use the method I mentioned before. It's not for me to choose :/ So if you have a link to a sample, please send it to me.

thanks.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541
Not for you to choose? That isn't what you said in your original post. But anyway, you don't need to use the peek and pop methods of the XMLEventReader (actually there isn't a pop method) to parse that XML document. Just read the elements and write code which keeps track of where you are in the document. For example are you in an "advisory" element or a "threat" element at the moment?
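A minimal sketch of that "keep track of where you are" approach, assuming the advisory document shown earlier in the thread. The class name FlagTrackingParser and the method parseNames are hypothetical; the XMLEventReader calls (nextEvent, getElementText) are the standard javax.xml.stream ones.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, for illustration only.
class FlagTrackingParser {

    /** Returns { name directly under <advisory>, name under <threat> }. */
    static String[] parseNames(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        boolean inThreat = false;      // are we currently inside a <threat> element?
        String advisoryName = null;
        String threatName = null;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                String local = event.asStartElement().getName().getLocalPart();
                if (local.equals("threat")) {
                    inThreat = true;
                } else if (local.equals("name")) {
                    // Reads the text content of the text-only <name> element.
                    String text = reader.getElementText();
                    if (inThreat) {
                        threatName = text;
                    } else {
                        advisoryName = text;
                    }
                }
            } else if (event.isEndElement()
                    && event.asEndElement().getName().getLocalPart().equals("threat")) {
                inThreat = false;
            }
        }
        reader.close();
        return new String[] { advisoryName, threatName };
    }
}
```

A single boolean is enough here because the document has only two places where <name> can occur; with deeper or more varied nesting, the flags multiply quickly.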
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
I'd like to use StAX with the "peek" and "pop" functions to resolve duplicate balise names in XML.

I wonder where you came up with peek and pop, if they were not suggested to you by your instructors or something?! They are referring to the peek and pop methods of a Stack instance.

While running an XMLEventReader, you maintain a stack of QName objects: push the element name when the reader sees a StartElement, and pop the name at the head when it sees an EndElement. But push the element name onto the stack only after "peeking" at its last-in entry (see below), so as to ascertain the current element's parent name.

When the event reader is at a StartElement whose name is "name", have the stack "peek" at its top (last-in) entry. If that entry is either "advisory" or "threat", you know you have found a "name" element whose parent is "advisory" or "threat". Then you do whatever you need to with the name tag you found.

That is the idea. Try to implement it yourself (show your effort). If you run into difficulties implementing it, I or somebody else may help you further, provided we are satisfied you have done your part.
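For readers who want something to compare their own attempt against, the peek/push/pop idea described above can be sketched roughly as follows. This is only one possible reading of it, not the poster's code: the class name StackNameParser and the method namesByParent are hypothetical, and java.util.ArrayDeque is used in place of java.util.Stack. It records the <name> text under whatever parent is on top of the stack rather than checking for "advisory" or "threat" explicitly.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, for illustration only.
class StackNameParser {

    /** Maps the parent element's local name to the text of its <name> child. */
    static Map<String, String> namesByParent(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Deque<QName> open = new ArrayDeque<>();   // currently open elements, innermost on top
        Map<String, String> found = new LinkedHashMap<>();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                QName name = event.asStartElement().getName();
                QName parent = open.peek();       // null when we are at the root element
                if (name.getLocalPart().equals("name") && parent != null) {
                    // getElementText() reads through the matching end tag,
                    // so <name> is deliberately not pushed here.
                    found.put(parent.getLocalPart(), reader.getElementText());
                } else {
                    open.push(name);
                }
            } else if (event.isEndElement()) {
                open.pop();
            }
        }
        reader.close();
        return found;
    }
}
```

The stack answers the "who is my parent?" question that a forward-only parser cannot answer by itself, which is the local look-back the exercise seems to be about.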
mou mouse
Greenhorn

Joined: Sep 01, 2013
Posts: 5
Sorry, I'm new here, and thank you all for your contributions. I finished an example:
My XML:


My Java Class:


Should it be like this? Am I right?
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
Should it be like this? Am I right?

I take it that the question is directed not to me but to the others who answered in this thread, since that is not what I described in answering the original question (and, more importantly, the question as asked in its original context...). So I feel free to pass, saying only that it can be like that and it can be otherwise. It depends on how the question is being asked and why it is asked that way. If you are happy to code it this way, why not?

The merit of the question as originally posed (by your instructors) is this: since StAX parsing is forward-only, how do you work with that characteristic when you need to look backward locally to implement a specific piece of functionality? That is the general meaning of the question.

But if you know every single detail of the document beforehand (which tag, by name and kind, follows which), then why not code it in that rigid way, sacrificing flexibility, which is not the goal here after all?
mou mouse
Greenhorn

Joined: Sep 01, 2013
Posts: 5
Hi g tsuji, I'm a new developer, and my teacher told me to develop this class to parse an XML file using StAX. I found many cases of duplicate tags that do not have the same parents.
I couldn't find any sample code that covers this, so I asked for a link or a small sample. I don't have a solution right now, so I developed it the way I saw it. I know it's not a good method.
So if you have an idea how to correct it, please tell me.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
my teacher told me to develop this class to parse an XML file using StAX. I found many cases of duplicate tags that do not have the same parents.


Very important point here.

XML cares about the structure of a document, so you are not finding "duplicate tags"; you are finding elements with duplicate names. Duplicate names may appear in perfectly valid XML documents: a very bad design to be sure, but still valid XML.

Bill
g tsuji
Ranch Hand

Joined: Jan 18, 2011
Posts: 511
I found many cases of duplicate tags that do not have the same parents.

If you want to work with XML, be sure not to form any wrong opinion prematurely about this phenomenon (yes, duplicate tag names, as you put it), particularly at this early stage, when you may not yet understand much about XML and the technologies built around it in practice. It is perfectly normal. Is it bad design? Arguably... I find it hard to argue in the other direction, that's for sure. But I certainly wouldn't go so far as to say it is bad design. It may cause some inconvenience to deal with at times. If you have the luxury of sitting back and just doing your own work, there is no reason to create inconvenience for yourself in work of your own.

Since you want to deal with it somehow, I can show you how, following the line of reasoning I laid out earlier. Take your second XML, and take the "code" tags. A "code" tag may appear as a child of a "values" tag or as a child of a "materiel" tag. This is one way to get the info.
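The code originally attached to this post did not survive in this copy of the thread, and neither did the poster's second XML document. What follows is therefore only a hedged reconstruction of the idea, not the original code: it assumes a document in which "code" elements are text-only and occur under "values" and "materiel" parents, and the class name CodeByParent and the method collect are invented for the sketch.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical reconstruction, for illustration only.
class CodeByParent {

    // Parents under which a <code> element is of interest (names taken from the post).
    private static final Set<String> PARENTS = Set.of("values", "materiel");

    /** Groups the text of every <code> element by the name of its parent. */
    static Map<String, List<String>> collect(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Deque<String> open = new ArrayDeque<>();          // open elements, innermost on top
        Map<String, List<String>> codes = new TreeMap<>();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                String name = event.asStartElement().getName().getLocalPart();
                String parent = open.peek();              // peek before pushing
                if (name.equals("code") && parent != null && PARENTS.contains(parent)) {
                    // getElementText() reads through </code>, so nothing is pushed.
                    codes.computeIfAbsent(parent, k -> new ArrayList<>())
                         .add(reader.getElementText());
                } else {
                    open.push(name);
                }
            } else if (event.isEndElement()) {
                open.pop();
            }
        }
        reader.close();
        return codes;
    }
}
```

A "code" element found under any other parent is simply pushed and popped like every other element, so it is ignored without unbalancing the stack.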