Help XML parsing in Java

Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
Dear users,
I'm writing because I would like to ask for some help creating a parser in Java for the following XML:


I have been working on this for 3 days without getting anything to work. I'm just starting out with XML parsing in Java, which is probably why. I need the names of the clades (e.g. "A B C") in groups according to the tree structure (branch length), from the smallest group to the biggest. So I should end up with an ArrayList in which each element is a group of names (e.g. A, B, C ...) according to the branch length. For example, {A, E} is one element of the ArrayList, {B, C, D} is another, and so on: {B, C}, {B, C, D, A, E}.
For this XML I should have an ArrayList like this: [{D}, {B, C}, {A, E}, {B, C, D}, {A, E, B, C, D}].
Can someone help me with the parsing? Many thanks.
Lucas

PS: In the example I'm using names that are strings, but in the actual file I need to use numbers (IDs) instead of strings. Sorry for the indentation, by the way.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41107
Fixed the indentation. Please do that yourself next time; I don't think anyone could understand what's going on the way it was formatted.

What do you have so far? What kind of parser did you try? DOM, SAX, StAX, some other library...?


Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
You're right. I pasted the code and the website changed the indentation that way.
The original indentation was this:
http://i42.tinypic.com/2rp9o5w.jpg

I wrote this, but it's not working how I want.


PS: Thank you for fixing the indentation, and I'm sorry for my post, but I'm really late with this university project. The deadline is very close and I'm panicking.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12761
not working how I want.


That is about as useless a clue as it is possible to imagine.

What actually happens?
1. Computer catches fire
2. Blue screen of death
3. an exception is thrown
4. output is obtained but I don't like the way it indents
...... etc so many possibilities

Bill
Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
I get this output:

run:
Elemento: clade
Attributi: No attributes

Residue: BCDAE
Elemento: clade
Attributi: No attributes

Residue: BCDAE
Elemento: clade
Attributi: branch_length=4.25;

Residue: BCD
Elemento: clade
Attributi: branch_length=3.5;

Residue: BC
Elemento: clade
Attributi: branch_length=3.5;

Elemento: clade
Attributi: branch_length=3.5;

Elemento: clade
Attributi: branch_length=7.0;

Elemento: clade
Attributi: branch_length=10.25;

Residue: AE
Elemento: clade
Attributi: branch_length=1.0;

Elemento: clade
Attributi: branch_length=1.0;

My idea was to insert the "Residue" values into an ArrayList (after discarding nulls and duplicates) and so end up with an ArrayList of all the groups of clades, ordered by their branch length from the smallest group up to the whole set of clade names. However, using the getTextContent() method I get the descendants' names concatenated without any white space, and the D element is missing from the output.
Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
I did this and I'm stuck:



I don't know how to proceed to identify the clade names within a branch. E.g. I have the branch with length 4.25, which has a clade named D and a sub-branch with clades named B and C. Can someone help me, please? Thanks.
Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
This is the graphical representation of my XML file, just to make clear what I want to do.


I have to store all the possible groups in an ArrayList. They must be grouped by their branch layout.
For this XML file I need an ArrayList like the following:
{{A,E}, {B,C}, {D}, {B,C,D}, {A,E,B,C,D}}
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12761
See the javadocs for org.w3c.dom.Element - the getTextContent method "returns the text content of this node and its descendants" - exactly what you see.

If you want something else, you will have to use methods that are more precise - study the methods in the org.w3c.dom package - pay particular attention to the excellent table in the Node class which shows the different types of Nodes and the result of getNodeName() and getNodeValue() for each type.

Bill
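
To make that point concrete, here is a minimal sketch contrasting getTextContent() on a parent element with reading only a direct child. The SAMPLE string, the class name CladeNameDemo and the helper names are assumptions made up for illustration (the XML from the first post is not shown in the thread); only the org.w3c.dom calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CladeNameDemo {

    // Hypothetical sample: a parent clade holding two named leaf clades.
    static final String SAMPLE =
        "<clade>"
      + "<clade branch_length=\"1.0\"><name>A</name></clade>"
      + "<clade branch_length=\"1.0\"><name>E</name></clade>"
      + "</clade>";

    // Parses an XML string into a DOM Document; checked exceptions are wrapped.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the text of a <name> element that is a DIRECT child of the
    // given clade, or null if the clade has no such child.
    static String directName(Element clade) {
        for (Node n = clade.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && "name".equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE).getDocumentElement();
        // Text of the node AND all its descendants, concatenated.
        System.out.println(root.getTextContent());
        // The root clade has no <name> child of its own.
        System.out.println(directName(root));
        // SAMPLE contains no whitespace, so the first child is the first leaf clade.
        System.out.println(directName((Element) root.getFirstChild()));
    }
}
```

Checking getNodeType() before looking at the name matters in real files, where indentation shows up as extra text nodes between the elements.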
Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
I had already read that documentation. I have read almost everything related to DOM and SAX, but I don't know how to apply what I read to my context.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12761
Ah, a challenge - try this.

Imagine you have an assistant who can read text and locate / recognize words.

Now imagine handing that assistant a paper copy of the XML document.

What sequence of simple commands would you give this assistant to locate the data elements you want and, having located them, place them in a data structure?

This free online book has many examples of extracting data from XML.

Bill

Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
I guess I should use recursion to do that, and to be honest I'm not very good at it. I'm still trying to get something out of the XML file, but it's driving me crazy. Is there a way (method) to convert a NodeList to ArrayList?? That will solve my problem.
Moreover, I have to finish this by tomorrow or the end of Monday. I'm freaking out.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18541

Lucas Halliwell wrote:Is there a way (method) to convert a NodeList to ArrayList?? That will solve my problem.


Well, sure.

1. Create a new empty ArrayList.

2. Write a loop which goes through the NodeList one entry at a time.

3. Add each of the entries to the ArrayList.

Does that solve your problem?
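
The three steps above can be sketched as follows. The class name NodeLists, the toList and parse helpers, and the tiny sample document are made up for illustration; NodeList.getLength() and NodeList.item(int) are the standard org.w3c.dom methods.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeLists {

    // Copies every entry of a NodeList into a new ArrayList, in order.
    static List<Node> toList(NodeList nodes) {
        List<Node> result = new ArrayList<>(nodes.getLength()); // step 1
        for (int i = 0; i < nodes.getLength(); i++) {           // step 2
            result.add(nodes.item(i));                          // step 3
        }
        return result;
    }

    // Parses an XML string into a DOM Document; checked exceptions are wrapped.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input, only there to have a NodeList to copy.
        Document doc = parse("<r><name>A</name><name>B</name></r>");
        List<Node> names = toList(doc.getElementsByTagName("name"));
        for (Node n : names) {
            System.out.println(n.getTextContent());
        }
    }
}
```

Note that this only flattens whatever the NodeList already contains. getElementsByTagName returns all matching descendants in document order, so the copy by itself carries no information about which clade is nested in which.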
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41107
Here's an example of how to use recursion to traverse the DOM:


Given the XML you posted earlier it produces this:
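
(The listing and its output are not shown in this copy of the thread. What follows is not the code that was posted, only a minimal sketch of the general technique, recursive DOM traversal. The class name DomWalker, the output format and the sample input are assumptions.)

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomWalker {

    // Appends one indented line per element or non-blank text node,
    // then recurses into the children with depth + 1.
    static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            indent(depth, out);
            out.append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                out.append(' ').append(a.getNodeName()).append('=').append(a.getNodeValue());
            }
            out.append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) { // skip whitespace-only text from indentation
                indent(depth, out);
                out.append("text: ").append(text).append('\n');
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, depth + 1, out);
        }
    }

    static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    // Parses the XML string and returns the indented dump of its tree.
    static String dump(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input; the real file would be parsed the same way.
        System.out.print(dump(
            "<clade><clade branch_length=\"1.0\"><name>A</name></clade></clade>"));
    }
}
```

The method calls itself once per child, so the depth argument always equals the nesting level of the node being visited. That is the piece of information a flat NodeList does not give you.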


Lucas Halliwell
Greenhorn

Joined: Jul 17, 2013
Posts: 8
Hey Ulf Dittmer, may I ask for a suggestion on how to group the elements into an ArrayList according to the layout of the tree? I should have an ArrayList of ArrayLists in which each element is a subtree. E.g.:
A[0]={{B,C,D},{A,E}} // top level of the tree representing the 2 sub-trees
A[1]={{B,C},{D}} // because there is a sub-tree with those elements (decomposition of the right sub-tree)
A[2]={B,C} // this one is the sub-tree of the sub-tree (decomposition of the right sub-tree)
A[3]={D} // same as the previous one (decomposition of the right sub-tree)
A[4]={A,E} // (decomposition of the left sub-tree)

I don't need to go so deep that I end up with single, ungrouped elements (leaves).

Many thanks for all the help you're giving me.

For Paul: this is what the outcome of the parsing should be.
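
One way this grouping could be approached is sketched below: a recursive method returns the leaf names found under a clade and records that list as a group whenever the clade has child clades. Several things here are assumptions rather than facts from the thread: the SAMPLE document is only a guess at the structure, reconstructed from the branch lengths in the output posted earlier; the root element is assumed to be a clade; leaf clades are assumed to carry a name child element; and CladeGroups, collect and groups are made-up names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CladeGroups {

    // Assumed structure, guessed from the output posted earlier in the thread.
    static final String SAMPLE =
        "<clade>"
      + "<clade branch_length=\"4.25\">"
      +   "<clade branch_length=\"3.5\">"
      +     "<clade branch_length=\"3.5\"><name>B</name></clade>"
      +     "<clade branch_length=\"3.5\"><name>C</name></clade>"
      +   "</clade>"
      +   "<clade branch_length=\"7.0\"><name>D</name></clade>"
      + "</clade>"
      + "<clade branch_length=\"10.25\">"
      +   "<clade branch_length=\"1.0\"><name>A</name></clade>"
      +   "<clade branch_length=\"1.0\"><name>E</name></clade>"
      + "</clade>"
      + "</clade>";

    // Returns all leaf names below the given clade. Every clade that has
    // child clades also adds its own list of names to 'groups'.
    static List<String> collect(Element clade, List<List<String>> groups) {
        List<String> names = new ArrayList<>();
        boolean hasChildClade = false;
        for (Node n = clade.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // ignore whitespace text, comments, etc.
            }
            if ("clade".equals(n.getNodeName())) {
                hasChildClade = true;
                names.addAll(collect((Element) n, groups));
            } else if ("name".equals(n.getNodeName())) {
                names.add(n.getTextContent().trim());
            }
        }
        if (hasChildClade) {
            groups.add(names); // single leaves are deliberately not recorded
        }
        return names;
    }

    // Parses the XML and returns the groups, smallest first.
    static List<List<String>> groups(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<List<String>> groups = new ArrayList<>();
            collect(root, groups);
            groups.sort(Comparator.comparingInt(List::size));
            return groups;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(groups(SAMPLE));
        // prints [[B, C], [A, E], [B, C, D], [B, C, D, A, E]]
    }
}
```

To also get single leaves such as {D} as their own groups, the hasChildClade check could be dropped. With numeric ids instead of names, the same traversal would work with Integer.parseInt on the trimmed text and a List<Integer>.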
 