Help XML parsing in Java

 
Greenhorn
Posts: 8
Dear users,
I'm writing to ask for help creating a parser in Java for the following XML:


I've been working on it for three days without getting anything to work. I'm just starting out with XML parsing in Java, which is probably why I haven't done well. I need the names of the clades (e.g. "A B C") collected in groups according to the tree structure (branch length), from the smallest group to the biggest. So I should end up with an ArrayList in which each element is a group of names determined by the branch length. For example, {A, E} is one element of the ArrayList, {B, C, D} is another, and so are {B, C} and {B, C, D, A, E}.
For this XML I should get an ArrayList like this: [{D}, {B, C}, {A, E}, {B, C, D}, {A, E, B, C, D}].
Can someone help me with the parsing? Many thanks.
Lucas

PS: In the example the names are strings, but in the actual file I need to use numbers (IDs) instead. Sorry for the indentation, by the way.
 
Rancher
Posts: 43081
77
Fixed the indentation. Please do that yourself next time; I don't think anyone had a chance of understanding what's going on the way it was formatted.

What do you have so far? What kind of parser did you try? DOM, SAX, StAX, some other library...?
 
Lucas Halliwell
Greenhorn
Posts: 8
You're right. I pasted the code and the website changed the indentation.
The original indentation was this:
http://i42.tinypic.com/2rp9o5w.jpg

I wrote this but it's not working how I want.


PS: Thank you for fixing the indentation, and I'm sorry about my post, but I'm really late with this university project. The deadline is very close and I'm panicking.
 
Author and all-around good cowpoke
Posts: 13078
6

not working how I want.



That is about as useless a clue as it is possible to imagine.

What actually happens?
1. Computer catches fire
2. Blue screen of death
3. An exception is thrown
4. Output is obtained but I don't like the way it indents
... etc. So many possibilities.

Bill
 
Lucas Halliwell
Greenhorn
Posts: 8
I get this output :

run:
Elemento: clade
Attributi: No attributes

Residue: BCDAE
Elemento: clade
Attributi: No attributes

Residue: BCDAE
Elemento: clade
Attributi: branch_length=4.25;

Residue: BCD
Elemento: clade
Attributi: branch_length=3.5;

Residue: BC
Elemento: clade
Attributi: branch_length=3.5;

Elemento: clade
Attributi: branch_length=3.5;

Elemento: clade
Attributi: branch_length=7.0;

Elemento: clade
Attributi: branch_length=10.25;

Residue: AE
Elemento: clade
Attributi: branch_length=1.0;

Elemento: clade
Attributi: branch_length=1.0;

My idea was to insert the Residue values into an ArrayList (after discarding the null ones and the repetitions), giving an ArrayList of all the groups of clades ordered by branch length, from the smallest group up to the whole set of clade names. However, with getTextContent() I get the descendants' names concatenated without any whitespace, and the D element is missing from the output.
 
Lucas Halliwell
Greenhorn
Posts: 8
I did this and I'm stuck :



I don't know how to identify the clade names that are within a branch. For example, the branch with length 4.25 contains a clade named D and a sub-branch with clades named B and C. Can someone help me, please? Thanks.
 
Lucas Halliwell
Greenhorn
Posts: 8
This is the graphical representation of my XML file, just to make clear what I want to do.


I have to store all the possible groups in an ArrayList. They must be grouped according to the branch layout.
For this XML file I need an ArrayList like the following:
{{A,E}, {B,C}, {D}, {B,C,D}, {A,E,B,C,D}}
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6
See the javadocs for org.w3c.dom.Element - the getTextContent method "returns the text content of this node and its descendants" - exactly what you see.

If you want something else, you will have to use methods that are more precise - study the methods in the org.w3c.dom package - pay particular attention to the excellent table in the Node class which shows the different types of Nodes and the result of getNodeName() and getNodeValue() for each type.
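
To make the difference concrete, here is a minimal sketch rather than a definitive solution. The XML from the first post is not shown in this thread, so the input below is a made-up fragment that assumes nested clade elements with name children; the class name TextContentDemo and the helpers parseRoot and ownName are hypothetical too. It contrasts getTextContent() on a parent with a loop that looks only at direct children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextContentDemo {
    // Hypothetical input: the <clade>/<name> layout is an assumption.
    static final String XML = "<clade>"
        + "<clade><name>B</name></clade>"
        + "<clade><name>C</name></clade>"
        + "</clade>";

    // Parses the string and returns the document's root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the text of a <name> element that is a DIRECT child of the
    // given clade, or null if the clade has no such child.
    static String ownName(Element clade) {
        for (Node n = clade.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && "name".equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element root = parseRoot(XML);
        // Text of the node and all its descendants, concatenated.
        System.out.println(root.getTextContent());                    // BC
        // The root has no <name> of its own, only nested clades.
        System.out.println(ownName(root));                            // null
        // The first nested clade does have its own <name>.
        System.out.println(ownName((Element) root.getFirstChild()));  // B
    }
}
```

Checking getNodeType() before reading a child is what keeps whitespace text nodes and nested clades out of the result.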

Bill
 
Lucas Halliwell
Greenhorn
Posts: 8
I had already read that documentation. I've read almost everything related to DOM and SAX, but I don't know how to apply what I read to my case.
 
William Brogden
Author and all-around good cowpoke
Posts: 13078
6
Ah, a challenge - try this.

Imagine you have an assistant who can read text and locate / recognize words.

Now imagine handing that assistant a paper copy of the XML document.

What sequence of simple commands would you give this assistant to locate the data elements you want and, having located them, place them in a data structure?

This free online book has many examples of extracting data from XML.

Bill

 
Lucas Halliwell
Greenhorn
Posts: 8
I guess I should use recursion for that, and to be honest I'm not very good at it. I'm still trying to get something out of the XML file, but it's driving me crazy. Is there a way (method) to convert a NodeList to ArrayList?? That will solve my problem.
Moreover, I have to finish this by tomorrow / the end of Monday. I'm freaking out.
 
Marshal
Posts: 28226
95

Lucas Halliwell wrote: Is there a way (method) to convert a NodeList to ArrayList?? That will solve my problem.



Well, sure.

1. Create a new empty ArrayList.

2. Write a loop which goes through the NodeList one entry at a time.

3. Add each of the entries to the ArrayList.

Does that solve your problem?
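
The three steps above can be sketched as follows. This is only an illustration: the class name NodeListToList, the helpers toList and countClades, and the tiny XML string are all made up for the example, and the clade element name is assumed from the output posted earlier.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeListToList {
    // Copies every entry of a NodeList into a new ArrayList.
    static List<Node> toList(NodeList nodes) {
        List<Node> result = new ArrayList<>(nodes.getLength()); // step 1
        for (int i = 0; i < nodes.getLength(); i++) {           // step 2
            result.add(nodes.item(i));                          // step 3
        }
        return result;
    }

    // Parses the XML and counts all <clade> elements via the copied list.
    static int countClades(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return toList(doc.getElementsByTagName("clade")).size();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: one root clade with two nested clades.
        System.out.println(countClades("<clade><clade/><clade/></clade>")); // 3
    }
}
```

Note that the copy is flat: getElementsByTagName returns every matching element in document order, so the nesting that matters for the grouping is not preserved by this step alone.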
 
Ulf Dittmer
Rancher
Posts: 43081
77
Here's an example of how to use recursion to traverse the DOM:


Given the XML you posted earlier it produces this:


 
Lucas Halliwell
Greenhorn
Posts: 8
Hey Ulf Dittmer, may I ask for a suggestion on how to group the elements into an ArrayList according to the layout of the tree? I should end up with an ArrayList of ArrayLists in which each element is a subtree. For example:
A[0]={{B,C,D},{A,E}} // top level of the tree, representing the 2 subtrees
A[1]={{B,C},{D}} // because there is a subtree with those elements (decomposition of the right subtree)
A[2]={B,C} // the subtree of the subtree (decomposition of the right subtree)
A[3]={D} // same as the previous one (decomposition of the right subtree)
A[4]={A,E} // decomposition of the left subtree

I don't need to go so deep that I end up with single ungrouped elements (leaves).

Many thanks for the big help you're providing me.

For Paul: this is what the outcome of the parsing should be.
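
One possible way to get groups like these is a recursive method that returns the leaf names under a clade and records one group for every clade that contains other clades. The sketch below is not anyone's posted solution. The original XML is not shown in this thread, so the input is a guess reconstructed from the output posted earlier: the nesting, the name child element and the branch_length values are assumptions, and the class CladeGroups with its methods collect and groups is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CladeGroups {
    // Assumed input, reconstructed from the posted output (not the real file).
    static final String XML = "<clade>"
        + "<clade branch_length=\"4.25\">"
        +   "<clade branch_length=\"3.5\">"
        +     "<clade branch_length=\"3.5\"><name>B</name></clade>"
        +     "<clade branch_length=\"3.5\"><name>C</name></clade>"
        +   "</clade>"
        +   "<clade branch_length=\"7.0\"><name>D</name></clade>"
        + "</clade>"
        + "<clade branch_length=\"10.25\">"
        +   "<clade branch_length=\"1.0\"><name>A</name></clade>"
        +   "<clade branch_length=\"1.0\"><name>E</name></clade>"
        + "</clade>"
        + "</clade>";

    // Returns all names found under this clade. As a side effect, adds one
    // group to 'groups' for every clade that has at least one child clade.
    static List<String> collect(Element clade, List<List<String>> groups) {
        List<String> names = new ArrayList<>();
        boolean hasChildClade = false;
        for (Node n = clade.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes, comments, etc.
            }
            if ("clade".equals(n.getNodeName())) {
                hasChildClade = true;
                names.addAll(collect((Element) n, groups)); // recurse into subtree
            } else if ("name".equals(n.getNodeName())) {
                names.add(n.getTextContent().trim());
            }
        }
        if (hasChildClade) {
            groups.add(names); // inner subtrees are recorded before outer ones
        }
        return names;
    }

    // Parses the XML and returns the groups, innermost subtrees first.
    static List<List<String>> groups(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            List<List<String>> result = new ArrayList<>();
            collect(root, result);
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(groups(XML)); // [[B, C], [B, C, D], [A, E], [B, C, D, A, E]]
    }
}
```

Single leaves such as {D} are deliberately left out here, since the posts above disagree on whether they count as a group; changing the hasChildClade test would include them. With numeric IDs instead of names, the same traversal should work with only the element being read changed.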
 