how to extract part of data from XML file

 
Greenhorn
Posts: 7
Hi,

I have an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<BILLING11><CARRIER>Verizon</CARRIER>
<DID>999998</DID>
<GMT>20071113</GMT>
<BR><EI>1768221198</EI>
<CID>200</CID>
<ADS>211</ADS>
</BR>
<BR><EI>1768221200</EI>
<CID>200</CID>
<ADS>219</ADS>
</BR>
<CT>2</CT>
</BILLING11>

Here I have 2 <BR> records, and <CT> gives the record count.

I want to extract each record and write it into a new XML file.
For example, I want to extract the following from the XML file above:

<BR><EI>1768221198</EI>
<CID>200</CID>
<ADS>211</ADS>
</BR>

Any idea how to do this?

Thanks in advance.
Vishwa
 
Ranch Hand
Posts: 186
You can use either a SAX parser or a DOM parser.
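As a rough illustration of the DOM route, here is a minimal sketch using only the JDK's built-in parser and serializer. The class name BrDomExtractor and the method extractRecords are made up for this example, and the sample input assumes the posted XML is closed with </BILLING11>.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are not from the thread.
class BrDomExtractor {

    // Assumption: the sample from the question, closed with </BILLING11>.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n<DID>999998</DID>\n<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n</BILLING11>";

    // Returns every <BR> record serialized as its own small XML fragment.
    static List<String> extractRecords(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(xml)));

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> records = new ArrayList<>();
        NodeList brNodes = source.getElementsByTagName("BR");
        for (int i = 0; i < brNodes.getLength(); i++) {
            // Copy the record into a fresh document so it can be written on its own.
            Document target = builder.newDocument();
            Node copy = target.importNode(brNodes.item(i), true);
            target.appendChild(copy);

            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(target), new StreamResult(out));
            records.add(out.toString());
        }
        return records;
    }

    public static void main(String[] args) throws Exception {
        for (String record : extractRecords(SAMPLE)) {
            System.out.println(record);
            System.out.println("----");
        }
    }
}
```

To write each record to a file instead of a string, a StreamResult built from a java.io.File could be passed to transform in place of the StringWriter.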
sudha
 
Ranch Hand
Posts: 126
StAX is the most appropriate parser for efficiently reading parts of an XML document.
With StAX you can move through the XML data like a cursor.
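A sketch of what that could look like with the JDK's StAX event API: events are read one at a time, and everything between a <BR> start tag and its end tag is copied to a separate writer. The class and method names are invented for this example, and the input is assumed to be the posted XML closed with </BILLING11>.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class; the names are not from the thread.
class BrStaxExtractor {

    // Assumption: the sample from the question, closed with </BILLING11>.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n<DID>999998</DID>\n<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n</BILLING11>";

    // Streams through the input and copies each <BR>...</BR> span to its own string.
    static List<String> extractRecords(String xml) throws Exception {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();

        List<String> records = new ArrayList<>();
        StringWriter buffer = null;
        XMLEventWriter writer = null;   // non-null only while inside a <BR> record

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "BR".equals(event.asStartElement().getName().getLocalPart())) {
                buffer = new StringWriter();
                writer = outputFactory.createXMLEventWriter(buffer);
            }
            if (writer != null) {
                writer.add(event);
                if (event.isEndElement()
                        && "BR".equals(event.asEndElement().getName().getLocalPart())) {
                    writer.flush();
                    writer.close();
                    records.add(buffer.toString());
                    writer = null;
                }
            }
        }
        reader.close();
        return records;
    }

    public static void main(String[] args) throws Exception {
        extractRecords(SAMPLE).forEach(System.out::println);
    }
}
```

Because only the current record is buffered, this approach should not need the whole document in memory, which matters mostly for large files.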

thanks
Raees
 
Bartender
Posts: 1638
Do you have an XSD for this XML, by any chance?
If so, you can use XMLBeans, which does two things:

  • Hides the parser implementation
  • Lets you use XPath/XQuery to point to any element inside the XML and work on it

If not, any open source XPath engine will do the trick. Parsing the whole XML using any sort of parser will always be more cryptic and give you awful performance compared to XPath.
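For reference, a minimal sketch of the XPath route using the javax.xml.xpath API that ships with the JDK (not XMLBeans). The class and method names are invented for this example, the input is assumed to be the posted XML closed with </BILLING11>, and no claim about speed is implied.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are not from the thread.
class BrXPathLookup {

    // Assumption: the sample from the question, closed with </BILLING11>.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n<DID>999998</DID>\n<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n</BILLING11>";

    // Counts the <BR> records directly under the root element.
    static int countRecords(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Double count = (Double) xpath.evaluate(
            "count(/BILLING11/BR)", new InputSource(new StringReader(xml)),
            XPathConstants.NUMBER);
        return count.intValue();
    }

    // Returns the <EI> text of the record at the given 1-based position.
    static String eiOfRecord(String xml, int position) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(
            "/BILLING11/BR[" + position + "]/EI",
            new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countRecords(SAMPLE));
        System.out.println(eiOfRecord(SAMPLE, 2));
    }
}
```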
     
    Author and all-around good cowpoke
    Posts: 13078

    Parsing the whole XML using any sort of parser will always be more cryptic and give you awful performance compared to XPath.



    Alas, this is incorrect. XPath has to work on top of the standard library so will always be slower. The advantage of XPath is clarity of expression and reduced lines of code, not speed.

    I did some timing experiments for this article - it is a big difference.

    Bill
     
    sudha swami
    Ranch Hand
    Posts: 186
    Hi,
    Correct me if I am wrong.

    If I want to parse part of the XML data without considering speed, then XPath would be better than a SAX/DOM parser?

    regards
    sudha
     
    Marshal
    Posts: 28293
    No, that isn't even wrong. You can't use XPath until you have already parsed the XML using the DOM parser.
     
    sudha swami
    Ranch Hand
    Posts: 186
    thanks for the info
     
    Nitesh Kant
    Bartender
    Posts: 1638

    William:
    Alas, this is incorrect. XPath has to work on top of the standard library so will always be slower. The advantage of XPath is clarity of expression and reduced lines of code, not speed.



    Do you think this is generic behavior, or does it also depend on the XPath engine implementation? (I wanted to do some R&D and come up with results, but I really did not have time, so I am asking for an opinion. Obviously the comparison must use the same parsing methodology with and without the XPath engine.)
    I was just wondering: it does not make sense for the performance to go down alarmingly using XPath as compared to a node-wise search, at least for the simple XPath that is used in your article. More so if the XPath is pre-compiled. What do you say?
     
    William Brogden
    Author and all-around good cowpoke
    Posts: 13078

    I was just wondering: it does not make sense for the performance to go down alarmingly using XPath as compared to a node-wise search, at least for the simple XPath that is used in your article. More so if the XPath is pre-compiled. What do you say?



    I say it makes perfect sense. No magic is going on here, XPath interpretation has only the tools in org.w3c.dom (in whatever actual implementation) to work with. The search has to interpret the path in those terms.
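To make the comparison concrete, here is a sketch that gets the same values from one parsed org.w3c.dom Document in two ways: a hand-written walk over child nodes and a pre-compiled XPathExpression from the JDK's javax.xml.xpath API. The class and method names are invented for this example, the input assumes the posted XML closed with </BILLING11>, and no timing result is claimed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical comparison class; the names are not from the thread.
class DomVersusXPath {

    // Assumption: the sample from the question, closed with </BILLING11>.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n<DID>999998</DID>\n<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n</BILLING11>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    static XPathExpression compile(String expression) throws Exception {
        return XPathFactory.newInstance().newXPath().compile(expression);
    }

    // Node-wise search: root -> BR children -> EI children.
    static List<String> eiByDomWalk(Document doc) {
        List<String> values = new ArrayList<>();
        NodeList top = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < top.getLength(); i++) {
            if (!"BR".equals(top.item(i).getNodeName())) continue;
            NodeList fields = top.item(i).getChildNodes();
            for (int j = 0; j < fields.getLength(); j++) {
                if ("EI".equals(fields.item(j).getNodeName())) {
                    values.add(fields.item(j).getTextContent());
                }
            }
        }
        return values;
    }

    // Same lookup through a pre-compiled XPath expression on the same Document.
    static List<String> eiByXPath(Document doc, XPathExpression expr) throws Exception {
        NodeList hits = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            values.add(hits.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(eiByDomWalk(doc));
        System.out.println(eiByXPath(doc, compile("/BILLING11/BR/EI")));
    }
}
```

Wrapping each call in System.nanoTime() measurements over many iterations would be one way to run your own comparison on your own documents.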

    I suggest you take a look at the XPath specification (for example XPath 2.0) - it is always in terms of a DOM.

    Now - if you want to try to figure out an XPath-like high speed scan using SAX or StAX - great, but it won't be XPath. XPath-like syntax seems to be getting popular - for example this Apache project applying the syntax to object graphs!
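One possible shape for such an XPath-like scan, as a rough sketch: a SAX handler keeps a stack of open element names and collects text whenever the current path equals a simple slash-separated target. This handles only plain absolute paths with no predicates or axes, so it is not XPath. The class and method names are invented, and the input assumes the posted XML closed with </BILLING11>.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical scanner class; the names are not from the thread.
class SaxPathScan {

    // Assumption: the sample from the question, closed with </BILLING11>.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<BILLING11><CARRIER>Verizon</CARRIER>\n<DID>999998</DID>\n<GMT>20071113</GMT>\n"
        + "<BR><EI>1768221198</EI>\n<CID>200</CID>\n<ADS>211</ADS>\n</BR>\n"
        + "<BR><EI>1768221200</EI>\n<CID>200</CID>\n<ADS>219</ADS>\n</BR>\n"
        + "<CT>2</CT>\n</BILLING11>";

    // Collects the text of every element whose path from the root equals targetPath.
    static List<String> textAt(String xml, String targetPath) throws Exception {
        List<String> hits = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final Deque<String> path = new ArrayDeque<>();
            private StringBuilder text;   // non-null only while inside a matching element

            private String currentPath() {
                return "/" + String.join("/", path);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.addLast(qName);
                if (currentPath().equals(targetPath)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (text != null && currentPath().equals(targetPath)) {
                    hits.add(text.toString());
                    text = null;
                }
                path.removeLast();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return hits;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textAt(SAMPLE, "/BILLING11/BR/EI"));
    }
}
```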

    Bill
     