
Skipping XML events using StAX

 
Siva Vulchi
Ranch Hand
Posts: 30
Hi,

I have a requirement to skip unnecessary XML events, and I have the following code for that.
XMLInputFactory xmlif = XMLInputFactory.newInstance();
XMLStreamReader xmlr = xmlif.createXMLStreamReader(new FileInputStream("XML file path"));
while (xmlr.hasNext()) {
    // Advance on every iteration. Advancing only inside the cases would
    // loop forever on any other event type (e.g. START_DOCUMENT).
    switch (xmlr.next()) {
        case XMLEvent.START_ELEMENT:
            // do some processing
            break;
        case XMLEvent.END_ELEMENT:
            // do some processing
            break;
        default:
            // skip every other event
            break;
    }
}
I have also heard about StreamFilters, which can be used to skip events as well.
public class FilterImpl implements StreamFilter {
    // Accept only start and end element events; everything else is skipped.
    public boolean accept(XMLStreamReader myReader) {
        return myReader.isStartElement() || myReader.isEndElement();
    }
}

I have thousands of XML records to iterate through, so could you please tell me which of these two is the more efficient way to skip XML events?


Siva
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13058
Since the basic time-consuming work of a parser is making sense of a stream of characters in terms of XML events, and the logic you are talking about runs only after an event has been parsed, I think it is very unlikely that there will be any significant difference.


However, you have a golden opportunity to add to the total knowledge base on the ranch by timing the two approaches. Let us know what happens.

My rule in such cases is to go with the approach that results in the cleanest code.
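
To make that concrete, here is a minimal sketch. The sample document and the class name are made up for illustration, and calling accept() by hand is only for demonstration (normally createFilteredReader would do that for you). It applies the inline switch test and a StreamFilter test to the same already-parsed events, so in both cases the skip decision is one cheap check made after the parser has done the expensive part.

```java
import java.io.StringReader;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SkipDecisionDemo {
    // Hypothetical sample document, made up for this sketch.
    static final String SAMPLE =
            "<root><!-- note --><item>one</item><item>two</item></root>";

    // The same test a StreamFilter implementation would apply.
    static final StreamFilter ELEMENTS_ONLY =
            reader -> reader.isStartElement() || reader.isEndElement();

    // Returns {eventsKeptBySwitch, eventsKeptByFilter, totalEventsRead}.
    static int[] compare(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int bySwitch = 0;
            int byFilter = 0;
            int total = 0;
            while (r.hasNext()) {
                int type = r.next(); // the parsing work happens here
                total++;

                // Approach 1: inline switch on the event type.
                switch (type) {
                    case XMLStreamConstants.START_ELEMENT:
                    case XMLStreamConstants.END_ELEMENT:
                        bySwitch++;
                        break;
                    default:
                        break; // skipped
                }

                // Approach 2: ask the filter about the very same event.
                if (ELEMENTS_ONLY.accept(r)) {
                    byFilter++;
                }
            }
            r.close();
            return new int[] {bySwitch, byFilter, total};
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] c = compare(SAMPLE);
        System.out.println("switch kept " + c[0] + ", filter kept " + c[1]
                + " of " + c[2] + " events");
    }
}
```

Both counters end up equal because the two tests select the same events; what differs is only where the test lives in your code.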

Bill
 
Siva Vulchi
Ranch Hand
Posts: 30
Thanks Bill.

Will definitely post the timing results of those 2 approaches.

Siva
 
Siva Vulchi
Ranch Hand
Posts: 30
I timed the two approaches and found that the normal approach was slightly faster than the stream filter approach.

My input XML file has 8600 nodes, and each node has 6 start elements and 6 end elements.

When I ran the program below, the normal approach took ~133 ms and the stream filter approach took ~139 ms.

import java.io.FileInputStream;

import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

public class StaxComparison {

    public static void main(String[] args) {
        long total = 0;

        // Times the filter approach as posted; call processReader() here
        // instead to time the normal approach.
        for (int i = 0; i < 100; i++) {
            long start = System.currentTimeMillis();
            processFilterReader();
            total += System.currentTimeMillis() - start;
        }
        // Average time per run, in milliseconds.
        System.out.println(total / 100);
    }

    // Normal approach: read every event and switch on its type.
    public static void processReader() {
        try {
            XMLInputFactory xmlif = XMLInputFactory.newInstance();
            XMLStreamReader xmlr = xmlif.createXMLStreamReader(new FileInputStream("test.xml"));
            while (xmlr.hasNext()) {
                switch (xmlr.next()) {
                    case XMLEvent.START_ELEMENT:
                        // System.out.println(xmlr.getLocalName());
                        break;
                    case XMLEvent.END_ELEMENT:
                        // System.out.println(xmlr.getLocalName());
                        break;
                }
            }
            xmlr.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Stream filter approach: the filter drops everything but start/end elements.
    public static void processFilterReader() {
        try {
            XMLInputFactory xmlif = XMLInputFactory.newInstance();
            XMLStreamReader xmlr = xmlif.createFilteredReader(
                    xmlif.createXMLStreamReader(new FileInputStream("test.xml")),
                    new MyStreamFilter());
            while (xmlr.hasNext()) {
                xmlr.next();
                // System.out.println(xmlr.getLocalName());
            }
            xmlr.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class MyStreamFilter implements StreamFilter {
    public boolean accept(XMLStreamReader reader) {
        return reader.isStartElement() || reader.isEndElement();
    }
}

Siva
 
Paul Clapham
Sheriff
Posts: 20980
In other words there's basically no difference.
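
If anyone wants to re-check this, here is a rough sketch of a somewhat more careful timing run. It is not the program from the post above: it assumes a synthetic in-memory document (the buildXml helper, the element names, and the class name are all invented here) so that disk I/O stays out of the measurement, it does a few warm-up passes before timing, and it uses System.nanoTime(). Numbers will vary with the JVM and the parser implementation, so treat any result as indicative only.

```java
import java.io.StringReader;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTimingSketch {

    // Builds a synthetic document; record layout and names are made up.
    static String buildXml(int records) {
        StringBuilder sb = new StringBuilder("<records>");
        for (int i = 0; i < records; i++) {
            sb.append("<record><a>1</a><b>2</b><c>3</c><d>4</d><e>5</e></record>");
        }
        return sb.append("</records>").toString();
    }

    // Plain reader: look at every event, count only start/end elements.
    static int plainPass(XMLInputFactory f, String xml) throws XMLStreamException {
        XMLStreamReader r = f.createXMLStreamReader(new StringReader(xml));
        int seen = 0;
        while (r.hasNext()) {
            int type = r.next();
            if (type == XMLStreamConstants.START_ELEMENT
                    || type == XMLStreamConstants.END_ELEMENT) {
                seen++;
            }
        }
        r.close();
        return seen;
    }

    // Filtered reader: the filter decides which events are delivered.
    static int filteredPass(XMLInputFactory f, String xml) throws XMLStreamException {
        StreamFilter elementsOnly =
                reader -> reader.isStartElement() || reader.isEndElement();
        XMLStreamReader r = f.createFilteredReader(
                f.createXMLStreamReader(new StringReader(xml)), elementsOnly);
        int seen = 0;
        while (r.hasNext()) {
            r.next();
            seen++;
        }
        r.close();
        return seen;
    }

    // Returns {plain nanos per run, filtered nanos per run}; runs must be > 0.
    static long[] measure(int records, int warmups, int runs) {
        String xml = buildXml(records);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        long plainNanos = 0;
        long filteredNanos = 0;
        int sink = 0; // keeps the counted results in use
        try {
            // Alternate the two passes so both see similar JVM conditions.
            for (int i = 0; i < warmups + runs; i++) {
                long t0 = System.nanoTime();
                sink += plainPass(factory, xml);
                long t1 = System.nanoTime();
                sink += filteredPass(factory, xml);
                long t2 = System.nanoTime();
                if (i >= warmups) { // ignore warm-up iterations
                    plainNanos += t1 - t0;
                    filteredNanos += t2 - t1;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        if (sink == 0) {
            throw new IllegalStateException("no events were counted");
        }
        return new long[] {plainNanos / runs, filteredNanos / runs};
    }

    public static void main(String[] args) {
        long[] t = measure(8600, 20, 100);
        System.out.printf("plain: %d us, filtered: %d us%n", t[0] / 1000, t[1] / 1000);
    }
}
```

For anything more rigorous than a sanity check, a dedicated benchmarking harness would be a better choice than a hand-rolled loop like this one.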
 