Search large xml files

Dean Reedy
Ranch Hand

Joined: Sep 10, 2001
Posts: 89
I need to search XML files that could be as large as 1 MB. The user types in one or more words, and I search the whole XML document for occurrences of them. The search covers everything: the nodes, the attributes, and the text/data.
I have tried both XML DOM and SAX, and the search sometimes takes a couple of minutes.
What is the fastest way to search an XML document? What can I do to speed up my search times?
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

DOM of course is going to take a lot of time up front, building a tree in memory. SAX essentially guarantees a method call on every element. Both approaches are predicated on the idea that context is as important as the word you want to find.
If all you really want to do is find a string, don't use either tool. Put the parser tools away, and just use regular expression pattern matching to find it.
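
As a rough illustration of the regular-expression approach, here is a minimal sketch that treats the XML as plain text and reports where the word occurs. The class and method names (RawXmlSearch, findOffsets, findInFile) are made up for this example, and it assumes the file is UTF-8 and small enough to read into memory, which a 1 MB file is.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: searches the raw XML text without parsing it.
final class RawXmlSearch {

    /** Returns the character offset of every case-insensitive occurrence of word in text. */
    static List<Integer> findOffsets(String text, String word) {
        // Pattern.quote makes the user's input a literal, so characters
        // such as '.' or '*' are not interpreted as regex syntax.
        Pattern pattern = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        List<Integer> offsets = new ArrayList<>();
        while (matcher.find()) {
            offsets.add(matcher.start());
        }
        return offsets;
    }

    /** Reads the whole file as UTF-8 (an assumption) and searches it. */
    static List<Integer> findInFile(Path file, String word) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return findOffsets(text, word);
    }
}
```

Because nothing is parsed, this matches element names, attribute names, attribute values and text alike, which is what the original question asks for. The trade-off is that it knows nothing about context: it cannot say which element a hit belongs to, and text stored as entity references (for example &amp;amp;) will not match its decoded form.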


Make visible what, without you, might perhaps never have been seen.
- Robert Bresson
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 15632

... Or consider an XML-structured DBMS.


Customer surveys are for companies who didn't pay proper attention to begin with.
Ajith Kallambella
Sheriff

Joined: Mar 17, 2000
Posts: 5782
A combination of JDOM with custom data structures to facilitate search/lookup (e.g. hashtables) can perform a lot better than plain DOM.
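
To make the lookup-structure idea concrete, here is a minimal sketch of a word index held in a hash map. Note that it is not JDOM: to stay self-contained it uses the SAX parser that ships with the JDK to walk the document once, and the class name XmlWordIndex and its methods are invented for this example. The same map could just as well be filled while walking a JDOM tree.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical index: maps each lower-cased word to the names of the
// elements in which it appears (as a name, attribute, or text).
final class XmlWordIndex {
    private final Map<String, Set<String>> index = new HashMap<>();

    /** Parses the document once and builds the index. */
    static XmlWordIndex build(String xml) {
        XmlWordIndex idx = new XmlWordIndex();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), idx.new Handler());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return idx;
    }

    /** Each search is a single hash lookup; no re-parsing is needed. */
    Set<String> lookup(String word) {
        return index.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
    }

    private void add(String text, String element) {
        // Split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(element);
            }
        }
    }

    private final class Handler extends DefaultHandler {
        private final Deque<String> open = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();

        // SAX may deliver text in several chunks, so buffer it and
        // index it only when the next tag boundary is reached.
        private void flush() {
            if (!open.isEmpty() && text.length() > 0) {
                add(text.toString(), open.peek());
            }
            text.setLength(0);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            flush();
            open.push(qName);
            add(qName, qName);
            for (int i = 0; i < atts.getLength(); i++) {
                add(atts.getQName(i), qName);
                add(atts.getValue(i), qName);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            flush();
            open.pop();
        }
    }
}
```

For example, XmlWordIndex.build(xml).lookup("pie") returns the names of the elements that mention "pie". The parsing cost is paid once when the index is built, so this only helps if the same document is searched repeatedly. It also matches whole words only, not arbitrary substrings.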


Open Group Certified Distinguished IT Architect. Open Group Certified Master IT Architect. Sun Certified Architect (SCEA).
 