seeking your advice in choosing XML or RDBMS for Storing data

Ramina Nibilian
Ranch Hand

Joined: Mar 30, 2005
Posts: 65
Hi, thank you for reading my post.
I want to make a dictionary for English/locale words, so there will be a lot of lookup operations on my dictionary.
I was wondering: is it possible to store the words in XML format instead of in an RDBMS?
If so, for about 60,000 words, is it time-efficient to use XML as the data storage?
Another question I have is about searching in XML files: is there a package that makes efficient searching possible?

Overall, my question is about choosing XML or an RDBMS for storing dictionary data.
If XML, what library should I use?
If XML is not suitable, is there an embedded database that supports Unicode for my purpose?

Thank you very much.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
Personally I would use an XML document for the data storage because it will be easy to edit with a text editor and easy to reformat for various purposes with XSLT and other XML tools.
You would want to pull this document into an application and build a lookup structure optimized for speed and flexibility. The Java collections such as HashMap are handy for this.
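As a rough sketch of that idea, the snippet below parses a small XML document with the JDK's built-in DOM parser and builds a HashMap keyed by the lower-cased word. The element names (dictionary, entry, word, meaning) and the class name XmlDictionaryLoader are assumptions made up for illustration; the thread does not specify a file layout.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reads <entry><word>..</word><meaning>..</meaning></entry>
// elements (an assumed layout) into an in-memory lookup table.
class XmlDictionaryLoader {

    // Parses the document once and indexes each meaning by its lower-cased word.
    static Map<String, String> load(InputStream xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(xml);
            NodeList entries = doc.getElementsByTagName("entry");
            Map<String, String> index = new HashMap<>(entries.getLength() * 2);
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                String word = entry.getElementsByTagName("word").item(0).getTextContent().trim();
                String meaning = entry.getElementsByTagName("meaning").item(0).getTextContent().trim();
                index.put(word.toLowerCase(Locale.ROOT), meaning);
            }
            return index;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse dictionary XML", e);
        }
    }

    // Convenience overload for a document that is already held in a String.
    static Map<String, String> load(String xml) {
        return load(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

After loading, a lookup is just index.get(word.toLowerCase(Locale.ROOT)). The XML file is only read once at startup, so parsing cost does not affect lookup time.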

Consider adding phonetic code lookup if there is any chance of users not always using the right spelling. I have an example of phonetic lookup from a 59,956-word dictionary here.
Bill
Ramina Nibilian
Ranch Hand

Joined: Mar 30, 2005
Posts: 65
Hi,
That is fantastic, I mean the speed.
It's very fast. Did you use XML for its storage?
How do you search all of the words so fast? Do you use an in-memory table or something like that?

Can you please tell me more about how you have done it?
Are your database and algorithm open for use in OSS projects?
I saw that you used some pieces of Jakarta projects.
What about the other stuff?

Sorry for so many questions.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12769
In that example, the words are read from a flat file when the servlet first starts, the phonetic code is computed and then used as a key for a hashtable. Since more than one word can give a particular code, the value stored in the table is an ArrayList holding references to the original Strings. The data structures stay in memory. HashMap lookup is indeed very fast, much faster than a DB query.
You are welcome to use the code however you want, but note that the Jakarta Commons project has the CODEC collection of tools that includes the Metaphone algorithm and other useful goodies. Lawrence Philips wrote the original Metaphone implementation (in frustration with the Soundex algorithm, I think).
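To give a feel for the structure described above, here is a minimal sketch of a phonetic index: a HashMap from phonetic code to an ArrayList of the original words. It is not the servlet's actual code. To stay within the JDK, it swaps in a plain American Soundex instead of Metaphone, and the class name PhoneticIndex and its methods are made up for illustration. This Soundex only looks at the letters A to Z, so words in other scripts would need a different encoder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: phonetic code -> list of words sharing that code.
class PhoneticIndex {

    private final Map<String, List<String>> byCode = new HashMap<>();

    // Stores the word under its phonetic code; words with no A-Z letters are skipped.
    void add(String word) {
        String code = soundex(word);
        if (!code.isEmpty()) {
            byCode.computeIfAbsent(code, k -> new ArrayList<>()).add(word);
        }
    }

    // Returns every stored word whose code matches the query's code.
    List<String> lookup(String query) {
        return byCode.getOrDefault(soundex(query), Collections.emptyList());
    }

    // American Soundex: first letter plus three digits, padded with zeros.
    static String soundex(String word) {
        String letters = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        StringBuilder code = new StringBuilder().append(letters.charAt(0));
        char previous = digit(letters.charAt(0));
        for (int i = 1; i < letters.length() && code.length() < 4; i++) {
            char c = letters.charAt(i);
            char d = digit(c);
            if (d != '0' && d != previous) {
                code.append(d);
            }
            // H and W do not separate two letters with the same code; vowels do.
            if (c != 'H' && c != 'W') {
                previous = d;
            }
        }
        while (code.length() < 4) {
            code.append('0');
        }
        return code.toString();
    }

    // Maps a consonant to its Soundex digit; vowels, H, W and Y map to '0'.
    private static char digit(char c) {
        switch (c) {
            case 'B': case 'F': case 'P': case 'V': return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z': return '2';
            case 'D': case 'T': return '3';
            case 'L': return '4';
            case 'M': case 'N': return '5';
            case 'R': return '6';
            default: return '0';
        }
    }
}
```

With "Robert" and "Rupert" added, a misspelled query such as "Robbert" produces the same code and returns both words. The caller can then rank or filter the candidates as needed.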
Bill
 