aspose file tools*
The moose likes Java in General and the fly likes ==> need search algorithm help Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of EJB 3 in Action this week in the EJB and other Java EE Technologies forum!
JavaRanch » Java Forums » Java » Java in General
Bookmark "==> need search algorithm help" Watch "==> need search algorithm help" New topic
Author

==> need search algorithm help

Frank king
Greenhorn

Joined: Oct 10, 2004
Posts: 10
I have a database includes list of book names and price.

when key in a book name, system will return it selling price.

however, if user keyed in name wrongly, for exmaple. "risl management" instead of real name "risk management", my system won't know how to handle this type of problem.

another problem is : if user doesn't key in the full name of the book, how the system could know which book the user is looking fot?

any search algorithm can solve this kinds of problems??

thanks, waiting for your help asap.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12678
    
    5
Assisting users to make correct / workable queries is indeed a large problem that takes lots of design effort. I will start with your book title problem, which is even bigger with author's names.

In order to recognize that "risl management" can't possibly be a real book title and provide your user with some help, you could create a collection of all individual words that appear in titles. Use that collection to lookup the words and warn the user that "risl" does not appear in the database.

In order to suggest possible words to the user a phonetic lookup is handy. I used the "metaphone" algorithm in an application for a legal service firm where they were always trying to figure out the alternate spellings for names from court records.

The Apache SWF Commons CODEC toolkit has several phonetic algorithms.

You can also get my version of metaphone in java from my web site.
Bill


Java Resources at www.wbrogden.com
Frank king
Greenhorn

Joined: Oct 10, 2004
Posts: 10
A great thanks for your advice:

I just did a brief study on metaphone search.

however, I am not sure how this kind of search can help me with correcting typos..

like "risl" in metaphontic search is "RSL"

while "risk" is "PSK"

how can I match them together and suggest the user with "are you searching 'risk'?"

thanks
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

I wouldn't try to help too much in such cases, because you wouldn't find a lot of typos, which create words, contained in your list.

Spelling algos depend on a language - don't they?
And books often contain foreign words - more often authors.
Without knowing the language, your chance of getting a good result is low.

Beaujolais 1994-1998, Dr. Heiner Brandt, Suhrkamp 1999
Quod erat demonstrandum, Vaclav Kavlyszcinsky, rororo 2001

Not found, unknown word: 'risl'
is enough information for the user, to requery again.

Users too stupid for that aren't able to read books.


http://home.arcor.de/hirnstrom/bewerbung
Arnold Reuser
Ranch Hand

Joined: Nov 20, 2003
Posts: 194
From my point of view your question can have the following three high level solutions :

1. the metaphone solution
2. the ontology solution
3. the match solution

The metaphone solution returns a list of results, based on how an English word sounds. You've already covered and read about that topic.
The ontology solution returns a list of results, based on which English words have the same concepts and relationships. Maybe this is overkill for your kind of problem, but it is worth thinking about it.
The match solutions returns a list of results, based on some type of match algorithm you prefer. For example you could use the trigram match.

The first two solutions are language based and you have to stick to the language. The last solution isn't related to a type of language and is easy to implement. When you choose to use the trigram match, google the word to find more information, you can relatively easy find the related book, authors you are looking for and filter out the typos people make. But ofcourse this solution doesn't create a miracle, unless you see it as a miracle.
[ October 14, 2004: Message edited by: Arnold Reuser ]
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12678
    
    5
I just did a brief study on metaphone search.

however, I am not sure how this kind of search can help me with correcting typos..

like "risl" in metaphontic search is "RSL"

while "risk" is "PSK"

how can I match them together and suggest the user with "are you searching 'risk'?"

You are quite right that a phonetic lookup can only help with certain types of input errors. Also, as Stefan notes, each language has its own phonetic match problems. With my commercial text search program, when the user entered a word that did not appear in the database, we displayed a short list of alphabetically "close" words that did appear.

I have never run into an approach based on common keyboard errors - an interesting possibility though.

Bill
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

and keyboards differ from user to user too...
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: ==> need search algorithm help
 
Similar Threads
Searching an XML document
need MYSQL searching help
search algorithm help
XSLT merging help
Storing a password to encrypt/decrypt sensitive data.