Win a copy of The Java Performance Companion this week in the Performance forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

==> need search algorithm help

 
Frank king
Greenhorn
Posts: 10
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I have a database includes list of book names and price.

when key in a book name, system will return it selling price.

however, if user keyed in name wrongly, for exmaple. "risl management" instead of real name "risk management", my system won't know how to handle this type of problem.

another problem is : if user doesn't key in the full name of the book, how the system could know which book the user is looking fot?

any search algorithm can solve this kinds of problems??

thanks, waiting for your help asap.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13064
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Assisting users to make correct / workable queries is indeed a large problem that takes lots of design effort. I will start with your book title problem, which is even bigger with author's names.

In order to recognize that "risl management" can't possibly be a real book title and provide your user with some help, you could create a collection of all individual words that appear in titles. Use that collection to lookup the words and warn the user that "risl" does not appear in the database.

In order to suggest possible words to the user a phonetic lookup is handy. I used the "metaphone" algorithm in an application for a legal service firm where they were always trying to figure out the alternate spellings for names from court records.

The Apache SWF Commons CODEC toolkit has several phonetic algorithms.

You can also get my version of metaphone in java from my web site.
Bill
 
Frank king
Greenhorn
Posts: 10
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
A great thanks for your advice:

I just did a brief study on metaphone search.

however, I am not sure how this kind of search can help me with correcting typos..

like "risl" in metaphontic search is "RSL"

while "risk" is "PSK"

how can I match them together and suggest the user with "are you searching 'risk'?"

thanks
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I wouldn't try to help too much in such cases, because you wouldn't find a lot of typos, which create words, contained in your list.

Spelling algos depend on a language - don't they?
And books often contain foreign words - more often authors.
Without knowing the language, your chance of getting a good result is low.

Beaujolais 1994-1998, Dr. Heiner Brandt, Suhrkamp 1999
Quod erat demonstrandum, Vaclav Kavlyszcinsky, rororo 2001

Not found, unknown word: 'risl'
is enough information for the user, to requery again.

Users too stupid for that aren't able to read books.
 
Arnold Reuser
Ranch Hand
Posts: 196
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
From my point of view your question can have the following three high level solutions :

1. the metaphone solution
2. the ontology solution
3. the match solution

The metaphone solution returns a list of results, based on how an English word sounds. You've already covered and read about that topic.
The ontology solution returns a list of results, based on which English words have the same concepts and relationships. Maybe this is overkill for your kind of problem, but it is worth thinking about it.
The match solutions returns a list of results, based on some type of match algorithm you prefer. For example you could use the trigram match.

The first two solutions are language based and you have to stick to the language. The last solution isn't related to a type of language and is easy to implement. When you choose to use the trigram match, google the word to find more information, you can relatively easy find the related book, authors you are looking for and filter out the typos people make. But ofcourse this solution doesn't create a miracle, unless you see it as a miracle.
[ October 14, 2004: Message edited by: Arnold Reuser ]
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13064
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I just did a brief study on metaphone search.

however, I am not sure how this kind of search can help me with correcting typos..

like "risl" in metaphontic search is "RSL"

while "risk" is "PSK"

how can I match them together and suggest the user with "are you searching 'risk'?"

You are quite right that a phonetic lookup can only help with certain types of input errors. Also, as Stefan notes, each language has its own phonetic match problems. With my commercial text search program, when the user entered a word that did not appear in the database, we displayed a short list of alphabetically "close" words that did appear.

I have never run into an approach based on common keyboard errors - an interesting possibility though.

Bill
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
and keyboards differ from user to user too...
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic