This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.
Assisting users to make correct / workable queries is indeed a large problem that takes lots of design effort. I will start with your book title problem, which is even bigger with author's names.
In order to recognize that "risl management" can't possibly be a real book title and provide your user with some help, you could create a collection of all individual words that appear in titles. Use that collection to lookup the words and warn the user that "risl" does not appear in the database.
In order to suggest possible words to the user a phonetic lookup is handy. I used the "metaphone" algorithm in an application for a legal service firm where they were always trying to figure out the alternate spellings for names from court records.
From my point of view your question can have the following three high level solutions :
1. the metaphone solution 2. the ontology solution 3. the match solution
The metaphone solution returns a list of results, based on how an English word sounds. You've already covered and read about that topic. The ontology solution returns a list of results, based on which English words have the same concepts and relationships. Maybe this is overkill for your kind of problem, but it is worth thinking about it. The match solutions returns a list of results, based on some type of match algorithm you prefer. For example you could use the trigram match.
The first two solutions are language based and you have to stick to the language. The last solution isn't related to a type of language and is easy to implement. When you choose to use the trigram match, google the word to find more information, you can relatively easy find the related book, authors you are looking for and filter out the typos people make. But ofcourse this solution doesn't create a miracle, unless you see it as a miracle. [ October 14, 2004: Message edited by: Arnold Reuser ]
Author and all-around good cowpoke
Joined: Mar 22, 2000
I just did a brief study on metaphone search.
however, I am not sure how this kind of search can help me with correcting typos..
like "risl" in metaphontic search is "RSL"
while "risk" is "PSK"
how can I match them together and suggest the user with "are you searching 'risk'?"
You are quite right that a phonetic lookup can only help with certain types of input errors. Also, as Stefan notes, each language has its own phonetic match problems. With my commercial text search program, when the user entered a word that did not appear in the database, we displayed a short list of alphabetically "close" words that did appear.
I have never run into an approach based on common keyboard errors - an interesting possibility though.