The book uses a question-answering lookup system as an example project, with Mary Shelley's Frankenstein as the source text.
Are the technologies and techniques in the book applicable to a web-based search engine?
I'm thinking in terms of amount of input text and performance issues/limitations.
The Frankenstein example in the first chapter is really just a toy to get people thinking about the problem space. Chapter 8 contains a system that is a few levels up but still not production-ready, IMO. The concepts and basic principles are applicable to a web-based engine, but a lot more engineering and capability needs to go into a system to make it effective in that area. It is a bit closer to ready if you are working at a smaller scale, but you still have a lot of work to do, as the example really only handles simple fact-based questions and only returns a window of text around the candidate answer.
As for performance at web scale, you will often need to leverage some type of distributed text analysis pipeline up front to handle the incoming documents.
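To make "returns a window around the candidate answer" concrete, here is a minimal sketch in Python. It is not the book's code: the function name `answer_window`, the `radius` parameter, the whitespace tokenization, and the sample sentence are all assumptions made up for illustration.

```python
def answer_window(tokens, answer_index, radius=5):
    """Return the tokens within `radius` positions of the candidate answer.

    Hypothetical helper for illustration; not taken from the book.
    """
    # Clamp the window so it never runs past either end of the passage.
    start = max(0, answer_index - radius)
    end = min(len(tokens), answer_index + radius + 1)
    return tokens[start:end]


# Made-up sample sentence, split naively on whitespace.
tokens = "Victor Frankenstein was born in Geneva to a distinguished family".split()
candidate = tokens.index("Geneva")  # pretend an upstream step picked this token
print(" ".join(answer_window(tokens, candidate, radius=2)))
```

A real system would still need to find and rank the candidate answers before this step, which is where most of the engineering effort goes.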
I was thinking about different use cases for text search applications.
It is a particularly big problem when it comes to company documentation. The answer is in there... somewhere.
It is probably closer to ready for company documentation or an intranet, but that is still a non-trivial exercise. What do you have in place for search? I'd probably start there first.