Taming Text

 
paul nisset
Ranch Hand
Posts: 572
Hi,
The book uses a question-answering lookup system as an example project, with Mary Shelley's Frankenstein as the source text.
Are the technologies and techniques in the book applicable to a web-based search engine?
I'm thinking in terms of the amount of input text and any performance issues or limitations.

Thanks,
Paul
 
Grant Ingersoll
Author
Posts: 8
Hi Paul,

The Frankenstein example in the first chapter is really just a toy to get people thinking about the problem space. Chapter 8 contains a system that is a few levels up, but still not production-ready, IMO. The concepts and basic principles are applicable to a web-based engine, but a lot more engineering and many more capabilities need to go into a system to make it effective in that area. It is a bit closer to ready if you are working at a smaller scale, but you would still have a lot of work to do, as the example really only handles simple fact-based questions and only returns a window of text around the candidate answer.
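
To make the "window around the candidate answer" idea concrete, here is a minimal sketch in Python. It is not the book's code, and it says nothing about how the Chapter 8 system finds its candidates. The function name `answer_window`, the `width` parameter, the whitespace tokenization, and the sample sentence are all assumptions made up for illustration.

```python
def answer_window(text, candidate, width=5):
    """Return up to `width` tokens on each side of the first token matching `candidate`.

    Hypothetical helper: assumes the candidate answer is a single token and
    that splitting on whitespace is good enough for tokenization.
    """
    tokens = text.split()
    # Crude normalization: lowercase and strip surrounding punctuation.
    normalized = [t.strip(".,;:!?\"'").lower() for t in tokens]
    target = candidate.lower()
    if target not in normalized:
        return None  # no candidate found, so there is no window to return
    i = normalized.index(target)
    start = max(0, i - width)
    end = min(len(tokens), i + width + 1)
    return " ".join(tokens[start:end])


# Made-up sample sentence, used only to exercise the function.
sentence = "The monster was created by Victor Frankenstein in a laboratory at Ingolstadt."
print(answer_window(sentence, "Victor", width=2))
```

A real system would need proper tokenization, multi-token answers, and some ranking of competing candidates, which is part of the extra engineering mentioned above.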

As for performance at web scale, you will often need to leverage some type of distributed text analysis pipeline up front to handle the incoming documents.

HTH,
Grant
 
paul nisset
Ranch Hand
Posts: 572
Thanks.

I was thinking about different use cases for text search applications.
Finding information is a particularly big problem when it comes to company documentation. The answer is in there ... somewhere.
 
Grant Ingersoll
Author
Posts: 8
It is probably closer to ready for company documentation or an intranet, but it is still a non-trivial exercise. What do you have in place for search? I'd probably start there first.
 