The book looks like it covers a lot of ground. How does it compare to O'Reilly's "Programming Collective Intelligence"?
The book is really meant for developers (a basic understanding of Java helps) who are looking to add intelligence to their applications, especially user-centric Web 2.0 applications. A lot of work has been done by the open-source community in Java in the areas of text processing and search (Lucene), data mining (WEKA), web crawling (Nutch), and data mining standards (JDM). This book leverages these frameworks, presents examples, and develops code that you can use directly in your Java application.
This is a practical book, and I present a holistic view of what is required to apply these techniques in the real world. Consequently, the book discusses the architectures for implementing intelligence: you will find lots of diagrams, especially UML diagrams, and lots of screenshots from well-known sites, in addition to code listings and even database schema designs.
There is a plethora of examples. Typically, concepts and the underlying math for algorithms are explained via examples with detailed step-by-step analysis. Accompanying the examples is Java code that demonstrates the concepts by implementing them directly and/or using open-source frameworks.
There are a number of exciting topics that you will find interesting and that are typically not covered by other books: harvesting information from the blogosphere, analyzing content (especially user-generated content), intelligent web crawling, intelligent search, and building recommendation systems. In the last chapter, I also cover three real-world examples of personalization by Amazon, Google News, and Netflix, including the BellKor solution from the Netflix competition. By the end of the book you should be familiar with text analysis using Lucene, web crawling using Nutch, building content-based and collaborative recommendation engines, and data mining using WEKA and JDM.
Thanks for the detailed response. One thing I have already learnt about, thanks to you, is WEKA. It looks very interesting. The Java-specific nature is also useful.