R in action - Data mining and adaptive web

 
Oriol Boan
Greenhorn
Posts: 18
Java Linux
Hello,

I've done some research in data mining and web mining but have never used R, even though it is free software. I'd like to know how the R language fits into data mining tasks and, specifically, whether it is appropriate for web mining and for building adaptive web applications and recommender systems (for e-commerce or e-learning), acting as a back end (for web apps built with J2EE, for example).

Thank you.
 
Robert Kabacoff
author
Posts: 33
Hi Oriol,

R has very powerful support for data mining. In fact, after its graphics capabilities, this is what attracted me to the language. Rattle provides a graphical user interface for data mining using R, and there is a very easy-to-use interface to Weka routines called RWeka. There is a nice reference card on data mining with R available from RDataMining. You might also look at the CRAN task view on Machine Learning.

A great book on the subject is The Elements of Statistical Learning, by Hastie, Tibshirani and Friedman. A PDF version of the 5th printing (second edition) is available online. R packages with code for the book are also available.

You might also take a look at Machine Learning in R, in a nutshell.

I think that you will find R appropriate for web mining and web applications. In my own work, I use R as an exploratory data mining tool and do not build adaptive systems, but there are certainly many examples available.

 
Oriol Boan
Greenhorn
Posts: 18
Java Linux
Thank you, Robert.

The info you provided is very valuable, especially RWeka, because I used to work with Weka in my research projects (as you can see here, for instance). By the way, the book The Elements of Statistical Learning is fantastic. Many thanks!

Robert Kabacoff wrote:
"I think that you will find R appropriate for web mining and web applications. In my own work, I use R as an exploratory data mining tool and do not build adaptive systems, but there are certainly many examples available."

I'll search the Internet for some examples, but if you are aware of any, please let me know.

Besides all this, regarding the book "R in Action": I suppose there are examples to illustrate the theory, but what kind of examples? Classical statistics, or data mining too? Can you provide the table of contents?

Thanks again.
 
Robert Kabacoff
author
Posts: 33
Hi Oriol,

You can get the table of contents and a PDF of the first chapter through the "R in Action" link below. I cover most of the classical and nonparametric statistical methods, along with randomization and bootstrapping approaches for small or highly non-normal data, principal components and factor analysis, linear and logistic regression, power analysis, and advanced methods for dealing with missing data. There is a heavy emphasis on visualizing data, and every technique is accompanied by examples of graphs that are useful for understanding results. My goal is to get you up and running in R quickly, help you avoid the painful learning curve people often experience, and give you a sense of what R can do.

I cover some methods used in data mining (e.g., linear and generalized linear regression techniques, assessing predictor importance, predicting categorical and count outcomes, and visualizing complex multivariate data). Others (e.g., cluster analysis, neural networks, classification and regression trees, support vector machines) are not covered, but should be easy to locate and learn after reading the book.

Hope this helps.
 