Wow - no easy questions on this list. Let me take a shot.

When it comes to data analysis, R's biggest competitors are Excel, SAS, IBM SPSS, and Stata. Unlike R, all of them are proprietary. I use each regularly, and each has strengths and weaknesses compared with R and with each other. Let me know if you want me to go into them.

In the open source world, I am hard pressed to think of competitors. Nothing matches R's breadth of coverage or graphic capabilities. In addition to the wide array of built-in functions, there are thousands of user-contributed packages of functions easily available through CRAN.

In the data mining space, there is a good list of free (though not necessarily open source) software available from KDnuggets. Orange, RapidMiner, Rattle, TANAGRA, and Weka are popular.

In the area of exploratory graphics, MANET is well known. However, much of its functionality is now available in R through the iplots package.

A good general list of free statistical software is available from StatPages.net. It is comprehensive, but some of the software is quite dated now.

If you are interested in Orange, Red-R extends Orange to communicate with the R interpreter using the Python-R interface rpy. I have not worked with it.

Dragan Stankovic
Ranch Hand

Joined: Oct 14, 2008
Posts: 33

Thanks for the reply.

There is no need to go into each one you listed, but I am curious how R compares to Excel.

Robert Kabacoff
author
Greenhorn

Joined: Mar 28, 2011
Posts: 25

Sure thing, Dragan.

Excel is an excellent tool for many things. Data analysis is not among them. It is great for editing rectangular arrays of data, developing "what if" scenarios, building pivot tables, and creating basic graphs and tables that are easily pasted (and linked) into Word and PowerPoint. On the other hand, it has only the most rudimentary ability to analyze data. There are add-ins that can increase this ability (e.g., XLMiner, WinStat, Microsoft's Analysis ToolPak, StatTools), but it still remains very basic compared with R. Excel's graphs are attractive and easy to modify and annotate, but the range of graph types is limited. In particular, Excel is underpowered (or unable) when it comes to displaying variable distributions, spatial and geographic data (e.g., maps), multi-dimensional graphs (3D and higher), network graphs, or lattice graphs. Lattice graphs are an important way to display relationships among variables while controlling for (or conditioning on) other variables.

R excels (no pun intended) at every manner of data analysis, statistics, model building, and predictive analytics. If there is a way to analyze and understand data, someone has probably created a package for it. R can also create a much broader array of graph types for visualizing data than Excel can. It runs on every platform imaginable (I've seen directions for installing it on an iPhone - which I don't think is a great idea). On the other hand, it lacks a good data editor, is harder to link into Word or PowerPoint, and has a much steeper learning curve.

Actually, there is some value in using Excel and R together, to overcome the limitations of each. The package RExcel allows you to access the functionality of R from within Excel workbooks.