R is a language and environment for statistical computing and graphics. It is a GNU project that is similar to the S language and environment developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.
R can be considered a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. Among R's strengths:
- R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering) and graphical techniques
- R is highly extensible
- R provides an Open Source route to participation in research in statistical methodology, for which the S language is often the vehicle of choice
- Well-designed, publication-quality plots can be produced with ease, including mathematical symbols and formulae where needed
Great care has been taken over the defaults for the minor design choices in graphics for R, but the user retains full control.
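As a minimal sketch of the graphics points above, the following draws the standard normal density with mathematical annotation and overrides a few graphical defaults. The choice of curve, colour, and line width is illustrative only, and the plot is written to a temporary PDF file so the code runs without a display.

```r
# Write the plot to a temporary PDF so no screen device is needed
out <- tempfile(fileext = ".pdf")
pdf(out)

x <- seq(-3, 3, length.out = 200)

# plotmath expressions supply the mathematical symbols in the labels;
# col, lwd, and las override the default colour, line width, and label style
plot(x, dnorm(x), type = "l",
     col = "steelblue", lwd = 2, las = 1,
     xlab = expression(x),
     ylab = expression(phi(x)),
     main = expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2}))

dev.off()
```

Leaving out the col, lwd, and las arguments gives the default appearance; each can be changed independently.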
R is available as free software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows, and MacOS.
The R Environment
R is an integrated suite of software facilities for data manipulation, calculation, and graphical display. It includes:
- An effective data handling and storage facility
- A suite of operators for calculations on arrays, in particular matrices
- A large, coherent, integrated collection of intermediate tools for data analysis
- Graphical facilities for data analysis and display – either on-screen or on hardcopy
- A well-developed, simple, and effective programming language, which includes conditionals, loops, user-defined recursive functions, and input/output facilities
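The facilities listed above can be sketched in a few lines of base R. The data values and the names df, m, and fact are made up for illustration.

```r
# Data handling: a small data frame and a per-group summary
df <- data.frame(group = c("a", "a", "b", "b"), value = c(1, 3, 5, 7))
group_means <- tapply(df$value, df$group, mean)

# Array operators: matrix product (%*%) and inverse (solve)
m <- matrix(c(2, 0, 1, 3), nrow = 2)
identity_check <- m %*% solve(m)

# Programming language: a user-defined recursive function with a conditional
fact <- function(n) if (n <= 1) 1 else n * fact(n - 1)

# A loop that accumulates results
squares <- numeric(0)
for (i in 1:5) squares <- c(squares, i^2)

# Input/output: write the data frame to a temporary file and read it back
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)
df_back <- read.csv(path)
```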
The term “environment” is intended to characterize it as a fully-planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, which is frequently the case with other data analysis software.
R, like S, is designed around a true computer language. It allows users to add functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally intensive tasks, C, C++, and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.
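To illustrate adding functionality, here is a minimal sketch of a user-defined function. The name std_error is a hypothetical helper, not part of R. The last lines show that a function supplied with R, such as sd, is itself an ordinary R function whose source can be inspected.

```r
# Hypothetical helper (not part of R): standard error of the mean
std_error <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

se <- std_error(c(2, 4, 4, 4, 5, 5, 7, 9))

# Functions written in R can be read directly: printing sd shows its source
print(sd)
sd_source <- deparse(body(sd))
```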
Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended easily via packages. About eight packages are supplied with the R distribution, and many more are available through the CRAN family of internet sites, covering a wide range of modern statistics.
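A short sketch of working with packages follows. It uses only packages that ship with R. The install step is commented out because it needs network access, and "somepackage" is a placeholder rather than a real package name.

```r
# Packages that come with the R distribution itself
base_pkgs <- rownames(installed.packages(priority = "base"))

# Packages currently attached to the session
attached <- search()

# Load a supplied package and use a statistical technique it implements:
# a linear model fitted to the built-in cars data set
library(stats)
fit <- lm(dist ~ speed, data = cars)
coefs <- coef(fit)

# Installing an add-on package from CRAN (placeholder name, needs network)
# install.packages("somepackage")
# library(somepackage)
```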
R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both in hardcopy and online in a number of formats.