Getting started with R

(Originally posted on February 17, 2012)

There are many guides and introductions to R, but the sheer number of choices can be more daunting than helpful. Some guides are definitely better than others, and some are better suited to certain audiences. For example, “An Introduction to R”, the manual that is included with the software, is completely impractical for getting started. I think it’s a great reference, but it’s not easy to use that document to just “get going”.

I will explain how to get R, what it is, and how to use it. Then I’ll provide links to my favorite guides. There are some great guides out there, so there’s no need to rewrite them. Also, I would advise using multiple documents because they all offer different perspectives.

Downloading and Installing R

Using Google to search for “R” actually works. Search for R and install it. The download page can be confusing, so here are some tips:

  • R is distributed through “CRAN”, which stands for the “Comprehensive R Archive Network”. This is where the installers and add-on packages live.
  • The actual R website doesn’t handle all the traffic for the program downloads. Instead, you use “mirror” sites that store copies of the latest installation.
  • The download link is not obvious, but it’s there. Then you choose the “mirror” site closest to you.
  • From there it should be pretty easy to find the installer for your operating system.
  • By the way, you could also download the source code for R and build it yourself. This would not be the fastest way to get started, but you may see references to building R yourself.
  • The defaults in the installation are fine.

What you get when you install R

When you install R you get the basic R application with the basic “packages” or “libraries”. These libraries are things like “stats” and “utils” that are commonly used. If you want a library to do something more unusual, such as generalized additive mixed models (mixed GAMs), you’re going to install that package from within R later.
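
As a rough sketch of what that looks like in practice, the snippet below lists the packages that ship with R and checks whether an add-on package is present. The package name “gamm4” is only an illustrative choice for mixed GAMs, not a recommendation from this post, and the install line is commented out because it needs an internet connection.

```r
# List the packages that ship with R itself (priority "base").
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Attach a bundled package. stats is attached by default, so this is harmless.
library(stats)

# Check whether an add-on package is installed, without attaching it.
# "gamm4" is an assumed example package for mixed GAMs.
has_gamm4 <- requireNamespace("gamm4", quietly = TRUE)
print(has_gamm4)

# To install it from CRAN and attach it, uncomment these lines:
# install.packages("gamm4")
# library(gamm4)
```

The first time you call install.packages() in a session, R may ask you to pick a mirror, just like the download page did.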

R also comes with some nice documentation. You can find this in the installation directory under “doc”. There are HTML and PDF versions of each document.
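
If you can’t find that directory by hand, R can tell you where it is. This is a minimal sketch; the “manual” subdirectory is where the manuals usually sit, but that is an assumption that may not hold on every platform (some Linux distributions package the manuals separately).

```r
# Path to the documentation directory of the running R installation.
doc_dir <- R.home("doc")
print(doc_dir)

# The bundled manuals normally sit in a "manual" subdirectory (assumed layout).
# This prints an empty result if the manuals are not installed there.
print(list.files(file.path(doc_dir, "manual")))

# Open the HTML help index in a browser (only makes sense interactively).
if (interactive()) help.start()
```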

When you open R from your application shortcut, you’re actually opening the R GUI (Graphical User Interface). This GUI lets you control R, but it isn’t actually the R program. The actual R program is separate from the built-in GUI, and you could access it directly from another application or from the command line (DOS prompt or Terminal on a Mac).

I mention this distinction between R and the R GUI for a reason. Because it’s possible to access R directly, there are other GUIs that make it much easier to use R than the default GUI.
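
To make the distinction concrete, here is a tiny script that runs the same way with or without a GUI. The file name hello.R is hypothetical. Assuming R’s bin directory is on your path, saving these lines to that file and running `Rscript hello.R` from the command line should execute it with no GUI involved.

```r
# hello.R: a small script that does not depend on any GUI.
x <- c(2, 4, 6, 8)
cat("Mean of x:", mean(x), "\n")

# interactive() reports whether R is attached to an interactive session;
# under Rscript it is expected to be FALSE.
cat("Running interactively?", interactive(), "\n")
```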

How to access R

The best way to use R is through an IDE (Integrated Development Environment). You still need to install R, but then the IDE runs R for you.

Here are the main IDEs:

  • R GUI: You can skip the IDE and write R code directly in R’s default GUI. It’s nice to be able to just open R and use it quickly, but it’s the least user-friendly route.
  • RStudio: You can download a completely free IDE called RStudio. This is the best option for beginners. It provides a friendlier R experience in many ways. Two of the most important features are syntax highlighting and project management. Once you have R installed you can download and use RStudio. It will automatically find and use your most recent R installation.
  • StatET and Eclipse: This is the best solution by far, but there is a steep learning curve. There are instructions on how to install this online, and I’ve written my own guide. However, Eclipse and the R plug-in have new versions all the time, so installation instructions are likely to become outdated quickly. Still, it’s very powerful and very nice to use (once you’re up to speed). I plan to do a post on the benefits of StatET, but in a nutshell it has smart indenting, brace and bracket matching, customizable syntax highlighting, automatic code backups, and some really nifty variable management features like highlighting and global renaming.
  • Emacs: I’ve never been interested in Emacs because it’s too complex and too foreign. It might be the best option, but good luck finding an unbiased opinion. Emacs users are usually fanatically devoted, and nobody else understands them. By the way, the deal breaker for me was the complex keyboard shortcuts. The keyboard shortcuts are “compound key” shortcuts, so instead of using control + c for copy it’s control, c, w (each pressed separately, I believe).

In addition to the IDE you should get a text editor. Having a good text editor is essential for opening and viewing R files (and files from other programming languages).

There are many choices, but I have a strong preference for SciTE (SCIntilla based Text Editor). It’s open source, free, and has great syntax highlighting. Also, you can run other programs directly from the editor. For example, if you have Python or C++ installed, you can open a code file, press F5, and it will run. So, it’s nice to have on hand beyond its utility for viewing R files.

Here is a link to my slightly customized version of SciTE
Here is the link to the original SciTE

How to actually get started, and the most critical resources

Here are the resources that I think are most important, listed in order of importance.

Tom Short’s R Reference Card: (original version) (newer version)

R help archive:
This is an easy way to search all of the historical help emails. If you’re just getting started then it will probably be a long time before you ask a question that someone else hasn’t already asked.

Quick-R (Rob Kabacoff):
This is my favorite set of examples for getting started in R.

icebreakeR (Andrew Robinson):
This is my favorite book-length guide to getting started in R.

R Graphics Gallery (Romain François):
This has many examples for building graphics in R, and Romain is one of the more influential R contributors.

CRAN Task Views:
Once you’re going in R, this is a good place to find packages that help you accomplish specific tasks that might be beyond the base package.

Vincent Zoonekynd’s R notes:
This site has an unbelievable quantity of examples. It’s a collection of personal notes from one individual, so don’t expect a rigorous exploration of every topic. However it’s extremely useful to find examples and inspiration.

R Bloggers:
This site is an aggregator of all things R, and probably my favorite R resource. However, it’s not a “getting started” guide. I included it because there are guides that are mentioned in the posts from time to time, and there are also a plethora of neat examples that flow through the front page.

Other reference cards:
Some of these are marginally useful, but Tom Short’s is the place to start. Still, if you’re looking for something to relate R to NumPy, MATLAB, or Octave, there are some references for that here.

Modern Applied Statistics with S (William Venables and Brian Ripley):
This is a great book that covers many advanced statistical topics in R. Don’t be fooled by the S in the title; R is an open source implementation of the S language. The authors are *major* contributors to the R language.

R in a Nutshell: A Desktop Quick Reference
By Joseph Adler, published by O’Reilly
I like this book quite a bit, and I’m glad I bought it. I’ve heard people say that it tries to do too much, and doesn’t cover many topics in enough depth, but I think it’s a useful reference.

