Chapter 3 Software

The course makes heavy use of the R statistical programming language. Details on downloading and installing R and the development environment, R Studio, for either PC or Mac, are provided below.

Many people in the course will be new to R, and I assume no prior R experience. You will know a fair amount of R (and some other things, too) after taking the course, though. We’ll also be using R Markdown within R Studio. R Markdown will be taught in our class, and can be used to generate reproducible reports as .html files or Word documents, to give just two examples.

3.1 Instructions for Installing R and R Studio

R and R Studio are two different things, but each is free software.

Complete instructions, with a step-by-step walkthrough, are available at https://github.com/THOMASELOVE/431/blob/master/software-installation-431.md

If you need more help, you might look at this terrific resource for Installing R and RStudio from Jenny Bryan and the STAT 545 project. These are the people responsible for the great Happy Git with R project, which is worth your time, too, if you intend to use Git and GitHub. (Everyone will in 432.)

In brief, the steps you need to take for 431 are:

  1. Download and install the latest version of R (version 3.4.1 or later) at http://cran.case.edu/ or https://cran.r-project.org/.
  2. Download and install the preview version of R Studio (version 1.1.345 or later) at https://www.rstudio.com/products/rstudio/download/preview/.
  3. Install some R packages. An R “package” is a collection of functions, data, and documentation that extends the capabilities of R, and packages are the main way to get R doing interesting work. To install the packages for our course, open R Studio and run these commands:
# Names of the packages used in the course
pkgs <- c("aplpack", "arm", "babynames", "boot", "car", "devtools", "Epi", 
          "faraway", "forcats", "foreign", "gapminder", "GGally", "ggjoy", 
          "gridExtra", "Hmisc", "knitr", "lme4", "markdown", "MASS", 
          "mice", "mosaic", "multcomp", "NHANES", "pander", "psych", 
          "pwr", "qcc", "rmarkdown", "rms", "sandwich", "survival", 
          "tableone", "tidyverse", "vcd", "viridis")

# Download and install all of them from CRAN (this can take several minutes)
install.packages(pkgs)
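
If you need to re-run the installation later, it can help to check your R version and install only the packages that are still missing. The following is a minimal sketch, not part of the official course instructions: the helper names `missing_packages` and `r_is_recent_enough` are hypothetical, the package list is shortened to three entries for illustration, and the install step is deliberately restricted to interactive sessions (such as the R Studio console).

```r
# Hypothetical helper (not from the course materials): return the
# packages in `pkgs` that are not installed in any library on this machine.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: is the running R at least the version the course asks for?
r_is_recent_enough <- function(minimum = "3.4.1") {
  getRversion() >= minimum
}

# A shortened version of the course list, to keep the example small.
pkgs <- c("tidyverse", "knitr", "rmarkdown")

if (!r_is_recent_enough()) {
  warning("R is older than 3.4.1; consider updating R before installing packages.")
}

to_install <- missing_packages(pkgs)

# Install only what is missing, and only when run interactively
# (for example, from the R Studio console).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Swapping in the full `pkgs` vector from step 3 would apply the same check to every course package.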

3.2 Getting Started with the Software, once you’ve installed it

  1. Dr. Love’s document Getting Started with R might be a good first step. It demonstrates how to use these tools to analyze data.
  2. Dr. Love also prepared a downloadable template for your first few R Markdown attempts. Get it by downloading the data and code for the course at https://github.com/THOMASELOVE/431data. Click on the green Clone or download button, and then select Download ZIP to obtain a Zip file of all posted materials.
  3. We can also recommend Chester Ismay’s Getting Used to R, RStudio and R Markdown as an introduction to the basics.
  4. Dr. Love will demonstrate the use of R, R Studio and R Markdown in class, starting with Class 2.
  5. Dr. Love’s Course Notes are a source of many examples.
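
As an alternative to clicking Download ZIP in item 2, the same materials could in principle be fetched from within R. This is only a sketch under stated assumptions: the archive URL follows GitHub's usual pattern and assumes the repository's default branch is named `master`, and `fetch_course_zip` is a hypothetical helper, not something provided by the course.

```r
# Assumed archive URL: GitHub's standard pattern for a repository ZIP,
# with "master" assumed to be the default branch name.
zip_url <- "https://github.com/THOMASELOVE/431data/archive/master.zip"

# Hypothetical helper: download a ZIP file and unpack it under `dest_dir`.
fetch_course_zip <- function(url, dest_dir = "431data") {
  zip_path <- tempfile(fileext = ".zip")
  # mode = "wb" keeps the binary ZIP file intact on Windows
  download.file(url, destfile = zip_path, mode = "wb")
  unzip(zip_path, exdir = dest_dir)
  # Return the unpacked file names without printing them
  invisible(list.files(dest_dir, recursive = TRUE))
}

# Download only when run interactively (for example, from the R Studio console).
if (interactive()) {
  files <- fetch_course_zip(zip_url)
  print(head(files))
}
```

If the download fails, the branch name or URL assumption is the first thing to check; the green Clone or download button described above remains the documented route.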

3.3 Why do we teach R, instead of SPSS or SAS or whatever, in 431-432?

Because it is by far the better choice for what we’re trying to do, which is to help you become effective data scientists. And effective scientists, period.