Setting Up R
These Notes make extensive use of
- the statistical software language R, and
- the development environment RStudio,
both of which are free, and you’ll need to install them on your machine. Instructions for doing so will be found on the course website.
If you need a gentle introduction, or if you’re new to R and RStudio, we encourage you to take a look at ModernDive (https://moderndive.com/), an introduction to statistical and data sciences via R by Ismay and Kim (2022).
Quarto
These notes were written using Quarto, which we’ll learn to use in 431. Quarto, like R and RStudio, is free and open source.
Quarto is described as a scientific and technical publishing system for data science, which lets you (among many other things):
- save and execute R code
- generate high-quality reports that can be shared with an audience
The Quarto Get Started page provides an overview and quick tour of what’s possible with Quarto.
Another excellent resource to learn more about Quarto is the Communicate section (especially the Quarto chapter at https://r4ds.hadley.nz/quarto.html) of Hadley Wickham and Grolemund (2023).
R Packages
At the start of each chapter that involves R code, I’ll present a series of commands I run to set up R. These commands load several packages (libraries) of functions that expand R’s capabilities, change how R output is displayed (that’s the comment = NA piece), and set the theme for most graphs to theme_bw(). A chunk of code like this will occur near the top of any Quarto work.
For example, this is the setup for one of our early chapters that loads four packages.
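The chunk itself is not reproduced in this text, so what follows is only a minimal sketch of what such a setup might look like. The three packages loaded before tidyverse are assumptions chosen for illustration from the package table in these Notes; the actual chapter may load different ones.

```r
# Show R output without knitr's default "##" prefix on each line
knitr::opts_chunk$set(comment = NA)

# Packages assumed for illustration; tidyverse is loaded last
library(janitor)
library(knitr)
library(patchwork)
library(tidyverse)

# Use the black-and-white theme for ggplot2 graphs in this session
theme_set(theme_bw())
```

Because theme_set() comes from ggplot2, it has to run after library(tidyverse), which is one reason the theme line sits at the end of the chunk.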
You only need to install a package once, but you need to reload it (using the library() function) every time you start a new session. I always load the package called tidyverse last, since doing so avoids some annoying problems.
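The difference between installing and loading can be sketched as follows. The two package names and the CRAN mirror URL are assumptions for illustration; the course website holds the actual list of packages to install.

```r
# Packages assumed for illustration
needed <- c("janitor", "tidyverse")

# Check which of them are already installed on this machine
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Install step: only needed once per machine, and only for missing packages
if (any(!is_installed)) {
  install.packages(needed[!is_installed], repos = "https://cloud.r-project.org")
}

# Load step: needed in every new R session, with tidyverse last
library(janitor)
library(tidyverse)
```

If every package is already installed, the install step does nothing and only the library() calls have any effect.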
The Love-boost.R script
In October, when we start Part B of the course, we’ll use some special R functions I’ve gathered for you in a script called Love-boost.R. I’ll tell R about that code using the following command…
source("data/Love-boost.R")
The Love-boost.R script includes four functions:
- bootdif
- saifs.ci
- twobytwo
- retrodesign
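A slightly more defensive way to source the script is sketched below. It assumes the script sits at data/Love-boost.R relative to your working directory, as in the command above, and then checks that the four functions are available. The variable names are hypothetical.

```r
script_path <- "data/Love-boost.R"
expected <- c("bootdif", "saifs.ci", "twobytwo", "retrodesign")

if (file.exists(script_path)) {
  # Run the script so that its functions are defined in this session
  source(script_path)

  # For each expected name, check that a function by that name now exists
  loaded <- vapply(expected,
                   function(f) exists(f, mode = "function"),
                   logical(1))
  print(loaded)
} else {
  message("Could not find ", script_path,
          "; check your working directory with getwd().")
}
```

A missing file here usually means the working directory is not the project folder, which is why the message points to getwd().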
Packages Used in these Notes
A list of all R packages we want you to install this semester (which includes some packages not included in these Notes) is maintained at our course web site.
Package | Parts | Key functions in the Package |
---|---|---|
arm | C | – |
boot | B | – |
broom | A, B, C | tidy, glance, augment (part of tidymodels) |
car | A, C | boxCox, powerTransform |
Epi | B | twoby2 |
fivethirtyeight | Appendix | source of data |
GGally | A, C | ggpairs |
ggrepel | C | – |
ggridges | A, B | – |
ggstance | A | – |
gt | A | for presenting tables |
gtsummary | A | tbl_summary |
Hmisc | A, B, C | describe and others |
janitor | A, B, C | tabyl and others |
kableExtra | A | kbl, kable_styling |
knitr | A, B, C | kable |
lvplot | A | geom_lv |
mice | C | – |
modelsummary | A, C | modelsummary |
mosaic | A, B, C | favstats, inspect |
naniar | A | n_miss, miss_case_table, gg_miss_var |
NHANES | A | source of data |
palmerpenguins | A | source of data |
patchwork | A, B, C | for combining/annotating plots |
psych | A, B | describe |
pwr | B | – |
rms | C | – |
sessioninfo | Appendix | – |
simputation | A | various imputation functions |
summarytools | A | descr, dfSummary |
tidyverse | A, B, C, Appendix | dozens of functions |
vcd | B | – |
visdat | A | vis_dat, vis_miss |
The tidyverse
The tidyverse package is actually a meta-package which includes the following core packages:
- ggplot2 for creating graphics
- dplyr for data manipulation
- tidyr for creating tidy data
- readr for reading in rectangular data
- purrr for working with functions and vectors
- tibble for creating tibbles (lazy, surly data frames)
- stringr for working with data strings
- forcats for solving problems with factors
Loading the tidyverse with library(tidyverse) loads those eight packages; versions 2.0.0 and later of the tidyverse also load lubridate. Installing the tidyverse also installs several other useful packages on your machine, like glue and lubridate, for example. Read more about the tidyverse at https://www.tidyverse.org/
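To see the difference between packages that are loaded and packages that are merely installed, a short check like the following may help. It is a sketch that assumes the tidyverse is already installed; the variable name core is hypothetical.

```r
library(tidyverse)

# The eight core packages listed above
core <- c("ggplot2", "dplyr", "tidyr", "readr",
          "purrr", "tibble", "stringr", "forcats")

# Which core packages are attached in this session?
core %in% .packages()

# glue comes along when the tidyverse is installed, but library(tidyverse)
# does not attach it; load it yourself when you need it
"glue" %in% .packages()
library(glue)

# Every package that belongs to the tidyverse, core or not
tidyverse_packages()
```

Packages that are installed but not attached can also be used one function at a time with the package::function() form, without calling library() first.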