Appendix D — Data Sets Used in this Book
D.1 Data Sets Provided on our Web Site
See the repository at https://github.com/THOMASELOVE/431-data.
| Data Set | File | Type | Loaded | Source |
|---|---|---|---|---|
| bloodlead | .csv | Comma-separated text | Section 5.2 | J. Statistics Education |
| bodyfat | .csv | Comma-separated text | Section 19.2 | Kaggle |
| BPX_I | .xpt | SAS transport file | Section 12.2 | NHANES 2015-16 |
| cbaths | .txt | Tab-delimited text | Section 8.2 | Data and Story Library |
| cle_nbd | .csv | Comma-separated text | Section 4.2 | Census Reporter & NEOCANDO |
| coasters | .csv | Comma-separated text | Section 10.2 | Roller Coaster Data Base |
| countries | .csv | Comma-separated text | Section 21.2 | WHO and others |
| craters | .sav | SPSS data set | Section 11.2 | Data and Story Library |
| darwin | .Rds | R data set | Section 7.2 | UC Irvine Repository |
| DEMO_I | .xpt | SAS transport file | Section 12.2 | NHANES 2015-16 |
| fev_ros | .csv | Comma-separated text | Section 17.2 | Vanderbilt Data |
| nations | .csv | Comma-separated text | Section 16.2 | WHO and others |
| nnyfs | .Rds | R data set | Appendix C | NNYFS at CDC |
| park_rct | .xlsx | Excel worksheet | Section 6.3 | NEJM article |
| plasma | .csv | Comma-separated text | Section 20.2 | Vanderbilt Data |
| storage | .Rds | R data set | Section 9.2 | Cleveland Clinic |
| supraclav | .dta | Stata data set | Section 15.2 | Cleveland Clinic |
| tattoos | .txt | Tab-delimited text | Section 14.2 | Data and Story Library |
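
Each file type in the table above calls for a different import function. As a minimal sketch, the hypothetical helper `read_data_file()` below picks a reader from the file extension. It assumes the readr, haven, and readxl packages are installed; the sections listed under "Loaded" may import these files differently, and the path `data/bloodlead.csv` is only an illustrative local location.

```r
# Sketch: choose an import function based on the file extension.
# read_data_file() is a hypothetical helper, not a function from this book.
# Assumes the readr, haven, and readxl packages are installed.
read_data_file <- function(path) {
  ext <- tolower(tools::file_ext(path))   # e.g. "Rds" becomes "rds"
  switch(ext,
    csv  = readr::read_csv(path, show_col_types = FALSE),  # comma-separated text
    txt  = readr::read_tsv(path, show_col_types = FALSE),  # tab-delimited text
    rds  = readr::read_rds(path),                          # R data set
    xlsx = readxl::read_excel(path),                       # Excel worksheet
    xpt  = haven::read_xpt(path),                          # SAS transport file
    sav  = haven::read_sav(path),                          # SPSS data set
    dta  = haven::read_dta(path),                          # Stata data set
    stop("Unrecognized file extension: ", ext)
  )
}

# Example call, assuming the file was downloaded to a local data/ folder:
# bloodlead <- read_data_file("data/bloodlead.csv")
```

Dispatching on the extension keeps the import step uniform across the table, but it treats every `.txt` file as tab-delimited, which holds for the two `.txt` files listed here and may not hold elsewhere.
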
D.2 Data Sets imported from R Packages
| Data Set | R Package | Loaded | HTML Link |
|---|---|---|---|
| bechdel | fivethirtyeight | Section C.1 | Analysis using the Tidyverse |
| childcare_costs | tidytuesdayR | Section 18.2 | Github for Tidy Tuesday 2023-05-09 |
| counties | tidytuesdayR | Section 18.2 | Github for Tidy Tuesday 2023-05-09 |
| penguins | palmerpenguins | Section 2.2 | palmerpenguins |
| strep_tb | medicaldata | Section 13.2 | medicaldata |
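
The data sets in this table ship with (or are downloaded by) R packages, so no file import is needed. The sketch below shows one way to reach them, assuming the listed packages are installed. The wrapper name `load_childcare()` is hypothetical, and it is written as a function so that nothing is downloaded until it is called; the sections listed under "Loaded" may use a different approach.

```r
# Sketch: accessing data sets that come from R packages.
# Assumes fivethirtyeight, palmerpenguins, medicaldata, and tidytuesdayR
# are installed.

# These three are stored inside their packages.
bechdel  <- fivethirtyeight::bechdel
penguins <- palmerpenguins::penguins
strep_tb <- medicaldata::strep_tb

# The Tidy Tuesday data are fetched from GitHub, so this step needs an
# internet connection. load_childcare() is a hypothetical wrapper.
load_childcare <- function() {
  tt <- tidytuesdayR::tt_load("2023-05-09")
  list(
    childcare_costs = tt$childcare_costs,
    counties        = tt$counties
  )
}
```
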