Section 8 Other Worthy (and Free) Resources

Many of these resources will come up again in class, but no one can keep up with all of this material. Pick the things that interest you and follow up on those. I’m always eager to receive additional suggestions from students in the class, so if you find a helpful resource, please suggest it on Piazza.

8.2 Statistics/Data Analysis/Data Science Books

  1. The Art of Data Science by Roger D. Peng and Elizabeth Matsui (the book is also available with lecture videos). An earlier edition is available on bookdown.
  2. Exploratory Data Analysis with R by Roger D. Peng. An earlier edition is available on bookdown.
  3. Data Analysis for the Life Sciences by Rafael A. Irizarry and Michael I. Love
  4. Modern Statistics for Modern Biology by Susan Holmes and Wolfgang Huber
  5. Regression Models for Data Science in R by Brian Caffo
  6. Introduction to Data Science: Data Analysis and Prediction Algorithms with R by Rafael A. Irizarry
  7. Practical Regression and ANOVA using R by Julian J. Faraway (pdf)
  8. A First Course in Design and Analysis of Experiments by Gary W. Oehlert (pdf)

8.3 R and R Markdown Books

  1. Cookbook for R by Winston Chang
  2. Learning Statistics with R and its bookdown repository by Danielle Navarro
  3. R Programming for Data Science by Roger D. Peng. An earlier edition is available on bookdown.
  4. R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, and Garrett Grolemund
  5. R Markdown for Scientists by Nicholas Tierney
  6. R Packages by Hadley Wickham and Jenny Bryan
  7. What They Forgot to Teach You About R by Jenny Bryan and Jim Hester
  8. Advanced R by Hadley Wickham (2nd edition)
  9. Hands-On Programming with R by Garrett Grolemund

8.4 Blogs and Internet Columns

  1. Andrew Gelman and friends at Statistical Modeling, Causal Inference, and Social Science
  2. Simply Statistics by Jeff Leek, Brian Caffo, Roger Peng, Rafael Irizarry and others
  3. Frank Harrell’s Statistical Thinking blog
  4. FlowingData by Nathan Yau
  5. JunkCharts by Kaiser Fung
  6. New York Times What’s Going On in this Graph?
  7. Edward Tufte on the Web
  8. Tidy Tuesdays: A weekly data project in R from the R for Data Science online learning community
  9. FiveThirtyEight on Politics, Sports, Science & Health, Economics and Culture (Nate Silver is editor-in-chief)

8.5 Resources for Learning R

  1. I recommend the Community-Sourced Data Science Guide of resources for learning data science.
  2. RStudio Cheat Sheets are definitely worth your time. In 431, you’ll especially like:
     • Data Transformation with dplyr
     • Data Visualization with ggplot2
     • Data Import
     • R Markdown
  3. The swirl package in R can be a great help for people learning R programming and data science. Find out more about it at http://swirlstats.com/students.html
  4. UCLA’s Institute for Digital Research and Education has some great Data Analysis Examples using R (and other software).

8.6 Videos about R and Data Science

  1. Resources from RStudio is a great source of all kinds of useful stuff. For example:
  1. Data Wrangling with R and the Tidyverse YouTube Playlist from Garrett Grolemund
  2. Hadley Wickham’s Whole Game
  3. Tidy Tuesday Screencasts from David Robinson on YouTube
  4. Hans Rosling: The Best Stats You’ve Ever Seen TED Talk from 2006.
  5. This is Statistics: Roger Peng explains in less than two minutes why statistics is an amazing field.
  6. Mona Chalabi’s TED Talk on 3 ways to spot a bad statistic, 2017.
  7. The beauty of data visualization from David McCandless at TEDGlobal 2010.
  8. Six Types of Questions You Can Ask in a Data Analysis from Roger Peng.
  9. Videos from Coursera’s 4-week course “Computing for Data Analysis” in R
  10. Learn R by Intensive Practice list of tutorials on YouTube.
  11. Learning R with humorous side projects by Ryan Timpe at rstudio::conf 2020.
  12. R for Graphical Clinical Trial Reporting by Frank Harrell at rstudio::conf 2020.
  13. Effective Visualizations by Miriah Meyer at rstudio::conf 2020.
  14. State of the Tidyverse by Hadley Wickham at rstudio::conf 2020.
  15. How R Markdown changed my life by Rob Hyndman at rstudio::conf 2020.
  16. One R Markdown document: 14 demos by Yihui Xie at rstudio::conf 2020.

8.7 Podcasts

  1. Risky Talk with David Spiegelhalter (author of The Art of Statistics) features conversations with the world’s top experts in risk and evidence communication addressing urgent, practical challenges: How can doctors communicate the risks and benefits of medical treatment? How should scientists communicate evidence about climate change? How can journalists make numbers meaningful to readers? How should government institutions convey important statistics?
  2. Not So Standard Deviations by Hilary Parker and Roger Peng talking about the latest in data science and data analysis in academia and industry.
  3. The Effort Report by Elizabeth Matsui and Roger Peng talking about life in the academic trenches, telling it “like it is”. Every graduate student in this course looking at a career in academia would benefit from listening.
  4. Casual Inference where hosts Lucy D’Agostino McGowan and Ellie Murray talk all things epidemiology, statistics, data science, causal inference, and public health. Sponsored by the American Journal of Epidemiology.
  5. FiveThirtyEight Model Talk where Nate Silver and the rest of the FiveThirtyEight folks get into the weeds of their models for election and sports forecasting, as well as other people’s models for things like COVID-19.
  6. More or Less: Behind the Stats from Tim Harford and BBC Radio 4
  7. Stats + Stories from the American Statistical Association and Miami University