Section 7 Required Texts

7.1 Professor Love’s Materials

The main text is a set of Notes for the course, maintained by Professor Love, titled Data Science for Biological, Medical and Health Research: Notes for PQHS 431. A link to these notes will be made available to you at the first class. Professor Love revises the Notes every year, and so they will appear in fits and starts as the semester progresses.

Although these Notes share some of the features of a textbook, they are neither comprehensive nor completely original. The main purpose is to give 431 students a set of common materials on which to draw during the course, providing a series of examples using R to work through issues that are likely to come up during the semester, and in later work.

In addition, slides and video recordings from each of Professor Love’s lectures, plus other in-class materials from each session of the class, will be posted for your use in a timely fashion throughout the semester.

Once class begins, access all materials at the main course website.

7.2 Two Books to Purchase

As mentioned, we’ll read two books you’ll need to purchase (combined price is $18 to $33):

  • David Spiegelhalter, The Art of Statistics: How to Learn from Data, published in the US by Basic Books in 2019, available at Amazon, for instance, for around $18 (Kindle) or $23 (hardcover).
    • The book’s website contains R code, corrections and other materials.
    • Either the UK or US version of the book is fine. From Dr. Spiegelhalter’s website: The Art of Statistics is a Pelican book published by Penguin in March 2019 in the UK, and by Basic Books in the US in September 2019: the books are identical apart from the subtitle (the UK subtitle is Learning from Data) and cover. The UK Pelican paperback was published in February 2020.
  • Jeffrey Leek, The Elements of Data Analytic Style, available at https://leanpub.com/datastyle (minimum price is free, suggested price is $10).

7.3 Three Books to Download

There are three other books that you will definitely want to download during the semester. All are freely available at the links below.

  1. R for Data Science by Garrett Grolemund and Hadley Wickham
  2. Biostatistics for Biomedical Research (pdf) by Frank E. Harrell Jr and James C Slaughter
  3. Modern Dive: Statistical Inference via Data Science (A Modern Dive into R and the Tidyverse) by Chester Ismay and Albert Y. Kim

7.4 Key Articles and Posts

While I will recommend dozens, perhaps hundreds, of articles, blog posts and the like to you over the course of the year, these are especially important in 431.

  1. Several of the guides prepared by Jeff Leek and his group, including:
  2. Data Organization in Spreadsheets by Karl W. Broman and Kara H. Woo in The American Statistician, 2018 Special Issue on Data Science, or you can read the PeerJ preprint version.
  3. Project-oriented workflow at tidyverse.org from Jenny Bryan.
  4. From the Ten Simple Rules series at PLOS Computational Biology:
  5. Statistical Inference in the 21st Century: A World Beyond p < 0.05 from 2019 in The American Statistician.
  6. The American Statistical Association’s 2016 Statement on p-Values: Context, Process and Purpose.

See the main course website for other recommendations as the semester goes on.