Chapter 1 Course Description
PQHS 431 (cross-listed as CRSP 431 and MPHP 431) is the first half of a two-semester sequence (with PQHS 432) focused on modern data analysis and advanced statistical modeling, with a practical bent and as little theory as possible. We emphasize the key role of thinking hard, and well, about design and analysis in research.
The course is formally titled Statistical Methods in Biological & Medical Sciences, Part 1. A more accurate title is Data Science for Biological, Medical or Health Research.
We’ll learn about managing and visualizing data, building models and making predictions, and other “data science” activities. This highly applied course focuses on modern tools for learning from data more than on classical ones. We’ll learn a lot of R, and we’ll use RStudio and R Markdown as tools to help make R work better and to perform our research in replicable ways.
1.1 431 in Three Parts
- Part A is basically August/September and is about Visualizing Data.
- Part B happens in October, and is about Making Comparisons.
- Part C is in November/December. It’s about Building Regression Models.
1.2 What do we want people to learn in the 431-432 sequence?
- Using modern data science tools to import, tidy/manage, explore (through transforming, visualizing, and modeling) and communicate about data.
- Thinking hard, and well, about design and analysis in scientific research. We want students to see the value of statistical thinking throughout the process of doing scientific research.
- Programming in R sufficient to accomplish the tasks above, with enough self-sufficiency afterwards to be able to debug and use new R tools without substantial troubleshooting help from others.
- The importance of replicable research, and facility with, and practice in, open source tools (R Markdown, and [new] GitHub, too, for EPBI students) to do it, all the time, as a matter of course.
- Sufficient background in the practical issues regarding linear and generalized linear models (a big example: missing data) to give students a starting place for meaningful applied work or consulting, particularly in making comparisons to address several types of questions (exploratory, predictive, inferential, and causal).
1.3 What do we assume you know before you take the course?
Not much. Useful prior experience includes training/experience in statistics, coding/programming and biology/biomedical science. We expect most people will have some experience in one or two of these areas, but very few have all three.
- Some students have lots of prior training in statistics. But many students in the class have no statistical training that they use regularly. We assume only that everyone knows what an average is, and has some sense of why statistics might be useful to them in their chosen field.
- Some students have lots of prior coding and programming experience, including experience with R. Some have never written a line of code in their life. We assume only that everyone is willing to learn how to do modern statistical work, and that means writing computer code, but that some people will be starting from nothing.
- Some students have lots of prior experience with biological and biomedical science, and know a lot of useful things in those areas which relate directly to our work. Others have zero experience in this area, and will learn a lot from their colleagues in this regard. We assume only that everyone is willing to learn, and put in some effort to do so.
People take this course with a wide range of backgrounds and a common interest in using data effectively in research related to biology, health or medicine. There will be multiple people in the class who are years away from their last statistics class, and the vast majority of students will have no prior experience using R, or any meaningful recollection of using statistical software. The pace can be brisk at times, but all CWRU students who feel up to it are welcome, regardless of their field of study or prior experience.
1.4 What will we learn in 432?
If you have specific questions about 432 not addressed here, just ask Dr. Love.