431 Project B Instructions
What is Project B?
Project B is the second of two real data science projects you’ll be doing this semester. It involves the completion of four tasks, which you’ll start working on at the start of November.
- You will complete a Registration Form to obtain my approval for what you propose, let me know if you’re working with a partner, and schedule your oral presentation.
- You will build Quarto and HTML reports describing your work.
- You (and your partner, if applicable) will present your project to Dr. Love, either in person in his office or via Zoom, sometime between December 6 and December 11. Presentations will be scheduled immediately after the Project B Registration Forms have been submitted.
- Finally, you will complete a Self-Evaluation form.
What’s on this Website
- The Data: Instructions on getting data for Project B
- You’ll use data either from NHANES or from some other source.
- Registration information proposing your Project B.
- This involves completing a Google Form by the deadline in mid-November specified on the Course Calendar. In this form, you will:
- specify whether or not you are working with a partner
- tell us a little about the data source (NHANES or other) you intend to use
- provide options for when you can give your oral presentation, and whether you prefer to do so in person or via Zoom
- Instructions for Study 1
- You’ll find information on required Study 1 analyses
- We also provide detailed Study 1 report specifications
- You’ll also find a Study 1 sample report
- Instructions for Study 2
- You’ll find information on required Study 2 analyses
- We also provide detailed Study 2 report specifications
- You’ll also find a Study 2 sample report
- Self-Evaluation Form for Project B
- If you work with a partner, each of you submits this form separately.
- A Checklist of the tasks that need to be accomplished for Project B, which also includes some details on the oral presentation you’ll give to Dr. Love in December.
- A Tip Sheet of about 20 things that have come up in the past that are worth your attention as you prepare your final materials for presentation and submission.
- The top menu also provides links to contact us, and to the 431 home page.
All of the material you need (from a statistical and coding perspective) to do Project B has been or will be covered in our first 24 classes (that is, through the last class before the Thanksgiving Break), as well as in the Course Book Chapters 1-22 and Labs 1-6.
Project B Deliverables
- You will complete a Registration Form to obtain my approval for your proposed work, let me know if you’re working with a partner, and schedule your oral presentation, by the (mid-November) deadline on the Course Calendar.
- You (and your partner, if applicable) will present your project to Dr. Love, in his office or via Zoom. Details on the Oral Presentation are found in the Checklist menu above. Presentations will take place December 6-11 and will be scheduled using the Registration Form.
- You will build two Quarto and HTML reports (separate reports for Study 1 and Study 2) by the Project B Portfolio deadline in the Course Calendar.
- If you’re not using NHANES data, you’ll also submit your data to Dr. Love at that time.
- Finally, you will complete a Self-Evaluation form, by the Project B Portfolio deadline in the Course Calendar.
Partnerships?
You can work alone, or with one other person on this project. If you work as a pair, you will commit to that when you register for the project. Each of you will receive the team grade for the project reports, and an individual grade for the other components of the project.
The Data
You will work with the same data source for Study 1 and for Study 2, and these data will be developed either from NHANES or from another public source that you identify.
- You will find detailed instructions regarding the use of NHANES data for Project B in The Data section of this website.
- If you want to use other data, you’ll need it to meet some specifications we’ll describe, and you’ll have to get Dr. Love’s permission when you register your project.
- Since most people consider working with NHANES data to be easier, we will award four extra points to projects which use non-NHANES data.
Study 1
- Study 1 is about making descriptive and exploratory comparisons and summaries of data. It’s not about building sophisticated statistical models.
- You will ingest, merge and clean the data in R, then select variables to complete any four out of five potential analyses, as described in these instructions.
- You can do all five analyses if you like (as preparation for Quiz 2, for instance) but you will only present four in your report. There is no bonus credit for doing all five analyses.
- Dr. Love has developed Study 1 Report Specifications and a Study 1 Sample Report which should guide your eventual submitted Study 1 report.
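As a rough illustration of the ingest, merge and clean step, here is a minimal sketch in base R. The two small data frames and the variable names `age` and `sbp` are made up for illustration. `SEQN` is the respondent identifier that NHANES files share, but none of the values below are real data. Your own variables and cleaning decisions will differ, and the Study 1 instructions govern what is actually required.

```r
# Hypothetical stand-ins for two ingested tables. Real NHANES files share
# the respondent identifier SEQN, but every value below is made up.
demo <- data.frame(
  SEQN = c(101, 102, 103, 104),
  age  = c(34, 51, NA, 45)
)
exam <- data.frame(
  SEQN = c(101, 102, 103, 105),
  sbp  = c(118, 135, 122, 140)
)

# Merge: keep only respondents who appear in both tables.
merged <- merge(demo, exam, by = "SEQN")

# Clean: keep rows with complete data on the variables to be analyzed.
clean <- merged[complete.cases(merged[, c("age", "sbp")]), ]

nrow(merged)        # respondents found in both tables
nrow(clean)         # respondents remaining after dropping missing values
summary(clean$sbp)  # a simple descriptive summary
```

Counting rows before and after each step, as above, makes it easy to report in your write-up how many observations were lost to merging and to missing data.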
Study 2
- Study 2 is about building a model and making predictions. You will complete all elements of a data science project designed to create a statistical model for a quantitative outcome, then use it for prediction, and assess the quality of those predictions.
- Study 2 involves working with data from the same source that you used for Study 1. Again, you will work through all cleaning and data management requirements in your Study 2 report.
- In Study 2, you will predict a quantitative outcome using a key predictor and some additional predictors in two linear regression models, and then compare those two models.
- All of the material you need (from a statistical and coding perspective) to do these analyses has been or will be covered in our first 24 classes and in the Course Notes.
- Dr. Love has developed Study 2 Report Specifications and a Study 2 Sample Report which should guide your eventual submitted Study 2 report.
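To make the idea of fitting and comparing two linear regression models concrete, here is a minimal sketch in base R using simulated data. Several things here are assumptions rather than requirements: the names `outcome`, `key_pred` and `extra_pred` are hypothetical, the smaller model is assumed to use only the key predictor, and a held-out test sample with root mean squared prediction error is just one common way to assess prediction quality. The Study 2 instructions and report specifications determine what you actually need to do.

```r
set.seed(431)

# Simulated data. 'outcome', 'key_pred' and 'extra_pred' are hypothetical names.
n <- 200
dat <- data.frame(key_pred = rnorm(n), extra_pred = rnorm(n))
dat$outcome <- 2 + 1.5 * dat$key_pred + 0.8 * dat$extra_pred + rnorm(n)

# Hold out 25% of the rows to assess prediction quality (an assumed design).
test_rows <- sample(seq_len(n), size = n / 4)
train <- dat[-test_rows, ]
test  <- dat[test_rows, ]

# Model 1: key predictor only. Model 2: key predictor plus the extra predictor.
m1 <- lm(outcome ~ key_pred, data = train)
m2 <- lm(outcome ~ key_pred + extra_pred, data = train)

# Root mean squared prediction error on the held-out rows.
rmse <- function(model, newdata) {
  sqrt(mean((newdata$outcome - predict(model, newdata = newdata))^2))
}

c(m1 = rmse(m1, test), m2 = rmse(m2, test))
```

Comparing the two error values on data the models have not seen is what distinguishes assessing predictions from simply describing in-sample fit.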
Grading
Project B will be graded by Dr. Love on a scale from 0 to 150 points.
- On-time successful completion of the Registration Form is worth 15 points.
- The two study reports (Study 1 and Study 2) due at the final Project B deadline are worth a combined 60 points.
- The oral presentation is also worth 60 points. Details on the Oral Presentation are found in the Checklist.
- The self-evaluation is worth 15 points.
- Late work on Project B is unacceptable. All deadlines are in the Course Calendar.
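As a quick check on how the pieces add up, the point values listed above can be totaled in R. This is only an arithmetic sketch; the component labels are shorthand chosen here, and the four extra points for non-NHANES data mentioned in The Data section are not included.

```r
# Point values as stated in the grading list above.
points <- c(registration = 15, reports = 60,
            presentation = 60, self_evaluation = 15)

total <- sum(points)
total                          # 150

# Share of the total carried by each component, in percent.
round(100 * points / total)
```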
Dr. Love will provide no written feedback on your Project B work, because the grading timeline is simply too tight. I apologize in advance.
Questions?
If you have questions, let us know about them on Campuswire using the projectB folder, or speak with Dr. Love before or after class, or discuss them with the TAs during office hours.