In order to break down the task and minimize end-of-semester stress, you’ll be working on the project in four phases (see course schedule for due dates):

  1. Project proposal: Propose a data set for your project.
  2. Project EDA: Perform an exploratory data analysis using visualizations and numerical summaries.
  3. Project initial submission: Make an initial submission of your project. You will skip some of the sections for now and only complete them after we have covered inference for regression in class. After you submit your work, you will get instructor feedback.
  4. Project resubmission: Incorporate your instructor feedback from the initial submission phase, complete the remaining sections, and resubmit your project. You will only be graded on your project resubmission.

1. Project proposal

Form groups

  1. Form groups of 2-3 students.
    • Preferably all group members are in the same lab section.
    • All group members are expected to contribute, and you will all be held accountable for your contributions in peer evaluations.
    • If you need a group, fill out this Google Form by Wed 2/16 at 5pm. I will assign you a group before lab.
  2. Assign a group leader. The group leader will have a few extra administrative duties at each phase, for example making submissions and filling out Google Forms.

Select a data set

You have two options for your data set:

  1. Choose from these 10 vetted datasets. The variables fit the requirements for this project and the data require no additional data wrangling.
  2. If you would like to use your own data, it must have:
    1. An identification variable that uniquely identifies each observation in each row.
    2. A numerical outcome variable \(y\). Note: binary outcome variables with 0/1 values are not technically numerical.
    3. Two explanatory variables:
      1. A numerical explanatory variable \(x_1\). Note: this can be some notion of time.
      2. A categorical explanatory variable \(x_2\) that has between 3 and 5 levels. Note: If this variable has more than 5 levels, they can be collapsed into 5 using data wrangling later.
    4. At least 50 observations/rows.
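If you bring your own data, the requirements above can be checked with a few lines of base R. The following is only a minimal sketch, not an official course tool: the helper names `check_project_data()` and `collapse_levels()`, the toy data frame, and all column names are hypothetical placeholders. The "more than 2 distinct values" test is just a rough heuristic for ruling out 0/1 outcomes.

```r
# Hypothetical helper: checks a data frame against the requirements listed above.
# id, y, x1, x2 are the names (as strings) of your own columns.
check_project_data <- function(data, id, y, x1, x2) {
  n_levels   <- length(unique(na.omit(data[[x2]])))
  n_y_values <- length(unique(na.omit(data[[y]])))
  c(
    id_is_unique         = !anyNA(data[[id]]) && anyDuplicated(data[[id]]) == 0,
    y_is_numerical       = is.numeric(data[[y]]) && n_y_values > 2,  # heuristic: excludes 0/1 outcomes
    x1_is_numerical      = is.numeric(data[[x1]]),
    x2_is_categorical    = is.character(data[[x2]]) || is.factor(data[[x2]]),
    x2_has_3_to_5_levels = n_levels >= 3 && n_levels <= 5,
    at_least_50_rows     = nrow(data) >= 50
  )
}

# Hypothetical helper: keeps the n_keep most common levels and lumps the rest
# into a single "Other" level, so n_keep = 4 leaves at most 5 levels.
# Only worth calling when the variable has more than 5 levels.
collapse_levels <- function(x, n_keep = 4, other = "Other") {
  counts <- sort(table(x), decreasing = TRUE)
  keep   <- names(counts)[seq_len(min(n_keep, length(counts)))]
  out    <- ifelse(x %in% keep, as.character(x), other)
  out[is.na(x)] <- NA  # keep missing values missing
  factor(out)
}

# Toy data (made up for illustration only)
set.seed(76)
toy <- data.frame(
  id    = 1:60,
  score = rnorm(60, mean = 70, sd = 10),
  hours = runif(60, min = 0, max = 20),
  major = rep(c("a", "b", "c", "d", "e", "f", "g"), times = c(20, 15, 10, 8, 4, 2, 1))
)

before <- check_project_data(toy, "id", "score", "hours", "major")
toy$major5 <- collapse_levels(toy$major)
after <- check_project_data(toy, "id", "score", "hours", "major5")
print(before)
print(after)
```

In this toy example the only check that fails before collapsing is the one on the number of levels (7 levels); after collapsing, the new variable has 5.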

What to submit

By Thu 2/24 9pm:

  1. The group leader will create a Slack direct message (DM) that includes 1) all your group members, 2) your instructor, and 3) your lab instructor. Please direct all questions to me and Beth here so everyone is on the same page.
  2. Only the group leader: Fill out this Google Form with your group information.

2. Project EDA

To Do

  1. Work on the final_project.Rmd
    1. Download file from Moodle
    2. Knit to HTML (not PDF) and read over once
    3. Complete Sections 1 (Introduction), 2 (EDA), and 5 (Honor Code) only. You will complete the other sections later.
    4. Only the group leader will make a single submission on behalf of the whole group (see below).
  2. Ask questions:
    1. For general project questions that would apply to all groups: ask in final_project channel in Slack
    2. For questions specific to your group: ask in group Slack DM you created in project proposal phase
  3. Be mindful of your group members
    1. Life happens, especially this semester. If something comes up, let your partners know. Do not ghost your partners.
    2. If an issue arises, do not resort to passive-aggressive electronic messages. Make a good-faith attempt to say how you feel in person.
    3. If the issue is still not resolved, Slack DM Prof. Kim.
  4. Tips:
    1. Knit early, knit often
    2. For tips on how to make your HTML document look professional, consult the R Markdown tips listed in RStudio Menu Bar -> Help -> Markdown Quick Reference
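For Section 2 (EDA), the kind of numerical summaries and visualizations meant here can be sketched in base R as follows. This is only an illustration under assumptions: the data frame `toy` and its columns `y`, `x1`, and `x2` are made up, and the template in final_project.Rmd may use different packages and functions, in which case follow the template.

```r
# Toy data (made up): numerical outcome y, numerical x1, categorical x2
set.seed(76)
toy <- data.frame(
  id = 1:60,
  x1 = rep(1:20, times = 3),
  x2 = rep(c("A", "B", "C"), each = 20)
)
toy$y <- 10 + 2 * toy$x1 + rnorm(60)

# Numerical summaries
summary(toy$y)                                            # five-number summary and mean of the outcome
group_means <- aggregate(y ~ x2, data = toy, FUN = mean)  # mean outcome within each level of x2
r <- cor(toy$x1, toy$y)                                   # correlation between x1 and y
print(group_means)
print(r)

# Visualizations
hist(toy$y, main = "Distribution of outcome", xlab = "y")  # distribution of y
plot(y ~ x1, data = toy, main = "Outcome vs. x1")          # y against the numerical variable
boxplot(y ~ x2, data = toy, main = "Outcome by x2")        # y against the categorical variable
```

Inside the .Rmd file, each of these pieces would go in its own R code chunk so that the output appears next to the text that discusses it.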

What to submit

By Thu 3/10 9pm, only the group leader will submit on Moodle:

  1. The final_project.Rmd R Markdown file. This file must knit.
  2. The final_project.html HTML output file

Extension requests count against all group members. For example, say Avon, Stringer, and Marlo are in a group. If Marlo fills out the extension request form requesting 2 days, then 2 days will be deducted from each of the three members’ 5 extension days.


3. Project initial submission

To Do

  1. Update the final_project.Rmd you submitted last time
    1. Update the template code for the Multiple Regression section with the template code below, i.e., delete all lines starting with # Multiple linear regression up to the line before # Discussion and replace them with the code below.
    2. Make changes based on the video feedback you received.
    3. Complete Sections 3 through 3.2, i.e., up to and including “Interpreting regression coefficients.” You will complete Sections 3.3, 3.4, and 4 during the next submission stage.
    4. Only the group leader will make a single submission on behalf of the whole group.
  2. Ask questions:
    1. For general project questions that would apply to all groups: ask in final_project channel in Slack
    2. For questions specific to your group: ask in group Slack DM you created in project proposal phase
  3. Tips:
    1. Knit early, knit often
    2. For tips on how to make your HTML document look professional, consult the R Markdown tips listed in RStudio Menu Bar -> Help -> Markdown Quick Reference
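As a reminder of what "interpreting regression coefficients" involves for one numerical and one categorical explanatory variable, here is a minimal base R sketch. It is not the course template code: the toy data, the variable names, and the choice of a parallel slopes model (`y ~ x1 + x2`) are assumptions made for illustration, and your template may fit a different model.

```r
# Toy data (made up): three groups with different intercepts and a shared slope
set.seed(76)
toy <- data.frame(
  x1 = rep(1:20, times = 3),
  x2 = rep(c("A", "B", "C"), each = 20)
)
toy$y <- 10 + 2 * toy$x1 + 5 * (toy$x2 == "B") - 3 * (toy$x2 == "C") + rnorm(60)

# Multiple regression with one numerical and one categorical explanatory variable
fit <- lm(y ~ x1 + x2, data = toy)
estimates <- coef(fit)
print(estimates)

# Reading the output:
#   (Intercept): fitted y for the baseline level "A" when x1 = 0
#   x1:          change in fitted y per one-unit increase in x1, same for every level of x2
#   x2B, x2C:    difference in intercept for levels "B" and "C" relative to baseline "A"
```

R picks the baseline level alphabetically unless the factor levels are set explicitly, so the first thing to check in your own output is which level is missing from the coefficient names.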

What to submit

By Thu 4/7 9pm only the group leader will submit on Moodle:

  1. The updated final_project.Rmd R Markdown file. This file must knit not only for you, but also for other people trying to replicate your analysis. This is known as reproducible research.
  2. The updated final_project.html HTML output file
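Two common reasons an .Rmd file knits on one computer but not on another are file paths that only exist on the author's machine and packages that the other person has not installed. The sketch below shows one way to check for both. The helper `is_absolute_path()`, the file paths, and the package list are all hypothetical examples, not part of the project template.

```r
# Hypothetical paths; replace with the files your project actually reads
data_paths <- c("data/my_data.csv", "C:/Users/me/Desktop/my_data.csv", "~/Downloads/my_data.csv")

# TRUE for paths that start at a drive letter, the filesystem root, or the home directory.
# Such paths usually do not exist on someone else's computer.
is_absolute_path <- function(path) {
  grepl("^([A-Za-z]:|/|~)", path)
}
path_flags <- is_absolute_path(data_paths)
print(data.frame(path = data_paths, absolute = path_flags))

# Hypothetical package list; replace with the packages your .Rmd loads via library()
needed <- c("stats", "utils", "notARealPackage12345")
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_packages <- needed[!installed]
print(missing_packages)
```

A path relative to the folder containing final_project.Rmd, such as the first one above, is the safer choice when someone else needs to knit the file.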

Extension requests count against all group members. For example, say Bodie, Wallace, and Poot are in a group. If Wallace fills out the extension request form requesting 2 days, then 2 days will be deducted from each of the three members’ 5 extension days.


4. Project resubmission

Here is a rough example (project requirements have changed so do not interpret this example too literally).

To Do

  1. Update the final_project.Rmd you submitted last time
    1. Make changes based on the video feedback you received.
    2. Complete all remaining sections of the project: 3.3 until the end.
    3. Only the group leader will make a single submission on behalf of the whole group.
  2. Ask questions:
    1. For general project questions that would apply to all groups: ask in final_project channel in Slack
    2. For questions specific to your group: ask in group Slack DM you created in project proposal phase
  3. Tips:
    1. Knit early, knit often
    2. For tips on how to make your HTML document look professional, consult the R Markdown tips listed in RStudio Menu Bar -> Help -> Markdown Quick Reference
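The remaining sections rely on inference for regression. As a rough sketch of what that looks like in base R, the code below extracts standard errors, p-values, and 95% confidence intervals from a fitted model. The toy data and the model are assumptions for illustration only; the functions and output format in your template may differ.

```r
# Toy data (made up) and a parallel slopes model
set.seed(76)
toy <- data.frame(
  x1 = rep(1:20, times = 3),
  x2 = rep(c("A", "B", "C"), each = 20)
)
toy$y <- 10 + 2 * toy$x1 + 5 * (toy$x2 == "B") - 3 * (toy$x2 == "C") + rnorm(60)
fit <- lm(y ~ x1 + x2, data = toy)

# Regression table: estimate, standard error, t statistic, and p-value per coefficient
coef_table <- summary(fit)$coefficients
print(coef_table)

# 95% confidence intervals for each coefficient
ci <- confint(fit, level = 0.95)
print(ci)
```

Each p-value tests the null hypothesis that the corresponding population coefficient is 0. A confidence interval that excludes 0 leads to the same conclusion at the matching significance level.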

What to submit

By Fri 5/6 2pm, do the following two things. Since college rules state no work can be accepted after this time, there are no extensions.

  1. Only the group leader will submit on Moodle:
    1. The updated final_project.Rmd R Markdown file. This file must knit not only for you, but also for other people trying to replicate your analysis. This is known as reproducible research.
    2. The updated final_project.html HTML output file
  2. All group members must complete the following Exit Survey and Peer Evaluation Google Form.