Problem set 07

Assigned Friday 11/25, due Friday 12/6 at 10:45am on GitHub Classroom.

  1. Start this problem set early! That way you will have time to get help at the Spinelli Center tutoring hours if you run into trouble.
  2. I will not answer your questions until you’ve been to the Spinelli Center Sun-Thurs 7-9pm tutoring hours first.
    • It will be very difficult for me to help 68 students individually, so please help me spread the workload. If you’re stuck, get help at the Spinelli Center first: that way they can handle the easy questions, leaving only the harder ones for me. Before answering any of your questions, I will ask whether you’ve already been to the Spinelli Center.
    • Note that tutoring from 6pm on Tuesday November 26th through noon on Sunday December 1st is cancelled.
  3. GitHub frustrations are normal. Git and GitHub are extremely powerful tools, but they can be very frustrating at times. Check out the title of this zine made by Julia Evans for advanced users.

1. GitHub component

Note: I sent an email on Saturday inviting you to the Fall 2019 SDS192 Intro to Data Science GitHub Classroom Organization. Please accept this invitation.

For this PS you will be using the “Happy Git and GitHub for the useR” book to:

  1. Get familiar with GitHub. Read:
    • Chapter 1: Why Git? Why GitHub?
  2. Install Git on your computer and set it up. Read and perform all steps in:
    • Chapter 4: Register a GitHub account (you’ve already done this)
    • Chapter 5: Install or upgrade R and RStudio (you’ve already done this)
    • Chapter 6: Install Git
    • Chapter 7: Introduce yourself to Git
  3. Locate the GitHub repository you created in Lec32.
    • You should be able to see it here and it should be named something like learning-github-YOUR_GITHUB_USER_NAME. So for example, my repo would be called learning-github-rudeboybert.
    • If you don’t see it, accept this invitation to receive one.
  4. “Clone” the above repo into RStudio on your computer. Read and perform all steps in:
    • Chapter 12: Connect RStudio to Git and GitHub. For Section 12.2 ignore what’s in the book and do the following:
      1. You will not “Make a repo on GitHub,” but rather use the learning-github-YOUR_GITHUB_USER_NAME repo from the previous step.
      2. Click the big green button “Create repository.”
      3. Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button.
  5. You do not need to submit anything. Completing Chapter 12 will automatically update your submission on GitHub Classroom.
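
For the “Introduce yourself to Git” step in Chapter 7, the setup amounts to a few shell commands. A minimal sketch, assuming Git is already installed and using placeholder values (the name Jane Doe and the address jane@example.com are hypothetical; use your own name and the email tied to your GitHub account):

```shell
# Placeholder identity: replace both values with your own.
# The email should match the one on your GitHub account.
git config --global user.name "Jane Doe"
git config --global user.email "jane@example.com"

# Print the global settings to confirm both values were saved.
git config --global --list
```

Run these in a shell, for example the Terminal tab in RStudio. If the last command lists your name and email, Git knows who you are.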

2. Hint

How do I know if I’ve successfully completed the assignment?

  1. Go to the SDS192 Intro to Data Science GitHub organization and make sure you see your learning-github-YOUR_GITHUB_USER_NAME repo.
  2. Then open your learning-github-YOUR_GITHUB_USER_NAME repo and look at your README. It should contain the line “This is a line from RStudio”.
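
If you have the repo cloned on your computer, you can run a similar check on your local copy from a shell. A minimal sketch, assuming the README is a file named README.md (an assumption; use whatever your README is actually called), with check_readme as a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the given file contains the expected line.
check_readme() {
  # -q: print nothing, only set the exit status
  # -s: stay silent if the file is missing
  # -F: match the text literally rather than as a pattern
  grep -qsF "This is a line from RStudio" "$1"
}

# Example use from inside your cloned repo folder (README.md is an assumed file name).
if check_readme README.md; then
  echo "Line found in README.md"
else
  echo "Line not found (or README.md is missing)"
fi
```

This only inspects the copy on your computer. What counts for the assignment is the README shown on GitHub, so make sure your change has been pushed.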

Problem set 06

Assigned Monday 11/11. Two components:

  • R component. A .zip file of your PS06 RStudio Project folder. Due on Friday 11/15 at 10:45am on Moodle.
  • Quiz. In-class on Monday 11/18. To do: Read the following Buzzfeed.com article on DataCamp (an online platform for learning data science). We’ll be having a brief in-class ethics discussion based on this article. Added Sun 11/17: For context, DataCamp is an online and interactive platform for learning data science, featuring courses in R, Python, SQL, and Git.

1. R Markdown component

Learning goals:

  • Get experience with maps by reproducing the two plots shown in this example.
  • Understand absolute vs relative file paths
  • Start creating work that’s reproducible

To do:

  • Download the following template RStudio Project folder: PS06.zip.
  • Be sure to be in RStudio Project mode.
  • Knit it once and read over the instructions.
  • Complete the problem set.
  • Submit a .zip file of all the contents of your PS06 RStudio Project folder on Moodle.

Hints:

  1. If your R Markdown file won’t “knit”, go through these 6 R Markdown Fixes first, then seek assistance. In my experience, these 6 fixes resolve 85% of issues.

2. Solutions & (imperfect) rubric

Solution:

Copy this PS06_solutions.Rmd R Markdown solutions file and put it in your PS06 RStudio Project folder.

Rubric:

  1. Quiz: 2 pts
  2. R component:
    • Is your PS06.Rmd reproducible by the graders? 4 pts (we will be strict on this)
    • Are your two graphs identical to the ones in the example? 6 pts

Total: 12pts


Problem set 05

Assigned Friday 10/18, due Friday 10/25 at 10:45am on Moodle.

1. R Markdown component

  • Read solutions to previous PS: PS04_solutions.html HTML report.
  • Download the following Rmd template file: PS05.Rmd.
  • Knit it once and read over the instructions.
  • Complete the problem set.
  • Submit the resulting PS05.html file on Moodle. Note: you are only submitting the output HTML report file and not the original R Markdown file.

Hints:

  1. If your R Markdown file won’t “knit”, go through these 6 R Markdown Fixes first, then seek assistance. In my experience, these 6 fixes resolve 85% of issues.

2. Solutions & (imperfect) rubric

Solution:

Rubric:

As stated in lecture, you must always label your axes and add a title. This makes the context of the data easy for the reader to understand.

  1. Q1: 3 + 1 + (3 + 2 + 1) = 10pts
  2. Q2: 2 + 2 (+ 2 Bonus) = 4pts

Total: 14pts


Problem set 04

Assigned Friday 10/11, due Friday 10/18 at 10:45am on Moodle.

1. R Markdown component

  • Read solutions to previous PS: PS03_solutions.html HTML report.
  • Download the following Rmd template file: PS04.Rmd.
  • Knit it once and read over the instructions.
  • Complete the problem set.
  • Submit the resulting PS04.html file on Moodle. Note: you are only submitting the output HTML report file and not the original R Markdown file.

Hints:

  1. If your R Markdown file won’t “knit”, go through these 6 R Markdown Fixes first, then seek assistance. In my experience, these 6 fixes resolve 85% of issues.

2. Solutions & (imperfect) rubric

Solution:

Rubric:

  1. Q1: 2 + 5 + 2 = 9pts
  2. Bonus: 1 + 1 = 2pts

Total: 9pts


Problem set 03

Assigned Friday 9/20, due Friday 9/27 at 10:45am on Moodle.

1. R Markdown component

  • Download the following Rmd template file: PS03.Rmd.
  • Complete the problem set.
  • Submit the resulting PS03.html file on Moodle. Note: you are only submitting the output HTML report file and not the original R Markdown file.

Hints:

  1. If your R Markdown file won’t “knit”, go through these 6 R Markdown Fixes first, then seek assistance. In my experience, these 6 fixes resolve 85% of issues.

2. Podcast quiz component

Listen to the Not So Standard Deviations podcast episode 71 “Compromised Shoe Situation” (1h4m) available here. In class on Friday 9/27, you’ll be given a brief quiz on this podcast.

Notes:

  1. The quiz will not be difficult. As long as you listen to the podcast once, you’ll be fine.
  2. IMO you don’t need to take notes. Listen while folding laundry or exercising!
  3. This podcast relates to “data collection”, which will be a theme for Mini-Project 1.

3. Solutions & (imperfect) rubric

Solution:

Rubric:

  1. Q1: 1 + 4 + 1 = 6pts
  2. Q2: 1 + 2 + 1 + 3 + 1 = 8pts
  3. Podcast quiz: 1pt

Problem set 02

Assigned Friday 9/13, due Friday 9/20 at 10:45am on Moodle.

1. R Markdown component

  • Download the following Rmd template file: PS02.Rmd.
  • Upload the PS02.Rmd file to RStudio Cloud.
  • Complete the problem set.
  • Submit the resulting PS02.html file on Moodle. Note: you are only submitting the output HTML report file PS02.html and not the original PS02.Rmd R Markdown file.

Hints:

  1. The resulting graph should be similar to the one here.
  2. For this problem set, I suggest you not try to write code from scratch, but rather take the copy/paste/tweak approach: find code that does something similar to what you want, copy it, paste it, and change it just enough so that it does what you want. As you get more comfortable, I suggest slowly trying to write code from scratch.
  3. Here is the screencast demo I recorded in class on the problem set workflow:


2. Solutions & (imperfect) rubric

Solutions:

Rubric:

  1. Q1: 4pts (one for each of the four aesthetics)
  2. Q2: 1pt

Problem set 01

Assigned Monday 9/9, due Friday 9/13.

  1. Complete the following confidential Pre-Course Questionnaire online for Prof. Baumer, Prof. Crouser, and my Smith College IRB-approved study.
  2. If you don’t already have one, create an account on GitHub using your smith.edu (or Five College) email address.
  3. Complete the following intro survey online with info about you. This survey is not part of the above study and is used by me to get to know you better.
  4. Complete a syllabus quiz.
    • Print out the following Google Doc.
    • Answer the questions based on information in the syllabus.
    • Submit your printout in class on Friday. If you can’t make it, have one of your peers submit it for you.