Data visualization (pink): Grammar of Graphics, Five Named Graphs (5NG), color theory.
Working with data (blue): data wrangling, importing, and formatting
Maps and spatial data (green): Maps and geospatial data.
Learning how to learn new data science tools (yellow): SQL, TBD.
Note that while topics and topic dates may change, all problem set (PS), project, and midterm dates will not.
Lec 06: Fri 9/17
Announcements
Oh snap! @SmithCollegeSDS is on a hiring spree! Put 👀 on these 3⃣ tenure track positions, apps due:
- 10/8 Biostatistics, statistics, or related
- 10/15 joint hire with the Math dept
- 10/22 candidates with a Ph.D. in stats, CS, information sciences, math, or related
https://t.co/RCmtlSzg3S
Histograms for visualizing the distribution of a numerical variable
Section 1 (Stoddard G6) Demo
Section 2 (Sabin-Reed 220) Demo
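For reference, here is a minimal sketch of a histogram in ggplot2. It is not the code from either section's demo: the built-in `faithful` dataset and the choice of 20 bins are assumptions made only so the example runs on its own.

```r
library(ggplot2)

# `faithful` is a dataset built into R (Old Faithful geyser eruptions).
# It is a stand-in here, not the data used in lecture.
hist_plot <- ggplot(data = faithful, mapping = aes(x = waiting)) +
  # bins = 20 is an arbitrary choice; try other values and compare
  geom_histogram(bins = 20, color = "white") +
  labs(x = "Waiting time between eruptions (minutes)", y = "Count")

# Printing the object draws the plot
hist_plot
```

Changing `bins` changes how coarse or fine the distribution looks, so it is worth trying a few values before settling on one.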
2. In-class exercise
If you still haven’t been able to “Knit to PDF”, please ask for help.
Go over ModernDive reading in schedule above.
Lec 05: Wed 9/15
Announcements
PS02 was posted after Monday’s lecture.
Today’s topics/activities
1. Chalk talk
Overplotting and two approaches for addressing it
Linegraphs
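A rough sketch of these chalk talk topics in ggplot2 follows. It assumes the two approaches are the ones covered in ModernDive (transparency and jittering). The `mpg` and `economics` datasets ship with ggplot2 and are stand-ins, not the data used in lecture. The `alpha`, `width`, and `height` values are arbitrary.

```r
library(ggplot2)

# `mpg` ships with ggplot2. City and highway mileage are whole numbers,
# so many of the 234 points sit exactly on top of each other.
base_plot <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy))

# Approach 1: make points semi-transparent so crowded spots look darker
alpha_plot <- base_plot + geom_point(alpha = 0.2)

# Approach 2: nudge each point by a small random amount
jitter_plot <- base_plot + geom_jitter(width = 0.3, height = 0.3)

# Linegraph: `economics` also ships with ggplot2 and has a date column,
# which suits a linegraph because the x-axis has a natural ordering
line_plot <- ggplot(data = economics, mapping = aes(x = date, y = unemploy)) +
  geom_line()

alpha_plot
jitter_plot
line_plot
```

Jittering changes where points are drawn, not the underlying data, so keep the nudges small enough that the plot is not misleading.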
2. In-class exercise
Explore the different formatting tools in R Markdown: go to RStudio top menu bar -> Help -> Markdown quick reference.
Sec01 in Stoddard: There was a typo in Step 8 of last lecture’s in-class exercise. If you weren’t able to Knit directly to PDF, please re-attempt Steps 8-9. Knitting directly to PDF, instead of Knitting to Word and then saving to PDF, is the preferred submission format for all problem sets. It will be less hassle for you and provide consistency for the graders.
Go over ModernDive reading in schedule above.
Lec 04: Mon 9/13
Announcements
Problem Set 02 due next Monday 5pm, now posted under Problem Sets
Today’s topics/activities
1. Chalk talk
Recap of previous lecture
“Where can I save all the code I run in class?” In an R Markdown .Rmd file; R Markdown is a tool for reproducible research
Input: An .Rmd file
Output: An .html, .docx, or .pdf file.
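The input/output idea can also be sketched from the R console. In class you will use the “Knit” button instead. This example assumes the rmarkdown package and pandoc are installed (RStudio bundles pandoc). The file contents below are made up for illustration.

```r
library(rmarkdown)

# Input: write a tiny .Rmd file to a temporary folder.
# The title and body text are placeholders.
rmd_path <- file.path(tempdir(), "testing.Rmd")
writeLines(
  c(
    "---",
    'title: "My first R Markdown report"',
    "output: html_document",
    "---",
    "",
    "Hello from R Markdown."
  ),
  rmd_path
)

# Output: render() converts the .Rmd and returns the path of the new file.
# Swapping in "word_document" or "pdf_document" would produce .docx or .pdf
# (PDF output additionally needs a LaTeX installation).
html_path <- render(rmd_path, output_format = "html_document", quiet = TRUE)
html_path
```

Clicking “Knit” in RStudio runs essentially the same conversion, so the same .Rmd file can produce any of the three output formats.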
2. In-class exercise
In-class battle-testing and practicing for PS02:
At a couple of steps in this process, you will be asked to install packages. Say yes to all of them.
If at any point your code won’t knit, go through these 6 R Markdown Fixes first, then seek assistance. These 6 fixes will resolve 85% of issues.
Create new R Markdown .Rmd file:
Go to RStudio menu bar -> File -> New File -> R Markdown
Set “Title” to “My first R Markdown report” and “Author” as your name.
Save this file as testing somewhere on your computer. This will create a file called testing.Rmd.
Method 1: “Knit” a report to HTML:
Click the arrow next to “Knit” -> “Knit to HTML”.
An HTML webpage should pop up. However, it may be blocked by your browser. If so, in your browser’s URL bar, click on “Always allow pop-ups”.
Method 1: Publish HTML report on web:
Click on blue “Publish” button on top right of the resulting pop-up html.
Select RPubs.
If you haven’t previously, create an account on RPubs.com. If you have, log in.
Set “Title” to “My first R Markdown report” and “Slug” to “testing”
You should end up with a webpage that looks like this one. This is live on the web!
Method 1: Update HTML report on web:
Make some trivial change to your testing.Rmd file.
“Re-knit” your report and make sure your trivial change is reflected.
The blue “Publish” button should now read “Republish”
Click “Update existing”
Your updates are now live on the web!
Method 2: “Knit” a report to Word
Click the arrow next to “Knit” -> “Knit to Word”.
Save the resulting Word document as a pdf file.
Only if you are a macOS user:
Next to “Console” go to “Terminal”
Run this line of code:
sudo chown -R `whoami`:admin /usr/local/bin
Enter your password. Note: Terminal behaves oddly here: as you type your password, the cursor will not move. Don’t worry, your password is registering.
Sunday 9/19 at 11:50AM: Opportunities in Statistics & Data Science in Academia, Government, & Non-Profit featuring SDS’s Prof. Randi Garcia!
Keynote address by Robert Santos, 116th President of the ASA, and President Biden’s nominee to serve as Director of the United States Census Bureau! If approved by the Senate, he would be the first Latinx Director of the Bureau!
Today’s topics/activities
1. Chalk talk
Recap of previous lecture
Grammar of Graphics
5NG1: Scatterplots
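As a rough illustration of how the Grammar of Graphics maps onto ggplot2 code, here is a minimal scatterplot sketch. The `mpg` dataset ships with ggplot2 and is an assumption made for this example, not the data used in lecture.

```r
library(ggplot2)

# The three core pieces of the Grammar of Graphics:
#   data: the data frame containing the variables (`mpg`, a stand-in)
#   aes:  which variables map to which aesthetics (x, y, color)
#   geom: the geometric object drawn (points, for a scatterplot)
scatter_plot <- ggplot(
  data = mpg,
  mapping = aes(x = displ, y = hwy, color = class)
) +
  geom_point() +
  labs(x = "Engine displacement (litres)", y = "Highway miles per gallon")

scatter_plot
```

Swapping `geom_point()` for another geom changes the type of graph while the data and aesthetic mappings stay the same.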
Next time:
Question: Do I need to re-type my code in the Console every single time?
Problem Set 01 due this Monday 5pm, posted under Problem Sets.
Today’s topics/activities
1. Chalk talk
Intro to Slack
What is the difference between R and RStudio?
What are R packages?
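A short sketch of the two steps involved in using a package is below. ggplot2 is used only as an example, and installing needs an internet connection and a CRAN mirror.

```r
# Step 1: install the package. This is needed only once per computer,
# so the check below skips it when ggplot2 is already installed.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Step 2: load the package. This is needed every time you start R,
# before you can use the package's functions.
library(ggplot2)
```

Forgetting the second step is a common source of “could not find function” errors.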
2. In-class exercise
Go over ModernDive reading in schedule above.
About readings in this course:
You are responsible for completing a lecture’s readings before the next lecture. Ex: you are responsible for reading all of ModernDive Chapter 1 before Wednesday.
I teach lectures assuming you have not done the readings beforehand. However, if it suits your learning style better, please do read beforehand.
While you don’t need to turn in your learning check answers, I highly recommend you still do them. The solutions are in Appendix D of the book.
If you have your headphones, you may listen to music during in-class reading time.
Lec 01: Fri 9/3
Announcements
Welcome!
Today’s topics/activities
Course webpage: bit.ly/sds192kim
My story
“Knock on wood if you’re with me”
What this class is about: Answering questions with data