Mon Oct 24, 2016
Format
- Wed Oct 23, 7:30pm-10:00pm, in Warner 506.
- I'm going to try to target it so the median completion time is about 1h15m
- Closed book, no calculators, but you may bring
dplyr
cheatsheet.
- You won't need to write 100% correct R code, but rather rough pseudocode
Pseudocode & Algorithms
- Pseudocode is informal and rough code that doesn't necessarily need to work, but still illustrates each step of your algorithm.
- An algorithm is just a computer recipe: a process or set of rules to be followed in calculations or other problem-solving operations.
- Example
Sources
- Lectures 01 through 16 inclusive
- Read the slides from each lecture to get the executive summary
- Corresponding textbook material
- Learning check discussions
- Problem Sets!
Sources: Problem Sets
- Problem Sets 01-05: Go through them all. You are now in a position to understand all data manipulations.
- Problem Set 06 as practice for data manipulation as it is a synthesis of all data manipulation tools we've seen.
- Instructions:
- Separate out what you are going to do from how you are going to do it. i.e. set up a plan
- I highly recommend you work in groups for this, especially the brainstorming stage.
Data Visualization
- For any kind of situation/data, be able to identify which of the 5NG is most appropriate to convey the information contained in the variables
- For each of the 5NG understand the Grammar of Graphics:
data
, aes(VARIABLE_NAME)
, geom_WHATEVER
- How can faceting help?
- Be able to both:
- A: Forward engineer graphs: I give you tidy data, you write out a rough
ggplot()
call and/or draw the graph
- B: Reverse engineer graphs: I give you the graph, you write out a rough
ggplot()
call and/or the tidy data
Data Manipulation
- Understand the 5MV + joins. The images on the
dplyr
cheatsheet illustrate these well.
- IMO the best way to study these to learn by doing.
- Go over examples of data manipulation in the learning checks, the textbook, and Problem Sets and see if you can reconstruct them on your own.
- If you can get them working in R, then you're definitely able to write the pseudocode.