Mon Oct 24, 2016

## Format

• Wed Oct 26, 7:30pm-10:00pm, in Warner 506.
• I'm going to try to target it so the median completion time is about 1h15m
• Closed book, no calculators, but you may bring the dplyr cheatsheet.
• You won't need to write 100% correct R code; rough pseudocode is enough.

## Pseudocode & Algorithms

• Pseudocode is informal and rough code that doesn't necessarily need to work, but still illustrates each step of your algorithm.
• An algorithm is just a computer recipe: a process or set of rules to be followed in calculations or other problem-solving operations.
• Example
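As one possible illustration, here is pseudocode for "find the mean departure delay per carrier", followed by R code that carries it out. The data frame `flights_small` and its columns are invented for this sketch and are not taken from the lectures or problem sets.

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
flights_small <- data.frame(
  carrier = c("AA", "AA", "UA", "UA", "UA"),
  dep_delay = c(10, 20, 5, NA, 15),
  stringsAsFactors = FALSE
)

# Pseudocode (the level of detail to aim for on paper):
#   take flights_small
#   drop rows where dep_delay is missing
#   group rows by carrier
#   for each group, compute the mean dep_delay
#   sort from largest mean to smallest

# One way to turn that pseudocode into working dplyr code
mean_delays <- flights_small %>%
  filter(!is.na(dep_delay)) %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(dep_delay)) %>%
  arrange(desc(mean_delay))
```

Each pseudocode line maps to one step in the pipeline, which is the "illustrates each step of your algorithm" idea above.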

## Sources

• Lectures 01 through 16 inclusive
• Read the slides from each lecture to get the executive summary
• Corresponding textbook material
• Learning check discussions
• Problem Sets!

## Sources: Problem Sets

• Problem Sets 01-05: Go through them all. You are now in a position to understand all data manipulations.
• Problem Set 06: Use it as practice, since it is a synthesis of all the data manipulation tools we've seen.
• Instructions:
• Separate out what you are going to do from how you are going to do it, i.e. set up a plan first.
• I highly recommend you work in groups for this, especially the brainstorming stage.

## Data Visualization

• For any kind of situation/data, be able to identify which of the 5NG is most appropriate to convey the information contained in the variables
• For each of the 5NG understand the Grammar of Graphics: data, aes(VARIABLE_NAME), geom_WHATEVER
• How can faceting help?
• Be able to both:
• A: Forward engineer graphs: I give you tidy data, you write out a rough ggplot() call and/or draw the graph
• B: Reverse engineer graphs: I give you the graph, you write out a rough ggplot() call and/or the tidy data
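A minimal sketch of the forward-engineering direction, assuming a made-up tidy data frame `heights` (its name, columns, and values are hypothetical). It shows the three Grammar of Graphics pieces (data, `aes()` mapping, geom) plus one facet:

```r
library(ggplot2)

# Hypothetical tidy data: one row per observation, one column per variable
heights <- data.frame(
  age = c(10, 12, 14, 10, 12, 14),
  height_cm = c(138, 149, 160, 139, 151, 165),
  sex = c("F", "F", "F", "M", "M", "M"),
  stringsAsFactors = FALSE
)

# data + aes() mapping + geom: a scatterplot of height against age
p <- ggplot(data = heights, mapping = aes(x = age, y = height_cm)) +
  geom_point() +
  # Faceting: draw the same scatterplot once per value of sex
  facet_wrap(~sex)

# print(p) would draw the two-panel plot
```

Reverse engineering runs the same mapping backwards: from the axes, the marks, and the panels, work out which columns the tidy data must have had.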

## Data Manipulation

• Understand the 5MV + joins. The images on the dplyr cheatsheet illustrate these well.
• IMO the best way to study these is to learn by doing.
• Go over examples of data manipulation in the learning checks, the textbook, and Problem Sets and see if you can reconstruct them on your own.
• If you can get them working in R, then you're definitely able to write the pseudocode.
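For practice, here is a sketch that chains several dplyr verbs and a join, written first as pseudocode and then as R. The tables `sales` and `stores` are hypothetical toy data invented for this example, not from any problem set.

```r
library(dplyr)

# Hypothetical toy tables, invented for this sketch
sales <- data.frame(
  store_id = c(1, 1, 2, 2, 3),
  units = c(5, 7, 2, 4, 9),
  price = c(2.0, 2.0, 3.5, 3.5, 1.0),
  stringsAsFactors = FALSE
)
stores <- data.frame(
  store_id = c(1, 2, 3),
  city = c("Burlington", "Middlebury", "Rutland"),
  stringsAsFactors = FALSE
)

# Pseudocode:
#   keep sales rows with at least 4 units
#   add a revenue column = units * price
#   total the revenue per store
#   attach each store's city
#   sort by total revenue, largest first
#   keep only the city and total revenue columns

store_revenue <- sales %>%
  filter(units >= 4) %>%
  mutate(revenue = units * price) %>%
  group_by(store_id) %>%
  summarise(total_revenue = sum(revenue)) %>%
  left_join(stores, by = "store_id") %>%
  arrange(desc(total_revenue)) %>%
  select(city, total_revenue)
```

Try covering the R code and rewriting it from the pseudocode alone, then do the reverse.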