Fri Sep 23, 2016

Data Visualization

  • We are building up to doing data visualization in R via the ggplot2 package
  • Last time we reverse-engineered the grammar from graphical outputs
  • Today we (forward) engineer graphics from the grammar

Refresher: The Grammar of Graphics

A statistical graphic is a mapping of data variables to aes()thetic attributes of geom_etric objects.
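
To make the definition concrete, here is a minimal sketch in ggplot2. The data frame `toy` and its columns (`height`, `weight`, `group`) are made up for illustration and are not from the course materials; only `ggplot()`, `aes()` and `geom_point()` are actual ggplot2 functions.

```r
library(ggplot2)

# Hypothetical toy data: two numeric variables and one categorical variable
toy <- data.frame(
  height = c(150, 160, 170, 180),
  weight = c(50, 58, 69, 80),
  group  = c("x", "x", "y", "y")
)

# data variables -> aes()thetic attributes (x, y, color)
# of a geom_etric object (points)
p <- ggplot(data = toy, mapping = aes(x = height, y = weight, color = group)) +
  geom_point()

p
```

Each piece of the sentence shows up in the code: the data (`toy`), the mapping (`aes()`), and the geometric object (`geom_point()`).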

Refresher: 5NG

The five named graphs we'll see in this class

  1. Scatterplot AKA bivariate plot
  2. Line-graph
  3. Boxplot
  4. Barplot AKA Barchart AKA bargraph
  5. Histogram

Data

Consider the following data in tidy format:

A B C D
1 1 3 a
2 2 2 a
3 3 1 b
4 4 2 b
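
One way to get this table into R is to type it in as a data frame. This is only a sketch: the name `simple_ex` is an assumption chosen here, not something the slides define.

```r
# The four-row tidy data set from the slide, typed in by hand.
# Each row is an observation, each column is a variable.
simple_ex <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(1, 2, 3, 4),
  C = c(3, 2, 1, 2),
  D = c("a", "a", "b", "b")
)

simple_ex       # print the data
str(simple_ex)  # check the type of each variable
```

A, B and C are numerical and D is categorical, which matters when deciding which aesthetic each variable can be mapped to.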

Learning Check

Draw the 5 graphics below, where the x-axis is variable A and the y-axis is variable B:

  1. A scatter plot
  2. A scatter plot where the color of the points corresponds to D
  3. A scatter plot where the size of the points corresponds to C
  4. A line graph
  5. A line graph where the color of the line corresponds to D
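
After drawing these by hand, one possible way to check the sketches in ggplot2 is shown below. The data frame name `simple_ex` and the object names `base` and `p1` to `p5` are assumptions made for this example; the data are retyped so the block runs on its own.

```r
library(ggplot2)

# The tidy data from the Data slide (name `simple_ex` is an assumption)
simple_ex <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(1, 2, 3, 4),
  C = c(3, 2, 1, 2),
  D = c("a", "a", "b", "b")
)

# Shared mapping for all five graphics: A on the x-axis, B on the y-axis
base <- ggplot(data = simple_ex, mapping = aes(x = A, y = B))

p1 <- base + geom_point()                # 1. scatter plot
p2 <- base + geom_point(aes(color = D))  # 2. color of points mapped to D
p3 <- base + geom_point(aes(size = C))   # 3. size of points mapped to C
p4 <- base + geom_line()                 # 4. line graph
p5 <- base + geom_line(aes(color = D))   # 5. color of line mapped to D

# Print any one of them to display it, e.g.
p2
```

Note that in graphic 5, mapping color to the categorical variable D also splits the data into one line per value of D, so there are two short line segments instead of one line through all four points.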