Wed Sep 21, 2016

What is a statistical graphic?

Today we kick off Topic 2.b) Data Visualization by asking ourselves:

What is a statistical graphic?

But a brief lesson from military history first:

Napoleon's March on Russia in 1812

In 1812, Napoleon led a French invasion of Russia, marching on Moscow.

Napoleon's March on Russia in 1812

It was one of the biggest military disasters ever, in large part because of the Russian winter.

Minard's Illustration of the March

Famous graphical illustration of Napoleon's march to and from Moscow

Minard's Illustration of the March

This was considered a revolution in statistical graphics because between

  • the map on top
  • the line graph on the bottom

there are 6 dimensions of information (i.e. variables) being displayed on a 2D page.

The Grammar of Graphics

A statistical graphic is a mapping of data variables to aes()thetic attributes of geom_etric objects.

Minard's Illustration of the March

Where?         data                                   aes()   geom_
top map        longitude                              x       point
top map        latitude                               y       point
top map        army size                              size    path
top map        army direction (forward vs retreat)    color   path
bottom graph   date                                   x       line & text
bottom graph   temperature                            y       line & text
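
To make the top-map rows of this mapping concrete, here is a minimal sketch in R, assuming the ggplot2 package is installed. The data frame `march` and its values are invented for illustration and are not Minard's actual figures; the column names are hypothetical too. Only the path layer is shown. Note that recent ggplot2 versions prefer the `linewidth` aesthetic over `size` for lines, so `size` may produce a deprecation message.

```r
library(ggplot2)

# Toy data invented for illustration; these are NOT Minard's actual figures.
march <- data.frame(
  longitude = c(24.0, 28.5, 33.0, 37.6, 33.0, 28.5, 24.0),
  latitude  = c(54.9, 54.6, 55.2, 55.8, 54.7, 54.3, 54.5),
  army_size = c(400000, 300000, 150000, 100000, 60000, 30000, 10000),
  direction = c("advance", "advance", "advance", "advance",
                "retreat", "retreat", "retreat")
)

# Map data variables to aesthetic attributes of a geometric object:
#   longitude -> x, latitude -> y, army_size -> size, direction -> color
top_map <- ggplot(march, aes(x = longitude, y = latitude)) +
  geom_path(aes(size = army_size, color = direction))
```

Typing `top_map` at the R console draws the plot: four variables shown with a single geom_etric object.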

Grammar of Graphics

Wilkinson (2005) laid out the proposed "Grammar of Graphics".

Grammar of Graphics

Wickham implemented the grammar in R in the ggplot2 package.

Edward Tufte

Another seminal book is Tufte's "The Visual Display of Quantitative Information".

Types of Graphs

Name this graph type!

Types of Graphs

From the ggplot2movies package, the movies data set:

Types of Graphs

From the nycflights13 package, the flights data set:

Types of Graphs

From the fueleconomy package, the vehicles data set:

Types of Graphs

From the babynames package, the babynames data set:

Types of Graphs

From the okcupiddata package, the profiles data set:

5NG

Say hello to the 5NG: the five named graphs

  1. Scatterplot AKA bivariate plot
  2. Line-graph
  3. Boxplot
  4. Barplot AKA barchart AKA bargraph
  5. Histogram
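
As a rough sketch of how each of the five named graphs might be drawn in ggplot2, the code below pairs each one with a geom_. It assumes ggplot2 is installed and uses the `mpg` and `economics` data sets that ship with it, rather than the data sets from the slides above. The choice of variables is purely illustrative.

```r
library(ggplot2)

# 1. Scatterplot: two numerical variables
p_scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# 2. Line-graph: a numerical variable over time
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

# 3. Boxplot: a numerical variable split by a categorical one
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# 4. Barplot: counts of a categorical variable
p_bar <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# 5. Histogram: distribution of one numerical variable
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 20)
```

In each case the recipe has the same three parts: a data set, an aes() mapping, and a geom_.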