Mon Nov 16, 2016

Today

More on answering questions with data, specifically final exam scores from when I taught intro stats at Reed College.

Concept 1: EDA

  • Before using any statistical tool, always do an exploratory data analysis (EDA) to look at your data.
  • You might not even need statistics.
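
A minimal sketch of what a first look at the data might involve, using base R only. The scores below are made up for illustration and are not the actual exam data; the variable name scores is likewise just a placeholder.

```r
# Made-up scores for illustration only; NOT the real exam data.
scores <- c(55, 62, 68, 71, 74, 78, 81, 85, 90, 96)

# Numerical summary: minimum, quartiles, median, mean, maximum
summary(scores)

# Visual summary: a histogram shows the shape of the distribution
hist(scores, main = "Final exam scores (made-up)", xlab = "Score")
```

Often the summary and the plot already suggest an answer to the question, before any formal statistical tool is applied.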

Concept 2: Wrapper Function

  • Wikipedia: A wrapper function is computer code whose main purpose is to call more complex computer code with little or no additional work.
  • The wrapper exists simply to "hide" more complex code and/or to simplify its use.
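
As a rough illustration of the idea, here is a small wrapper written in base R. The name mean_score is hypothetical (it is not part of mosaic or any other package), and the input values are made up; the wrapper does nothing except call mean() with one argument filled in ahead of time.

```r
# Hypothetical wrapper around base R's mean().
# It "hides" the na.rm argument so the caller doesn't have to remember it.
mean_score <- function(scores) {
  mean(scores, na.rm = TRUE)  # drop missing values before averaging
}

# Made-up values for illustration; NA stands for a missing score.
example_scores <- c(80, 92, NA, 75)

mean(example_scores)        # returns NA because of the missing value
mean_score(example_scores)  # averages the three non-missing scores
```

The wrapper adds no new computation of its own; it only makes the more general function easier to use in one common situation.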

Concept 3: Setting Seed Value

A computer can't generate truly random numbers, only pseudorandom ones:

  • A pseudorandom number generator takes a seed value and returns a sequence of numbers that, for practical purposes, looks random
  • If you don't give a seed, the computer picks one for you; in R, it is based on the current time and the process ID
  • If you give the same seed again, you'll get the same, replicable pseudorandom results.

Run the Following in Console

library(mosaic)
basket <- c("apple", "orange", "mango", "pear")

# Seed is set automatically by computer
shuffle(basket)

# Set seed
set.seed(76)
shuffle(basket)

# Set seed to same value to get replicable pseudorandom results.
set.seed(76)
shuffle(basket)

Setting Seed Value

My favorite seed value is 76.