Competing Hypotheses

Recall the two competing hypotheses:

  1. She truly has the ability to tell which came first: milk or tea.
  2. She is just guessing.

Key: Suppose hypothesis 2 is true, i.e. she is just guessing.

Learning Check

  1. Using the resample() and do() commands, see if you can
    • simulate many, many, many cases of someone guessing at random for the eight cups, then
    • count the number of correct guesses in each case
  2. How would you compare
    • the observed number of correct guesses (in this case 8)
    • to the typical number of correct guesses assuming she is guessing at random?

Solution

1. Simulate someone guessing at random for 8 cups many, many, many times

library(ggplot2)
library(dplyr)
library(mosaic)

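# Single cup: 1 = correct guess, 0 = incorrect guess
guess_cup <- c(1, 0)

# Simulate many, many, many times someone guessing at random:
# each of the 10000 rows holds 8 guesses drawn with replacement
simulation <- do(10000) * resample(guess_cup, size = 8)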

First few rows of simulation:

V1 V2 V3 V4 V5 V6 V7 V8
 0  0  0  1  1  1  0  0
 0  0  0  1  1  1  1  1
 1  0  0  1  1  0  1  0
 1  1  1  1  1  0  1  0
 1  0  0  0  0  0  0  1
 1  0  1  0  0  1  1  1
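
To make it clearer what the do() and resample() call is doing, here is a rough base-R sketch of the same idea. It is not the mosaic implementation; the name sim_base and the seed value are assumptions introduced only for this illustration.

```r
set.seed(76)  # arbitrary seed (an assumption), used only so the sketch is reproducible

guess_cup <- c(1, 0)  # 1 = correct guess, 0 = incorrect guess

# One simulated taster: 8 guesses drawn with replacement from guess_cup.
# replicate() repeats this 10000 times and returns an 8 x 10000 matrix,
# so t() flips it to one row per simulated taster.
sim_base <- t(replicate(10000, sample(guess_cup, size = 8, replace = TRUE)))
sim_base <- as.data.frame(sim_base)  # columns are named V1, ..., V8 by default

dim(sim_base)   # number of rows and columns
head(sim_base)  # first six simulated tasters
```

Sampling with replacement is what makes each of the 8 guesses an independent 50/50 coin flip.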

2. Count the number of correct guesses

# Each 1 is a correct guess, so the row sum is the number correct out of 8
simulation <- simulation %>%
  mutate(n_correct = V1 + V2 + V3 + V4 + V5 + V6 + V7 + V8)

First few rows of simulation, now with n_correct:

V1 V2 V3 V4 V5 V6 V7 V8 n_correct
 0  0  0  1  1  1  0  0         3
 0  0  0  1  1  1  1  1         5
 1  0  0  1  1  0  1  0         4
 1  1  1  1  1  0  1  0         6
 1  0  0  0  0  0  0  1         2
 1  0  1  0  0  1  1  1         5

3. What is the typical number of correct guesses, assuming she is guessing at random?

I.e., what is the distribution of n_correct?

ggplot(simulation, aes(x=n_correct)) + 
  geom_bar() +
  labs(x="Number of Guesses Correct")
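
The bar chart can also be read as a table of proportions. The sketch below re-creates the simulated counts in base R so that it runs on its own, then places the simulated proportions next to the exact probabilities for 8 independent 50/50 guesses. The seed and the names simulated and exact are assumptions made for this illustration.

```r
set.seed(76)  # arbitrary seed (an assumption), for reproducibility only

# 10000 simulated tasters, each making 8 random guesses; keep only the count correct
n_correct <- replicate(10000, sum(sample(c(1, 0), size = 8, replace = TRUE)))

# Simulated proportion of tasters with 0, 1, ..., 8 correct guesses
# (factor() with fixed levels keeps counts that never occur as zeros)
simulated <- as.numeric(table(factor(n_correct, levels = 0:8))) / length(n_correct)

# Exact probabilities when each guess is an independent fair coin flip
exact <- dbinom(0:8, size = 8, prob = 0.5)

round(data.frame(n_correct = 0:8, simulated, exact), 4)
mean(n_correct)  # should land near 4, half of the 8 cups
```

The two columns should agree closely, with most of the mass near 4 correct guesses and very little at 0 or 8.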

4. Compare to observed number correct

She got 8/8 correct, so we add a vertical red line at 8:

ggplot(simulation, aes(x=n_correct)) + 
  geom_bar() +
  labs(x="Number of Guesses Correct") +
  # linewidth replaces size for line thickness in ggplot2 >= 3.4.0
  geom_vline(xintercept = 8, col = "red", linewidth = 2)

Questions

  1. If she were truly guessing at random, how likely is it that she gets 8/8 correct?
  2. What does the observed data say about the hypothesis that she was guessing at random?
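
One possible way to approach question 1 numerically is to compute the share of simulated tasters who get all 8 cups right and compare it with the exact chance of 8 correct 50/50 guesses in a row. This is a base-R sketch that runs on its own; the seed and the names p_simulated and p_exact are assumptions made for this illustration.

```r
set.seed(76)  # arbitrary seed (an assumption), for reproducibility only

# 10000 simulated tasters, each making 8 random guesses; keep only the count correct
n_correct <- replicate(10000, sum(sample(c(1, 0), size = 8, replace = TRUE)))

# Proportion of simulated tasters who got all 8 cups right
p_simulated <- mean(n_correct == 8)

# Exact probability of 8 correct independent 50/50 guesses: (1/2)^8 = 1/256
p_exact <- 0.5^8

p_simulated
p_exact
```

With the simulation data frame built in the solution, the same proportion would be mean(simulation$n_correct == 8).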