Load the familiar data again, removing individuals with no listed height:
library(mosaic)
library(dplyr)
library(ggplot2)
library(okcupiddata)
data(profiles)
profiles <- profiles %>%
  filter(!is.na(height))
Using the resample() function, take a single sample (without replacement) of size \(n=50\) from the OkCupid users' heights profiles$height. Assign this to an object sample_50.
Using the mean(), sd(), and sqrt() functions, compute one confidence interval for \(\mu\).
set.seed(76)
sample_50 <- resample(profiles$height, size=50, replace = FALSE)
xbar <- mean(sample_50)
s <- sd(sample_50)
n <- length(sample_50)
c(xbar, s, n)
## [1] 68.340000 4.688719 50.000000
c(xbar - 2*s/sqrt(n), xbar + 2*s/sqrt(n))
## [1] 67.01383 69.66617
Our 95% CI is \[ \left(\overline{x} - 2 SE, \overline{x} + 2 SE\right) =\left(\overline{x} - 2 \frac{s}{\sqrt{n}}, \overline{x} + 2 \frac{s}{\sqrt{n}}\right) = \left(67.01, 69.67\right) \]
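As a check on the arithmetic, substituting the sample values printed above (\(\overline{x} = 68.34\), \(s = 4.688719\), \(n = 50\)) gives roughly the following. The multiplier 2 is the usual rounding of the normal quantile 1.96.
\[ SE = \frac{s}{\sqrt{n}} = \frac{4.688719}{\sqrt{50}} \approx 0.6631, \qquad \overline{x} \pm 2\, SE = 68.34 \pm 1.3262 \approx \left(67.01, 69.67\right) \]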
Back to theoretical/rhetorical land
Let’s repeat the following procedure 100 times: take a sample of size \(n=50\) and compute a 95% confidence interval from it.
Here are the (random) results:
Our net missed the fish 3 times! On average, it will miss it 5% of the time.
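The repeated-sampling experiment can be sketched in base R as below. This is a minimal sketch under stated assumptions, not the code that produced the results above: heights_pop is a simulated stand-in for profiles$height (with an arbitrary mean and standard deviation) so that the block runs without the okcupiddata package, one_ci() is a hypothetical helper name, and base R's sample() is used in place of resample(). The number of misses depends on the seed and the population, so it will generally not be exactly 3.

```r
# Simulated stand-in for profiles$height (an assumption, so the block runs
# without okcupiddata); swap in profiles$height to use the real data.
set.seed(76)
heights_pop <- rnorm(10000, mean = 68.3, sd = 4)

# The "fish": the true population mean, known here because we hold the
# whole population.
mu <- mean(heights_pop)

# One run of the procedure: sample n heights without replacement and
# return the endpoints of the interval xbar +/- 2 * s / sqrt(n).
# one_ci() is a hypothetical helper, not part of any package.
one_ci <- function(pop, n = 50) {
  samp <- sample(pop, size = n, replace = FALSE)
  xbar <- mean(samp)
  se <- sd(samp) / sqrt(n)
  c(lower = xbar - 2 * se, upper = xbar + 2 * se)
}

# Repeat 100 times; each column of `cis` is one interval (one "net").
cis <- replicate(100, one_ci(heights_pop))

# An interval misses when mu lies below its lower end or above its upper end.
missed <- cis["lower", ] > mu | cis["upper", ] < mu
sum(missed)
```

Over many repetitions the proportion of misses should settle near 5%, although any single batch of 100 intervals can miss somewhat more or less often.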
So the correct interpretation of our original 95% confidence interval (67.01, 69.67)