female_audreys <- filter(babynames, name=="Audrey" & sex=="F")
ggplot(data=female_audreys, aes(x=year, y=prop)) +
  geom_line() +
  geom_smooth(se=FALSE, span=0.1)
Set span=10 in the last code block above. Does this appear to be a good smoother? In my opinion, span=10 is a bit too coarse:
ggplot(data=female_audreys, aes(x=year, y=prop)) +
  geom_line() +
  geom_smooth(se=FALSE, span=10)
But maybe span=0.1 is TOO refined, i.e. not enough smoothing is happening. What about span=1:
ggplot(data=female_audreys, aes(x=year, y=prop)) +
  geom_line() +
  geom_smooth(se=FALSE, span=1)
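To put rough numbers on "too coarse" versus "too refined", here is a minimal sketch using base R's loess(), which, as far as I know, is what geom_smooth() uses by default for a series this short. The data are a synthetic stand-in (one smooth bump plus noise), NOT the babynames data, and the names signal, fit_loess, roughness and mse are made up for this illustration.

```r
# Synthetic stand-in for a yearly series: one smooth bump plus random noise.
# (Assumption: this is illustrative data, not the Audrey proportions.)
set.seed(42)
year   <- 1880:2017
signal <- 4 * exp(-((year - 1930) / 25)^2)
prop   <- signal + rnorm(length(year), sd = 0.3)

# Fit a loess curve for a given span and return its fitted values.
fit_loess <- function(span) {
  fitted(loess(prop ~ year, span = span))
}

spans <- c(0.1, 1, 10)
fits  <- sapply(spans, fit_loess)          # one column of fitted values per span
colnames(fits) <- paste0("span=", spans)

# Roughness: sum of squared second differences of each fitted curve.
# A wiggly curve (little smoothing) scores high.
roughness <- apply(fits, 2, function(f) sum(diff(f, differences = 2)^2))

# Mean squared distance between each fitted curve and the observed points.
# A curve that smooths heavily sits farther from the data.
mse <- colMeans((fits - prop)^2)

print(rbind(roughness, mse))
```

A small span should give a wiggly curve that hugs the points (high roughness, low mse), while a large span gives a very smooth curve that sits farther from them. Neither number picks the "right" span on its own; that is still a judgment call about how much of the wiggle is signal.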
Do you think the earlier smoother or the regression line is a better way to pick out the “signal” (i.e. the trend) from the “noise” in the previous plot? Using this evidence, what do you think is a condition for a regression line to have “valid” interpretability?
Hard to say.