Who | Variables | Threshold | Test % Correct |
---|---|---|---|
Alden | job2 + height + diet3 + income2 | 0.5 | 84.1 |
Amanda | income_level + job_new + body_type_buckets | 0.5 | 70.7 |
Brenda | income + height | 0.5 | 83.2 |
James | income + job + orientation + body_type | 0.5 | 71.4 |
Albert | height | 0.5 | 83.0 |
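As a reminder of how a "Test % Correct" figure at a given threshold can be computed, here is a minimal sketch. It assumes each model produces a predicted probability for every test observation; the values in `p_hat` and `truth` are made up for illustration and are not from anyone's submission.

```r
# Made-up predicted probabilities and true outcomes for six test observations
p_hat <- c(0.91, 0.40, 0.62, 0.15, 0.55, 0.30)
truth <- c(1, 0, 1, 0, 0, 1)

# Classify as 1 when the predicted probability exceeds the threshold
threshold <- 0.5
predicted <- as.numeric(p_hat > threshold)

# Test % correct: share of test observations classified correctly
pct_correct <- 100 * mean(predicted == truth)
pct_correct
```

Here four of the six toy observations are classified correctly. Moving `threshold` away from 0.5 changes which observations are labeled 1 and so can change the percentage.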
Recall from Lec12.R and the quiz
Important Principle of Coding: Don’t Repeat Yourself
library(Quandl)
library(dplyr)
library(lubridate)
library(ggplot2)

# 1. Get gold & bitcoin data
# 2. Make column structure the same
# 3. Add variable type
gold <- Quandl("BUNDESBANK/BBK01_WT5511") %>%
  select(Date, Value) %>%
  mutate(type = "Gold")
bitcoin <- Quandl("BAVERAGE/USD") %>%
  rename(Value = `24h Average`) %>%
  select(Date, Value) %>%
  mutate(type = "Bitcoin")

# Combine them into single data frame using bind_rows()
combined <- bind_rows(gold, bitcoin) %>%
  # Group by here!
  group_by(type) %>%
  # Then do the following ONLY ONCE:
  filter(year(Date) >= 2011) %>%
  arrange(Date) %>%
  mutate(
    # lag() respects the grouping, so each series gets its own "yesterday"
    Value_yest = lag(Value),
    rel_diff = 100 * (Value - Value_yest) / Value_yest
  )

# Plot
ggplot(combined, aes(x = Date, y = rel_diff, col = type)) +
  geom_line() +
  labs(y = "% Change")
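To check the group-then-compute-once pattern without downloading anything, here is a small sketch on toy data. The prices are made up; only the pipeline mirrors the code above.

```r
library(dplyr)

# Toy stand-ins for the two price series (made-up values, not real prices)
gold <- tibble(
  Date  = as.Date("2011-01-01") + 0:2,
  Value = c(100, 110, 99),
  type  = "Gold"
)
bitcoin <- tibble(
  Date  = as.Date("2011-01-01") + 0:2,
  Value = c(10, 20, 30),
  type  = "Bitcoin"
)

# Stack the two series, then compute the daily % change once per group
combined <- bind_rows(gold, bitcoin) %>%
  group_by(type) %>%
  arrange(Date, .by_group = TRUE) %>%
  mutate(
    Value_yest = lag(Value),
    rel_diff = 100 * (Value - Value_yest) / Value_yest
  ) %>%
  ungroup()

combined
```

The first row of each series has `NA` for `rel_diff`, which shows that `lag()` does not carry a value over from one group into the other.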
When parsing the time, what most of you did:
jukebox_hourly <- jukebox %>%
  mutate(
    date_time = parse_date_time(date_time, "%b %d %H%M%S %Y"),
    hour = hour(date_time)
  ) %>%
  group_by(hour) %>%
  summarise(count = n())
What’s wrong with this plot?
ggplot(data = jukebox_hourly, aes(x = hour, y = count)) +
  geom_bar(stat = "identity") +
  xlab("Hour of day") +
  ylab("Number of songs played")
The hours are in UTC, the default time zone for parse_date_time(), not local time. You need to change the time zone using `date_time = with_tz(date_time, tz = "America/Los_Angeles")`:
jukebox_hourly <- jukebox %>%
  mutate(
    date_time = parse_date_time(date_time, "%b %d %H%M%S %Y"),
    date_time = with_tz(date_time, tz = "America/Los_Angeles"),
    hour = hour(date_time)
  ) %>%
  group_by(hour) %>%
  summarise(count = n())

ggplot(data = jukebox_hourly, aes(x = hour, y = count)) +
  geom_bar(stat = "identity") +
  xlab("Hour of day") +
  ylab("Number of songs played")
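To see what the time zone change does to a single value, here is a minimal sketch. The timestamp is made up, and it is parsed with `ymd_hms()` rather than the jukebox format so that the example stands on its own.

```r
library(lubridate)

# Made-up timestamp; ymd_hms() assumes UTC unless told otherwise
utc_time <- ymd_hms("2003-09-15 06:30:00")
hour(utc_time)    # 6

# Same instant, shown on a Pacific-time clock
local_time <- with_tz(utc_time, tz = "America/Los_Angeles")
hour(local_time)  # 23 (the previous evening)

# with_tz() changes only the clock reading, not the instant itself
utc_time == local_time  # TRUE
```

A song played at 11:30 PM Pacific therefore lands in the 6 AM bar unless the time zone is converted before extracting the hour.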
Your top 10 graveyard shift artists are not
artist | n |
---|---|
OutKast | 2533 |
Beatles, The | 2219 |
Led Zeppelin | 1617 |
Radiohead | 1589 |
Rolling Stones, The | 1522 |
Notorious B.I.G. | 1382 |
Red Hot Chili Peppers, The | 1318 |
Eminem | 1311 |
Bob Dylan | 1163 |
Talking Heads | 1113 |
but are instead the ones below. Not that different, actually!
artist | n |
---|---|
OutKast | 1217 |
Beatles, The | 765 |
Notorious B.I.G. | 719 |
Led Zeppelin | 644 |
Eminem | 616 |
2Pac | 568 |
Rolling Stones, The | 529 |
Radiohead | 490 |
Talking Heads | 448 |
Tenacious D | 435 |
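A table like the ones above can be produced along the following lines. This is only a sketch: the `plays` data frame is made up, and the definition of the graveyard shift as midnight up to 8 AM is an assumption for illustration.

```r
library(dplyr)

# Toy play log (made-up rows); hour is assumed to be in local time already
plays <- tibble(
  artist = c("OutKast", "OutKast", "Eminem", "2Pac", "OutKast", "Eminem"),
  hour   = c(1, 3, 2, 14, 4, 23)
)

# Assumed definition of the graveyard shift: midnight up to 8 AM
graveyard_top <- plays %>%
  filter(hour < 8) %>%
  count(artist, sort = TRUE) %>%
  head(10)

graveyard_top
```

Because the filter works on `hour`, an uncorrected time zone selects a different set of plays, which is why the two top 10 lists differ.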