# Load necessary packages
library(ggplot2)
library(dplyr)
library(nycflights13)
# Load the flights data set from nycflights13
data(flights)
Create a new data frame that shows the top 5 airports with the largest arrival delays from NYC in 2013.
flights %>%
  group_by(dest) %>%
  summarize(largest_arrival_delay = max(arr_delay, na.rm = TRUE)) %>%
  top_n(n = 5, wt = largest_arrival_delay) %>%
  arrange(desc(largest_arrival_delay))
## # A tibble: 5 × 2
##   dest  largest_arrival_delay
##   <chr>                 <dbl>
## 1 HNL                    1272
## 2 CMH                    1127
## 3 ORD                    1109
## 4 SFO                    1007
## 5 CVG                     989
1272 minutes is a 21.2-hour delay for a flight to Honolulu! So on top of an already very long flight, you arrive nearly a day late!
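To check the conversion from minutes to hours, and to see which flight was responsible, one could pull out the single worst Honolulu arrival. This is only a sketch: the names worst_hnl and arr_delay_hours are made up for illustration and are not part of the original exercise.

```r
library(dplyr)
library(nycflights13)

# Find the single flight to Honolulu with the largest arrival delay.
# arrange() sorts missing arr_delay values last, so slice(1) picks the maximum.
worst_hnl <- flights %>%
  filter(dest == "HNL") %>%
  arrange(desc(arr_delay)) %>%
  slice(1) %>%
  # Convert the delay from minutes to hours (hypothetical helper column)
  mutate(arr_delay_hours = arr_delay / 60) %>%
  select(month, day, carrier, flight, origin, dest, arr_delay, arr_delay_hours)

worst_hnl
```

Keeping the date, carrier, and origin columns makes it easy to look up what happened on that particular day.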
What happens when you try to left_join the ten_freq_dests data frame with airports instead of airports_small? How might one use this result to answer further questions about the top 10 destinations?
We first define the necessary data frames
airports_small <- airports %>%
  select(faa, name)
ten_freq_dests <- flights %>%
  group_by(dest) %>%
  summarize(num_flights = n()) %>%
  top_n(n = 10, wt = num_flights) %>%
  arrange(desc(num_flights))
We compare the two possible joins:
orig_join <- ten_freq_dests %>%
  left_join(airports_small, by = c("dest" = "faa"))
new_join <- ten_freq_dests %>%
  left_join(airports, by = c("dest" = "faa"))
We then do a View() of both:
View(orig_join)
View(new_join)
The latter provides more information. For example, most of the top 10 destinations have tz = -5. Looking at ?airports, we see that tz is the time zone offset from GMT. 7 of the top 10 destinations are in the Eastern time zone, two more are in Pacific, and the remaining one (ORD) is in Central.
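Rather than counting time zones by eye in View(), one possible follow-up is to tally the joined result by tz. The sketch below rebuilds the top 10 destinations so that it runs on its own; top10_by_tz is a hypothetical name introduced here.

```r
library(dplyr)
library(nycflights13)

# Rebuild the ten most frequent destinations, as defined earlier
ten_freq_dests <- flights %>%
  group_by(dest) %>%
  summarize(num_flights = n()) %>%
  top_n(n = 10, wt = num_flights) %>%
  arrange(desc(num_flights))

# Join with the full airports table, then count destinations per UTC offset
top10_by_tz <- ten_freq_dests %>%
  left_join(airports, by = c("dest" = "faa")) %>%
  count(tz, sort = TRUE)

top10_by_tz
```

The same pattern works for any other column that the full airports table adds, such as lat, lon, or alt.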
What surprises you about the top 10 destinations from NYC in 2013?
ten_freq_dests %>%
  left_join(airports_small, by = c("dest" = "faa"))
## # A tibble: 10 × 3
##    dest  num_flights name
##    <chr>       <int> <chr>
##  1 ORD         17283 Chicago Ohare Intl
##  2 ATL         17215 Hartsfield Jackson Atlanta Intl
##  3 LAX         16174 Los Angeles Intl
##  4 BOS         15508 General Edward Lawrence Logan Intl
##  5 MCO         14082 Orlando Intl
##  6 CLT         14064 Charlotte Douglas Intl
##  7 SFO         13331 San Francisco Intl
##  8 FLL         12055 Fort Lauderdale Hollywood Intl
##  9 MIA         11728 Miami Intl
## 10 DCA          9705 Ronald Reagan Washington Natl
Different people will have different answers, but I’m wondering: are that many people flying to Boston from NYC?
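One way to dig into the Boston question is to break those flights down by NYC origin airport and turn the yearly total into a rough daily rate. This is only a sketch: bos_by_origin and flights_per_day are hypothetical names, and dividing by 365 assumes flights are spread evenly over 2013.

```r
library(dplyr)
library(nycflights13)

# Count 2013 flights to Boston from each NYC airport and
# convert each yearly total to an approximate number per day
bos_by_origin <- flights %>%
  filter(dest == "BOS") %>%
  count(origin, name = "num_flights") %>%
  mutate(flights_per_day = num_flights / 365)

bos_by_origin
```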