# Load necessary packages
library(ggplot2)
library(dplyr)
library(nycflights13)

# Load the flights data set from the nycflights13 package
data(flights)

## LC 5.12

Create a new data frame that shows the top 5 airports with the largest arrival delays from NYC in 2013.

#### Solution

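# For each destination, find the single largest arrival delay (in minutes)
flights %>%
  group_by(dest) %>%
  summarize(largest_arrival_delay = max(arr_delay, na.rm = TRUE)) %>%
  # Rank explicitly by the delay column instead of relying on top_n()'s default
  top_n(n = 5, wt = largest_arrival_delay) %>%
  arrange(desc(largest_arrival_delay))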
## # A tibble: 5 × 2
##    dest largest_arrival_delay
##   <chr>                 <dbl>
## 1   HNL                  1272
## 2   CMH                  1127
## 3   ORD                  1109
## 4   SFO                  1007
## 5   CVG                   989

1272 minutes is a 21.2 hour delay for a flight to Honolulu! So on top of the long, long flight, you arrive nearly a day late!
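To make the minutes-to-hours conversion explicit, here is a minimal sketch that rebuilds the five rows shown above by hand and adds an hours column. The names top_delays, top_delays_hours, and delay_hours are made up for this illustration; they are not part of the original solution.

```r
library(dplyr)

# Values copied by hand from the output above (assumption: these five rows
# are exactly what the pipeline returns)
top_delays <- data.frame(
  dest = c("HNL", "CMH", "ORD", "SFO", "CVG"),
  largest_arrival_delay = c(1272, 1127, 1109, 1007, 989)
)

# Convert the delay from minutes to hours
top_delays_hours <- top_delays %>%
  mutate(delay_hours = largest_arrival_delay / 60)

top_delays_hours
```

The first row gives 1272 / 60 = 21.2 hours, matching the figure quoted above.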

## LC 5.16

What happens when you try to left_join the ten_freq_dests data frame with airports instead of airports_small? How might one use this result to answer further questions about the top 10 destinations?

#### Solution

We first define the necessary data frames:

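# Keep only the airport code and name
airports_small <- airports %>%
  select(faa, name)

# Ten destinations with the most flights out of NYC
ten_freq_dests <- flights %>%
  group_by(dest) %>%
  summarize(num_flights = n()) %>%
  top_n(n = 10, wt = num_flights) %>%
  arrange(desc(num_flights))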

We compare the two possible joins:

orig_join <- ten_freq_dests %>%
  left_join(airports_small, by = c("dest" = "faa"))
new_join <- ten_freq_dests %>%
  left_join(airports, by = c("dest" = "faa"))

We then do a View() of both:

View(orig_join)
View(new_join)

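The latter provides more information, since it keeps every column of airports rather than just the name. For example, most of the top 10 destinations have tz=-5. Looking at ?airports, we see that tz corresponds to the time zone offset from GMT. 7 of the top 10 destinations are in the Eastern time zone, two more are in Pacific, and the remaining one, Chicago (ORD), is in Central.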
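Since View() only works in an interactive session such as RStudio, the same comparison can also be made in code. The following is a rough sketch, not part of the original solution: it rebuilds both joins from scratch, lists the columns that appear only in the join with the full airports table, and counts the top 10 destinations per tz value. The names extra_cols and tz_counts are chosen just for this example.

```r
library(dplyr)
library(nycflights13)

# Rebuild the ten most frequent destinations
ten_freq_dests <- flights %>%
  group_by(dest) %>%
  summarize(num_flights = n()) %>%
  top_n(n = 10, wt = num_flights) %>%
  arrange(desc(num_flights))

# Join once with the two-column table and once with the full airports table
orig_join <- ten_freq_dests %>%
  left_join(select(airports, faa, name), by = c("dest" = "faa"))
new_join <- ten_freq_dests %>%
  left_join(airports, by = c("dest" = "faa"))

# Columns that only the join with the full airports table provides
extra_cols <- setdiff(names(new_join), names(orig_join))
extra_cols

# Number of top 10 destinations per time zone offset
tz_counts <- new_join %>%
  count(tz, sort = TRUE)
tz_counts
```

The extra columns, such as the coordinates and tz, are what make follow-up questions possible, for example how the top destinations are distributed geographically.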

## LC 5.17

What surprises you about the top 10 destinations from NYC in 2013?

#### Solution

ten_freq_dests %>%
  left_join(airports_small, by = c("dest" = "faa"))
## # A tibble: 10 × 3
##     dest num_flights                               name
##    <chr>       <int>                              <chr>
## 1    ORD       17283                 Chicago Ohare Intl
## 2    ATL       17215    Hartsfield Jackson Atlanta Intl
## 3    LAX       16174                   Los Angeles Intl
## 4    BOS       15508 General Edward Lawrence Logan Intl
## 5    MCO       14082                       Orlando Intl
## 6    CLT       14064             Charlotte Douglas Intl
## 7    SFO       13331                 San Francisco Intl
## 8    FLL       12055     Fort Lauderdale Hollywood Intl
## 9    MIA       11728                         Miami Intl
## 10   DCA        9705      Ronald Reagan Washington Natl

Different people will have different answers, but I’m wondering: are that many people flying to Boston from NYC?