class: center, middle, inverse, title-slide

.title[
# S3
]
.subtitle[
## Mini-Lecture 05
]
.author[
### Albert Y. Kim
]
.date[
### SDS 270
2022-09-20
]

---

class: center, inverse, middle

# Slack review

---

## Errors

> Subsetting Lab Q4: I don't understand why `starwars[c(lgl, lgl), ]` gives us an error message.

--

- I suspect this may be a change in the `tibble` package

--

- It "works" if you coerce to a `data.frame`:

```r
library(dplyr)  # provides starwars, %>%, pull(), and re-exports tibble()
lgl <- starwars$eye_color == "blue"
as.data.frame(starwars)[c(lgl, lgl), ] %>%
  pull(eye_color)
```

```
##  [1] "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"
## [11] "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" NA
## [21] NA     NA     NA     NA     NA     NA     NA     NA     NA     NA
## [31] NA     NA     NA     NA     NA     NA     NA     NA
```

--

- [Recycling rules](https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Recycling-rules)

---

## Recycling

> When are recycling rules helpful? For example, when taking `x[y]` and `x` and `y` are different lengths, when would you want it to recycle the shorter of the two to the length of the longer?

--

```r
*students <- tibble(
  # generate 24 random ID numbers that start with 990
  student_id = runif(24) * 1000000 + 990000000,
  # assign them to 8 groups of 3
  group_id = 1:8
)
```

```
## Error:
## ! Tibble columns must have compatible sizes.
## • Size 24: Existing data.
## • Size 8: Column `group_id`.
## ℹ Only values of size one are recycled.
```

--

```r
*students <- data.frame(
  student_id = runif(24) * 1000000 + 990000000,
  group_id = 1:8
)
students
```

```
##    student_id group_id
## 1   990291707        1
## 2   990789076        2
## 3   990568497        3
## 4   990778435        4
## 5   990713233        5
## 6   990669049        6
## 7   990934710        7
## 8   990506460        8
## 9   990745060        1
## 10  990838353        2
## 11  990869075        3
## 12  990193112        4
## 13  990216332        5
## 14  990650423        6
## 15  990335166        7
## 16  990507656        8
## 17  990652839        1
## 18  990965577        2
## 19  990514661        3
## 20  990061657        4
## 21  990151016        5
## 22  990635566        6
## 23  990102960        7
## 24  990772694        8
```

---

class: center, inverse, middle

# S3

---

## Object-Oriented Programming in R vs. Java

- In Java:

```java
// Class object = new Constructor(args);
Rectangle r = new Rectangle(0, 0, 5, 5);
// object.method(arg1, arg2);
r.setSize(10, 15);
```

--

- In R using S3:

```r
# object <- constructor(args)
r <- rectangle(0, 0, 5, 5)
# generic(object, arg1, arg2)
set_size(r, 10, 15)
```

--

```r
# what really happens:
# generic.class(object, arg1, arg2)
set_size.rectangle(r, 10, 15)
```

---

## Defining a method

```r
library(sloop)
sloop::is_s3_generic("summary")
```

```
## [1] TRUE
```

--

```r
args("summary")
```

```
## function (object, ...)
## NULL
```

---

## Testing `summary()` function

From SDS 201/220/291: Fit an `lm` linear model. This is an object of class `lm`.

```r
mod <- lm(mpg ~ disp, data = mtcars)
class(mod)
```

```
## [1] "lm"
```

--

Run the `summary()` generic, which automatically dispatches to `summary.lm()`:

```r
summary(mod)
```

```
##
## Call:
## lm(formula = mpg ~ disp, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.8922 -2.2022 -0.9631  1.6272  7.2305
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
## disp        -0.041215   0.004712  -8.747 9.38e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.251 on 30 degrees of freedom
## Multiple R-squared:  0.7183, Adjusted R-squared:  0.709
## F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10
```

---

## Defining a new `summary()` method

Create a new method that outputs only the regression coefficients:

```r
summary.my_lm <- function(object, ...) {
  message("Look at my amazing regression results!")
  object$coefficients
}
```

--

Let's make a class hierarchy for `mod`: first `my_lm`, then the original class `lm`.

```r
class(mod) <- c("my_lm", "lm")
class(mod)
```

```
## [1] "my_lm" "lm"
```

--

```r
summary(mod)
```

```
## Look at my amazing regression results!
```

```
## (Intercept)        disp
## 29.59985476 -0.04121512
```

---

## `NextMethod()` to inherit classes

Change the method so that it first outputs the message, then calls `NextMethod()` to hand off to the method for the next class in the hierarchy, `lm`:

```r
summary.my_lm <- function(object, ...) {
  message("Look at my amazing regression results!")
  NextMethod()
}
```

--

```r
summary(mod)
```

```
## Look at my amazing regression results!
```

```
##
## Call:
## lm(formula = mpg ~ disp, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.8922 -2.2022 -0.9631  1.6272  7.2305
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
## disp        -0.041215   0.004712  -8.747 9.38e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.251 on 30 degrees of freedom
## Multiple R-squared:  0.7183, Adjusted R-squared:  0.709
## F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10
```

---

## Defining a new generic

```r
rmse <- function(x, ...) {
  UseMethod("rmse")
}
```

--

```r
rmse.lm <- function(x, ...) {
  sqrt(mean(x$residuals^2))
}
```

--

```r
rmse(mod)
```

```
## [1] 3.148207
```

--

- Now define methods for other model classes (e.g., `rmse.glm()`)
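
---

## Sketch: more `rmse()` methods

One possible way to follow up on the last bullet, offered as a sketch rather than the course's official solution. `rmse.glm()` and `rmse.default()` are hypothetical methods that do not appear in these slides, and the logistic model `am ~ wt` is only an illustrative choice. The assumption behind `rmse.glm()` is that we want error on the response scale: for a `glm`, `x$residuals` holds working residuals, so the sketch calls `residuals(x, type = "response")` instead.

```r
# Generic and lm method from the earlier slide, repeated so this block runs on its own
rmse <- function(x, ...) {
  UseMethod("rmse")
}

rmse.lm <- function(x, ...) {
  sqrt(mean(x$residuals^2))
}

# Hypothetical glm method (not from the slides): use response-scale residuals,
# because x$residuals on a glm stores working residuals
rmse.glm <- function(x, ...) {
  sqrt(mean(residuals(x, type = "response")^2))
}

# Hypothetical fallback (not from the slides): fail with a clear message
# when no class-specific method exists
rmse.default <- function(x, ...) {
  stop("No rmse() method for class ", class(x)[1], call. = FALSE)
}

lin   <- lm(mpg ~ disp, data = mtcars)
logit <- glm(am ~ wt, data = mtcars, family = binomial)

rmse(lin)    # dispatches to rmse.lm()
rmse(logit)  # class(logit) is c("glm", "lm"), so rmse.glm() is found first
```

Because `glm` objects also inherit from `lm`, leaving out `rmse.glm()` would not produce an error: dispatch would silently fall through to `rmse.lm()` and use the working residuals. That is one reason to define the more specific method explicitly.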