#questions on Slack.
regression_linear_algebra.Rmd file I will distribute on Slack. Here you will study the effects of collinearity in regression (which you learned in SDS 291 Multiple Regression), but from a linear algebra perspective.
Here is my plan for Midterm II. However, I am open to discussion and dialogue. If any of this does not work for you, your learning environment, or your living situation, please let me know on Slack ASAP.
PS7_solutions.Rmd file I will distribute over Slack in your PS7 RStudio Project folder and knit it.
PCA.Rmd and sat.csv from Slack and put them in your MassMutual RStudio Project and go over PCA.Rmd
#questions channel on Slack at the following times:
random_forests.Rmd file I’ve distributed over Slack into your MassMutual RStudio Project.
mtry
does open in your browser.
random_forests.Rmd
Midterm I has been graded and distributed.
LASSO.R. Load all the packages, then go to around line 59, where you fit a LASSO model with glmnet(..., alpha = 1, ...).
?glmnet -> Arguments -> alpha.
A “Previously on SDS293” episode recap in images:
A “Next time on SDS293” episode sneak peek in images:
README.md ASAP. This is an excellent opportunity to practice remote collaboration using a combination of Slack, GitHub, and Zoom. These are professional skills that I have found invaluable for my career.
CART.Rmd and LASSO.Rmd and look at the optimization formulas for both cases.
#questions Slack channel indicating what your question refers to. For example: “Video for Lec22.a) 10:52 Shouldn’t that be y-hat = y-bar?”
#general. Please do not be shy with your criticisms; for remote lectures to work, production values matter.
LASSO.R from Lec22, go over the remainder of the code: section 5, or roughly lines 183-295.
#questions Slack channel indicating what your question refers to. For example: “Lec22 LASSO.R around line 142 where it says get_LASSO_coefficients() what is that?”
regression.Rmd
CART.Rmd
LASSO.Rmd
#questions Slack channel indicating what your question refers to. For example: “Video for Lec22.a) 10:52 Shouldn’t that be y-hat = y-bar?” (the answer is yes)
#general. Please do not be shy with your criticisms; for remote lectures to work, production values matter. For example, I already know I need to work on
LASSO.R on GitHub
LASSO.R in your MassMutual RStudio Project
#questions Slack channel indicating what your question refers to. For example: “Lec22 LASSO.R around line 142 where it says get_LASSO_coefficients() what is that?”
#general message
ROC.Rmd -> ## Background -> first code chunk where values csv is loaded. Replace its contents with what I just shared on Slack.
LASSO.Rmd
ROC.Rmd Shiny App.
logistic_regression.Rmd
ROC.Rmd Shiny app
cp value using cross-validation (we will be verifying that your PS4 submissions on GitHub are all before 9am today).
logistic_regression.Rmd
ROC.Rmd Shiny app
coding.Rmd -> # Wednesday, July 24 2019 -> ## Cross-validation
Recap of Lec10:
Today:
cp values: cp=0 (relatively more complex tree) & cp=0.2 (relatively less complex tree)
CART.Rmd -> best slider value of \(\alpha\) = cp complexity parameter?
OverallQual as a predictor, ended up with a negative prediction \(\widehat{y}\) = \(\hat{\text{SalePrice}}\)
test data using a model that is HELLA overfit to the training data using cp = 0.
#general
#questions channel. Ask all non-private questions here.
train and where do I use test?
CART.Rmd -> Explain \(\widehat{p}_{mk}\), in particular how it plays into “Gini Index”.
coding.Rmd -> # Wednesday, July 24 2019 -> Demonstration of overfitting
model_CART_3 corresponds to a HELLA overfit model, i.e. it doesn’t generalize.
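The contrast between cp = 0 and cp = 0.2, and the question of where train and test each get used, can be sketched on a small example. This is a minimal illustration and not the code from coding.Rmd: it assumes the rpart package is installed, uses the built-in iris data with Sepal.Length as a stand-in numerical outcome, and the names model_complex, model_simple, count_leaves, and rmse are made up here. Both trees are fit on train only; test is used only to check how well each tree generalizes.

```r
library(rpart)

set.seed(76)  # arbitrary seed so the random split is reproducible

# Hypothetical setup: split iris in half into training and test sets
train_rows <- sample(nrow(iris), size = nrow(iris) / 2)
train <- iris[train_rows, ]
test <- iris[-train_rows, ]

# Fit both trees on the training data only
model_complex <- rpart(Sepal.Length ~ ., data = train,
                       control = rpart.control(cp = 0))
model_simple <- rpart(Sepal.Length ~ ., data = train,
                      control = rpart.control(cp = 0.2))

# Count terminal nodes (leaves) as a rough measure of tree complexity
count_leaves <- function(model) sum(model$frame$var == "<leaf>")

# Root mean squared error of a model's predictions on a given data set
rmse <- function(model, data) {
  sqrt(mean((data$Sepal.Length - predict(model, newdata = data))^2))
}

results <- data.frame(
  cp = c(0, 0.2),
  leaves = c(count_leaves(model_complex), count_leaves(model_simple)),
  rmse_train = c(rmse(model_complex, train), rmse(model_simple, train)),
  rmse_test = c(rmse(model_complex, test), rmse(model_simple, test))
)
print(results)
```

The cp = 0 tree will never do worse than the cp = 0.2 tree on train, because it is the larger tree. Whether that advantage survives on test is the thing to look at: a training error far below the test error is the signature of an overfit model.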
Today’s chalk talk on CART is based on the Tuesday PM topics in the MassMutual Google Doc.
Open the MassMutual RStudio Project -> CART.Rmd Shiny app. (Note: after you install the necessary packages, this should knit.)
Today’s chalk talk on CART is based on the Tuesday PM topics in the MassMutual Google Doc.
Open the MassMutual RStudio Project:
First: Based on coding.Rmd -> Tuesday -> iris dataset. (Note: You might have to do a little debugging to get this to knit, like setting eval = TRUE for all code blocks.)
mutate()
Exercise from Lec05 -> MassMutual RStudio Project -> Tuesday -> Exercise: Submit Kaggle predictions using linear regression model.
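A rough sketch of what this exercise involves, under loudly stated assumptions: the tiny train and test data frames below are invented toy stand-ins rather than the real Kaggle files, the Id column and the submission.csv file name are assumed for illustration, and OverallQual and SalePrice are borrowed from the variable names mentioned elsewhere on this page. The flow is: fit on train, predict on test, write out one prediction per test row.

```r
# Hypothetical toy stand-ins for the Kaggle training and test sets
train <- data.frame(
  Id = 1:6,
  OverallQual = c(3, 4, 5, 6, 7, 8),
  SalePrice = c(90000, 110000, 140000, 160000, 200000, 250000)
)
test <- data.frame(Id = 7:9, OverallQual = c(1, 5, 9))

# Fit a simple linear regression on the training data only
model_lm <- lm(SalePrice ~ OverallQual, data = train)

# Predict the outcome for the test rows, where SalePrice is unknown
test$SalePrice <- predict(model_lm, newdata = test)

# Keep only the identifier and the prediction, then write a CSV
# (file name and column layout are assumptions, check the competition page)
submission <- test[, c("Id", "SalePrice")]
write.csv(submission, "submission.csv", row.names = FALSE)
```

Because a fitted line extends in both directions without bound, it is worth inspecting the predictions before submitting; nothing in lm() stops a predicted price from being implausibly low or even negative.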
In-class exercise. The screencast is posted here.
README.md, but write something different.
The iris dataset has historically been one of the most widely used datasets in statistics, introduced by Ronald A. Fisher in 1936 using measurements collected by Edgar Anderson. Type ?iris in the console and look at “Source.” While Fisher has done a lot to advance the field, some of his views were IMO problematic.
What are classification and regression trees? Here is one example from the New York Times. Note: Smith students can get free access to the New York Times and Wall Street Journal via Smith Libraries.
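One way a classification tree decides where to split is by measuring how mixed the classes are in each node. A minimal sketch, assuming the standard Gini index \(G_m = \sum_k \widehat{p}_{mk}(1 - \widehat{p}_{mk})\), where \(\widehat{p}_{mk}\) is the proportion of observations in node \(m\) that belong to class \(k\). The function name gini_index and the candidate split at Petal.Length = 2.45 are chosen here purely for illustration on the built-in iris data.

```r
# Gini index of a single node: sum over classes k of p_mk * (1 - p_mk)
gini_index <- function(classes) {
  p_mk <- table(classes) / length(classes)  # class proportions in this node
  sum(p_mk * (1 - p_mk))
}

# Root node: all 150 flowers, three equally sized species
gini_root <- gini_index(iris$Species)

# Hypothetical candidate split: Petal.Length < 2.45 versus >= 2.45
left <- iris$Species[iris$Petal.Length < 2.45]
right <- iris$Species[iris$Petal.Length >= 2.45]
gini_left <- gini_index(left)
gini_right <- gini_index(right)

# Impurity after the split, weighting each child node by its size
gini_split <- (length(left) * gini_left + length(right) * gini_right) /
  nrow(iris)

print(c(root = gini_root, left = gini_left,
        right = gini_right, split = gini_split))
```

The root has a Gini index of 2/3 because the three species are equally common. The left node contains only setosa, so its index is 0 (perfectly pure); the right node is half versicolor and half virginica, giving 0.5. The size-weighted impurity after the split is 1/3, half that of the root, which is why a tree would find this split attractive.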
#general
When building a product, in my opinion (IMO):
Once you’re done with your MVP, iterate and improve by slowly adding complexity that works:
In other words:
MassMutual RStudio Project -> coding.Rmd -> # Tuesday, July 23 2019 -> ## Gentle Introduction to Kaggle Competitions. Discussion on:
GitHub has many definitions that are unfortunately not straightforward. Using the fivethirtyeight R package as an example.
README.md files as cover pages
master branch is what you see. Ex: Click “Branch” button on top left
master