MATH 116: Fall 2016

  • Instructor: Albert Y. Kim - Assistant Professor of Statistics
  • Email: aykim@middlebury.edu
    • I will respond to emails within 24h, but not during weekends.
    • Please only email me with administrative and briefer questions as I prefer addressing more substantive questions in person.
  • Class Location/Time:
    • MWF 11:15–12:05 in Warner 506 and R 11:00-12:15 in Sunderland 202.
    • You do not need to inform me of absences. Please consult your peers for what you missed.
  • Office Hours: Warner 310 or the math lounge just outside. Feel free to come to the MATH 216 office hours as well, but those students get priority then.
    • M 2:30-4:00 and W 1:00-2:30
    • MATH 216: M 1:00-2:30 and W 2:30-4:00
    • Or by appointment

Course Description and Objectives

Description

A practical introduction to statistical methods and computational tools needed to make sense of data. This course is an evolution of many traditional introductory statistics courses in that computing plays a more central role than mathematics and a higher emphasis is placed on “thinking with data.” Topics include data visualization, data wrangling, confidence intervals, hypothesis testing, and regression. The course has no formal mathematics or computer science prerequisites, and is especially suited to students in the physical, social, environmental, and life sciences who seek an applied orientation to data analysis.

Objectives

  1. Have students engage in the data/science research pipeline in as faithful a manner as possible while maintaining a level suitable for novices.
  2. Foster a conceptual understanding of statistical topics and methods using simulation/resampling and real data whenever possible, rather than mathematical formulae.
  3. Blur the traditional lecture/lab dichotomy of introductory statistics courses by incorporating more computational and algorithmic thinking into the syllabus.
  4. Introduce best practices for reproducible research and collaboration.
  5. Develop statistical literacy by, among other ways, tying in the curriculum to current events, demonstrating the importance statistics plays in society.

Topics

Roughly speaking we will cover the following topics (a more detailed outline can be found here):

  1. Introduction and Tools (R, RStudio, and R Markdown)
  2. Data:
    • Data representation
    • Data visualization
    • Data manipulation/wrangling/munging
  3. Statistical inference:
    • Background and terminology
    • Confidence intervals
    • Hypothesis testing
  4. Regression:
    • Simple linear regression
    • Multiple regression

Materials

1) Textbooks

  • ModernDive: An Introduction to Statistical and Data Sciences by Kim and Ismay available online.
  • Intro Stat with Randomization and Simulation 1st edition by Diez, Barr, and Cetinkaya-Rundel available
    • Online at the OpenIntro webpage in both standard and tablet friendly PDF formats.
    • In print at the Middlebury bookstore and Amazon.

2) Computing and Software

  • Instead of using the desktop version of the RStudio interface to R, we will be using the cloud-based RStudio Server, which you can access in your browser via go/rstudio. Note if you are off-campus you must first log into the Middlebury VPN.
  • Just like your middfiles file server, you can access the files on the RStudio file server by logging onto cifs://rstudio.middlebury.edu

Evaluation

There are four components to your final grade: problem sets, engagement, midterms, and the final project.

1) Weekly Problem Sets 10%

The problem sets in this class should be viewed as low-stakes opportunities to develop one’s statistics and data science muscles and receive feedback on the progress of one’s learning, instead of viewing them as evaluative tools used by the instructor to assign grades. To reinforce this thinking, each problem set is worth only a nominal portion of the final grade.

While I encourage you to discuss problem sets with your peers, you must submit your own answers and not simple rewordings of another’s work. Furthermore, all collaborations must be explicitly acknowledged at the top of your submissions.

  • Assigned/due on Fridays.
  • Lowest two scores dropped.
  • No email submissions will be accepted; ask a classmate to print it for you.
  • No extensions for any problem sets will be granted.

2) Engagement 10%

It is difficult to explicit codify what constitutes “an engaged student,” so instead I present the following rough principle I will follow: you’ll only get out of this class as much as you put in. Some examples of behavior counter to this principle:

  • Not participating in in-class exercises.
  • Engaging so little, either in class or office hours, that I don’t know what your voice sounds like.
  • Submitting problem set that has code or content that is copied from (or only slightly modified versions of) your peers’ work, going against the philosophy of the problem sets being opportunities for practice and feedback, rather than as items to be graded on.

3) Three Midterms 45%

  • Midterm dates: Wed 10/5 (in-class), Wed 10/26 7:30pm, and Wed 11/16 7:30pm.
  • All midterms are cumulative and may require a scientific calculator, so please have access to one (no smartphones).
  • There is no extra-credit work to improve midterm scores after the fact.
  • There will be no make-up nor rescheduled midterms, except in the following cases if documentation is provided:
    • serious illness or death in the family.
    • athletic commitments or religious obligations if and prior notice is given. In such cases, rescheduled exams must be taken before the rest of the class.

4) Final Project 35%

Rather than a final exam, there will be a final capstone group project. This is an opportunity for you to flex your statistics and data science muscles developed during the problem sets and perform your own start-to-finish data analysis project. The project will involving you addressing a scientific question by choosing a data set, performing an analysis using the concepts and tools we have covered in this course, and writing a report.

  • Due Sun 12/18 at noon.
  • Groups of no more than three will be assigned by me.
  • A system will be put in place to hold your group peers accountable for their work.
  • Complete project details and groups will be addressed on Wed 10/26.

Academic Accommodations for Disabilities

Students with documented disabilities who believe that they may need accommodations in this class are encouraged to contact me as early in the semester as possible to ensure that such accommodations are implemented in a timely fashion. Assistance is available to eligible students through Student Accessibility Services. Please contact Jodi Litchfield, the ADA coordinator, at litchfie@middlebury.edu or 802.443.5936 for more information. All discussions will remain confidential.