EC 524, Winter 2022

Welcome to Economics 524 (424): Prediction and machine-learning in econometrics, taught by Ed Rubin and Andrew Dickinson.

Schedule

Lecture Monday and Wednesday, 10:00a–11:20a (Pacific), 220 Chapman

Lab Friday, 12:00p–12:50p (Pacific), 220 Chapman

Office hours

  • Ed Rubin Tuesdays, 3:30p–5:00p, Zoom
  • Andrew Dickinson Thursdays, 2:00p–3:00p, Zoom

Syllabus

Syllabus

Books

Required books

Suggested books

Lecture notes

000 - Overview (Why predict?)

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

Readings Introduction in ISL

001 - Statistical learning foundations

  1. Why do we have a class on prediction?
  2. How is prediction (and how are its tools) different from causal inference?
  3. Motivating examples

Formats .html | .pdf | .Rmd

Readings

Supplements Unsupervised character recognition

002 - Model accuracy

  1. Model accuracy
  2. Loss for regression and classification
  3. The variance-bias tradeoff
  4. The Bayes classifier
  5. KNN

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch2–Ch3
  • Optional: 100ML Preface and Ch1–Ch4

003 - Resampling methods

  1. Review
  2. The validation-set approach
  3. Leave-one-out cross validation
  4. k-fold cross validation
  5. The bootstrap

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch5
  • Optional: 100ML Ch5

004 - Linear regression strikes back

  1. Returning to linear regression
  2. Model performance and overfit
  3. Model selection—best subset and stepwise
  4. Selection criteria

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch3
  • ISL Ch6.1

In between: tidymodels-ing

005 - Shrinkage methods

(AKA: Penalized or regularized regression)

  1. Ridge regression
  2. Lasso
  3. Elastic net

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch4
  • ISL Ch6

006 - Classification intro

  1. Introduction to classification
  2. Why not regression?
  3. But also: Logistic regression
  4. Assessment: Confusion matrix, assessment criteria, ROC, and AUC

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch4

007 - Decision trees

  1. Introduction to trees
  2. Regression trees
  3. Classification trees—including the Gini index, entropy, and error rate

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch8.1–Ch8.2

008 - Ensemble methods

  1. Introduction
  2. Bagging
  3. Random forests
  4. Boosting

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch8.2

009 - Support vector machines

  1. Hyperplanes and classification
  2. The maximal margin hyperplane/classifier
  3. The support vector classifier
  4. Support vector machines

Formats .html | .pdf | .Rmd

Readings

  • ISL Ch9

010 - Dimensionality reduction and unsupervised learning

  1. MNIST dataset (machines with vision)
  2. K-means clustering
  3. Principal component analysis (PCA)
  4. UMAP

Formats .html | .pdf | .Rmd

Projects

Planned projects

000 Predicting sales price in housing data (Kaggle)

Help:

001 Validation and out-of-sample performance

002 Penalized regression, logistic regression, and classification

003 Nonlinear predictors

004 Image/multi-class classification (MNIST)

Class project

Outline of the project

Topic and group due by February 16th.

Final project submission due by midnight on March 9th.

Lab notes

Approximate/planned topics...

000 - Workflow and cleaning

  1. General "best practices" for coding
  2. Working with RStudio
  3. The pipe (%>%)
  4. Cleaning and Kaggle follow up

Formats .html | .pdf | .Rmd

001 - Workflow and cleaning (continued)

  1. Finish previous lab on dplyr
  2. Working with projects
  3. Using dplyr and ggplot2 to make insightful visuals
  4. How to fix a coding error

Housing data download

Formats .html | .Rmd

002 - Validation

  1. Creating a training and validation data set from your observations dataframe in R
  2. Writing a function to iterate over multiple models to test and compare MSEs

Formats .html | .Rmd
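
The two steps in this lab can be sketched roughly as follows. This is a minimal illustration in base R, not the lab's actual code: the built-in `mtcars` data stands in for "your observations dataframe," and the 80/20 split, the helper name `validation_mse`, and the three candidate formulas are all assumptions made for the example.

```r
# Hypothetical example: mtcars stands in for your own observations dataframe
set.seed(524)

# Step 1: randomly assign 80% of rows to training; the rest form the validation set
n <- nrow(mtcars)
train_rows <- sample(n, size = floor(0.8 * n))
train_df <- mtcars[train_rows, ]
valid_df <- mtcars[-train_rows, ]

# Step 2: a function that fits one model on the training set
# and returns its mean squared error on the validation set
validation_mse <- function(model_formula, train, valid) {
  fit <- lm(model_formula, data = train)
  outcome <- all.vars(model_formula)[1]  # left-hand-side variable name
  mean((valid[[outcome]] - predict(fit, newdata = valid))^2)
}

# Candidate models (illustrative formulas, not taken from the course)
candidates <- list(
  simple   = mpg ~ wt,
  medium   = mpg ~ wt + hp,
  flexible = mpg ~ wt + hp + I(wt^2) + I(hp^2)
)

# Iterate over the candidates and collect each validation MSE
mse_results <- sapply(candidates, validation_mse, train = train_df, valid = valid_df)
mse_results
```

The model with the lowest validation MSE would be the preferred candidate under this approach, though with a validation set this small the ranking can change from one random split to the next.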

003 - Practice using tidymodels

  1. Cleaning data quickly and efficiently with tidymodels

Formats .html

004 - Practice using tidymodels (continued)

  1. An introduction to preprocessing with tidymodels (refresher from last week)
  2. An introduction to modeling with tidymodels
  3. An introduction to resampling, model tuning, and workflows with tidymodels (will finish up next week)

005 - Summarizing tidymodels

  1. Summarizing tidymodels
  2. Combining pre-split data together and then defining a custom split

006 - Penalized regression in tidymodels + functions + loops

  1. Running a ridge, lasso, or elastic net logistic regression in tidymodels
  2. A short lesson in writing functions and loops in R

007 - Finalizing a workflow in tidymodels: Example using a random forest

  1. Finalizing a workflow in tidymodels: Example using a random forest
  2. A short lesson in writing functions and loops in R (continued)

Additional resources

R

Data Science

Spatial data
