Multiple Linear Regression in Statsmodels - Lab

Introduction

In this lab, you'll practice fitting a multiple linear regression model on the Ames Housing dataset!

Objectives

You will be able to:

  • Determine if it is necessary to perform normalization/standardization for a specific model or set of data
  • Use standardization/normalization on features of a dataset
  • Identify if it is necessary to perform log transformations on a set of features
  • Perform log transformations on different features of a dataset
  • Use statsmodels to fit a multiple linear regression model
  • Evaluate a linear regression model by using statistical performance metrics pertaining to the overall model and to specific parameters

The Ames Housing Data

Using the specified continuous and categorical features, preprocess your data to prepare for modeling:

  • Split off and one hot encode the categorical features of interest
  • Log and scale the selected continuous features

import pandas as pd
import numpy as np

ames = pd.read_csv('ames.csv')

continuous = ['LotArea', '1stFlrSF', 'GrLivArea', 'SalePrice']
categoricals = ['BldgType', 'KitchenQual', 'SaleType', 'MSZoning', 'Street', 'Neighborhood']

Continuous Features

# Log transform and normalize
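
One possible approach to this step is sketched below, assuming that "normalize" here means standardizing each log-transformed column to mean 0 and standard deviation 1. The toy rows, the `_log` column suffix, the `normalize` helper, and the `ames_cont` name are illustrative assumptions and not part of the lab; in the lab itself the input would be `ames[continuous]`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows standing in for ames[continuous]
toy = pd.DataFrame({
    'LotArea': [8450, 9600, 11250, 9550, 14260],
    '1stFlrSF': [856, 1262, 920, 961, 1145],
    'GrLivArea': [1710, 1262, 1786, 1717, 2198],
    'SalePrice': [208500, 181500, 223500, 140000, 250000],
})

# Log transform every column and rename so the transformation stays visible
log_df = np.log(toy)
log_df.columns = [f'{col}_log' for col in toy.columns]

def normalize(feature):
    """Standardize a column: subtract its mean, divide by its standard deviation."""
    return (feature - feature.mean()) / feature.std()

# Apply the standardization column by column
ames_cont = log_df.apply(normalize)
```

Keeping the means and standard deviations of the log-transformed columns is worthwhile, because the same values are needed later to transform a new observation and to convert a prediction back to dollars.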

Categorical Features

# One hot encode categoricals
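
A minimal sketch of one-hot encoding with `pd.get_dummies` follows. It uses hypothetical toy rows for two of the categorical columns; in the lab the input would be `ames[categoricals]`. Dropping the first level of each feature (`drop_first=True`) is an assumption made here to avoid perfectly collinear dummy columns once an intercept is added, and the `ames_ohe` name is illustrative.

```python
import pandas as pd

# Hypothetical toy rows standing in for ames[categoricals]
toy = pd.DataFrame({
    'BldgType': ['1Fam', 'TwnhsE', '1Fam', 'Duplex'],
    'Street': ['Pave', 'Pave', 'Grvl', 'Pave'],
})

# One dummy column per category level, minus the first level of each feature
ames_ohe = pd.get_dummies(toy, prefix=list(toy.columns), drop_first=True, dtype=int)
print(ames_ohe.columns.tolist())
```

The dropped level of each feature becomes the baseline, so every remaining dummy coefficient is interpreted relative to it.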

Combine Categorical and Continuous Features

# combine features into a single dataframe called preprocessed
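
Combining the two pieces can be done with `pd.concat` along the column axis. The sketch below uses two tiny hypothetical frames named `ames_cont` and `ames_ohe` (assumed names) to show the call; only `preprocessed` is a name required by the lab.

```python
import pandas as pd

# Hypothetical stand-ins for the transformed continuous and encoded categorical features
ames_cont = pd.DataFrame({'LotArea_log': [-0.5, 0.3, 1.1]})
ames_ohe = pd.DataFrame({'Street_Pave': [1, 1, 0]})

# axis=1 places the frames side by side, aligning rows on the index
preprocessed = pd.concat([ames_cont, ames_ohe], axis=1)
```

Because `pd.concat` aligns on the index, both frames should share the same index; otherwise the result contains rows with missing values.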

Run a linear model with SalePrice as the target variable in statsmodels

# Your code here

Run the same model in scikit-learn

# Your code here - Check that the coefficients and intercept are the same as those from Statsmodels

Predict the house price given the following characteristics (raw values, before any transformation)

Make sure to transform your variables as needed!

  • LotArea: 14977
  • 1stFlrSF: 1976
  • GrLivArea: 1976
  • BldgType: 1Fam
  • KitchenQual: Gd
  • SaleType: New
  • MSZoning: RL
  • Street: Pave
  • Neighborhood: NridgHt
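
The transformation of a new observation can be sketched as follows, assuming the continuous features and the target were log-transformed and then standardized during preprocessing. The toy training rows and the helper names `to_model_scale` and `to_dollars` are hypothetical; the key idea is that the new values are scaled with the training means and standard deviations, and the prediction is converted back by reversing both steps.

```python
import numpy as np
import pandas as pd

# Hypothetical toy training rows; the lab would use the full ames DataFrame
train = pd.DataFrame({
    'LotArea': [8450, 9600, 11250, 9550, 14260],
    'SalePrice': [208500, 181500, 223500, 140000, 250000],
})
log_train = np.log(train)
means, stds = log_train.mean(), log_train.std()

def to_model_scale(value, col):
    """Log transform a raw value, then standardize it with the training statistics."""
    return (np.log(value) - means[col]) / stds[col]

def to_dollars(scaled_price):
    """Undo the standardization, then undo the log, to recover a price."""
    return np.exp(scaled_price * stds['SalePrice'] + means['SalePrice'])

# Scale the raw LotArea from the list above
lot_area_scaled = to_model_scale(14977, 'LotArea')

# A fitted model would map scaled features to a scaled prediction; here we only
# check that the two helpers are inverses of each other
round_trip = to_dollars(to_model_scale(200000, 'SalePrice'))
```

For the categorical characteristics, the new row gets a 1 in each dummy column that matches its category and a 0 everywhere else; a category that served as the dropped baseline is represented by zeros in all of that feature's columns.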

Summary

Congratulations! You preprocessed the Ames Housing data using log transformation, standardization, and one-hot encoding. You also fit your first multiple linear regression model on the Ames Housing data using both statsmodels and scikit-learn!
