fannie-wong / predict_real_estate_prices

This project forked from mbkraus/predicting_real_estate_prices_using_scikit-learn


Predicting Amsterdam house / real estate prices (in Python) using Linear Regression, KNN-, Lasso-, Ridge-, Polynomial-, Support Vector (SVR)-, Decision Tree-, Random Forest-, and Neural Network MLP Regression.

Python 100.00%

predict_real_estate_prices's Introduction

Predicting Amsterdam house / real estate prices using Linear Regression, KNN-, Lasso-, Ridge-, Polynomial-, Support Vector (SVR)-, Decision Tree-, Random Forest-, and Neural Network MLP Regression.

Approach:

  • load a pandas DataFrame containing (Dec-17) housing data retrieved by means of the following scraper, supplemented with latitude and longitude coordinates mapped from zip code (via GeoPy)
  • do some simple data exploration / visualisation
  • remove non-numeric data, NaNs and outliers, then normalise the data
  • define the explanatory (independent) variables (surface, rooms, latitude, longitude) and the dependent variable (price in EUR)
  • split the data into training and test sets for later use
  • find the optimal model parameters using scikit-learn's GridSearchCV
  • fit the model using GridSearchCV's optimal parameters
  • evaluate estimator performance by means of 10-fold 'shuffled' cross-validation
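The last four steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the data are synthetic stand-ins for the housing features, and the parameter grid, split ratio, and random seeds are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import (
    GridSearchCV, KFold, cross_val_score, train_test_split,
)

# Synthetic stand-in for (surface, rooms, latitude, longitude) -> price.
# These numbers are invented for illustration, NOT the scraped data.
rng = np.random.default_rng(0)
n = 300
surface = rng.uniform(30, 200, n)
rooms = rng.integers(1, 7, n).astype(float)
latitude = rng.uniform(52.30, 52.42, n)
longitude = rng.uniform(4.80, 5.00, n)
X = np.column_stack([surface, rooms, latitude, longitude])
y = 4000 * surface + 10000 * rooms + rng.normal(0, 20000, n)

# Hold out a test set for later use (80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Search a small, hypothetical grid on the training set only.
param_grid = {"n_estimators": [10, 20], "max_features": [2, 4]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=5, scoring="r2"
)
search.fit(X_train, y_train)

# Fit a fresh model with the best parameters found by the search.
best_model = RandomForestRegressor(random_state=0, **search.best_params_)
best_model.fit(X_train, y_train)
test_r2 = best_model.score(X_test, y_test)

# 10-fold cross-validation with shuffling, scored by R-squared.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(best_model, X, y, cv=cv, scoring="r2")
print(search.best_params_, round(test_r2, 3), round(scores.mean(), 3))
```

With the default `refit=True`, `search.best_estimator_` is already refitted on the training data; the explicit refit is shown only to mirror the separate "fit the model" step in the list.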

Packages required

Results on (Dec-17) Amsterdam house / real estate price data retrieved by means of the following scraper

Sample data input (Pandas DataFrame)

   surface  rooms_new  zipcode_new  price_new   latitude  longitude
0    138.0        4.0         1060     420000  40.804672 -73.963420
1    130.0        5.0         1087     550000  52.355590   5.000561
2    116.0        5.0         1061     425000  52.373044   4.837568
3     92.0        5.0         1035     349511  52.416895   4.906767
4    127.0        4.0         1013    1050000  52.396789   4.876607
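As a rough sketch of the cleaning and variable-definition steps applied to the five sample rows above: the first row's coordinates lie far outside the Netherlands, which looks like a geocoding mismatch, so a simple bounding-box filter is used here to drop it. The bounding-box limits and the z-score normalisation are assumptions for illustration; the repository may handle outliers and scaling differently.

```python
import pandas as pd

# The five sample rows shown above.
df = pd.DataFrame({
    "surface": [138.0, 130.0, 116.0, 92.0, 127.0],
    "rooms_new": [4.0, 5.0, 5.0, 5.0, 4.0],
    "zipcode_new": [1060, 1087, 1061, 1035, 1013],
    "price_new": [420000, 550000, 425000, 349511, 1050000],
    "latitude": [40.804672, 52.355590, 52.373044, 52.416895, 52.396789],
    "longitude": [-73.963420, 5.000561, 4.837568, 4.906767, 4.876607],
})

# Keep numeric columns only and drop rows with missing values.
df = df.select_dtypes(include="number").dropna()

# Hypothetical bounding box around the Amsterdam area (assumed limits).
in_box = df["latitude"].between(52.2, 52.5) & df["longitude"].between(4.7, 5.1)
clean = df[in_box]

# Explanatory variables and the dependent variable (price in EUR).
features = ["surface", "rooms_new", "latitude", "longitude"]
X = clean[features]
y = clean["price_new"]

# Z-score normalisation of the features (one possible choice of scaling).
X_norm = (X - X.mean()) / X.std()
print(X_norm.round(2))
```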

Scores (10-fold 'shuffled' cross-validation, R-squared)

  • Random Forest Regression (n_estim=20, max_depth=None, max_feat=4): 0.866
  • Polynomial Regression (degrees=4): 0.810
  • Decision Tree Regression (max_depth=4, min_samples_leaf=6): 0.737
  • Neural Network MLP Regression (layer=[3,3], alpha=5, solv=lbfgs): 0.721
  • KNN Regression (n_neighbors=15): 0.704
  • Ordinary Least-Squares Regression: 0.695
  • Ridge Regression (alpha=0.1): 0.695
  • Support Vector Regression (kernel='linear', gamma=0.001, C=10): 0.690
  • Lasso Regression (alpha=0.25): 0.614
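A comparison like the one above could be produced along these lines. The sketch below scores a handful of scikit-learn regressors with the same shuffled 10-fold splitter and R-squared metric; it runs on synthetic data, so its numbers are unrelated to the scores listed above. The polynomial degree of 2 and the random seeds are assumptions made for the toy data.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeRegressor

# Synthetic data with one linear and one quadratic effect (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.3, size=200)

models = {
    "OLS": LinearRegression(),
    "Ridge (alpha=0.1)": Ridge(alpha=0.1),
    "Lasso (alpha=0.25)": Lasso(alpha=0.25),
    "KNN (n_neighbors=15)": KNeighborsRegressor(n_neighbors=15),
    "Decision tree": DecisionTreeRegressor(
        max_depth=4, min_samples_leaf=6, random_state=0
    ),
    # Polynomial regression = polynomial feature expansion + linear model.
    "Polynomial (degree=2)": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
}

# Reusing one seeded splitter gives every model the same shuffled folds.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Passing a single seeded `KFold` object to every call keeps the folds identical across models, so differences in the mean score reflect the estimators rather than the split.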

Scatter plot - Surface vs. Asking Price (EUR)


OLS - Predicted prices vs. True price (EUR)


KNN Regression - Validation curve
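For reference, a validation curve of this kind can be computed with scikit-learn's `validation_curve`. The sketch below uses synthetic data and an assumed range of `n_neighbors` values; it shows how the curve's points are obtained, not the repository's own plotting code.

```python
import numpy as np
from sklearn.model_selection import KFold, validation_curve
from sklearn.neighbors import KNeighborsRegressor

# Synthetic non-linear data (illustrative only).
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(200, 2))
y = np.sin(6 * X[:, 0]) + X[:, 1] + rng.normal(scale=0.2, size=200)

# Candidate neighbourhood sizes (assumed range).
k_values = np.array([1, 5, 15, 30, 60])

train_scores, test_scores = validation_curve(
    KNeighborsRegressor(), X, y,
    param_name="n_neighbors", param_range=k_values,
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
    scoring="r2",
)

# One row per k, one column per fold; average over folds for the curve.
mean_train = train_scores.mean(axis=1)
mean_test = test_scores.mean(axis=1)
best_k = k_values[mean_test.argmax()]

for k, tr, te in zip(k_values, mean_train, mean_test):
    print(f"k={k}: train R2={tr:.3f}, validation R2={te:.3f}")
```

Plotting `mean_train` and `mean_test` against `k_values` gives the curve: small k tends to overfit (high training score, lower validation score), while very large k tends to underfit.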


Lasso Regression (Alpha = 0.25) - Predicted prices vs. True price (EUR)


Ridge Regression - Mean Test Score vs. Alpha
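The mean test score per alpha is available from `GridSearchCV` through its `cv_results_` attribute. A minimal sketch on synthetic data, with an assumed grid of alpha values:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic linear data (illustrative only).
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=150)

# Hypothetical alpha grid spanning several orders of magnitude.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
search = GridSearchCV(
    Ridge(), {"alpha": alphas},
    cv=KFold(n_splits=10, shuffle=True, random_state=0),
    scoring="r2",
)
search.fit(X, y)

# cv_results_ holds one entry per candidate alpha.
tried_alphas = search.cv_results_["param_alpha"]
mean_scores = search.cv_results_["mean_test_score"]
for alpha, score in zip(tried_alphas, mean_scores):
    print(f"alpha={alpha}: mean test R2={score:.3f}")
print("best:", search.best_params_)
```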


Polynomial Regression (degrees = 4) - Predicted prices vs. True price (EUR)


Random Forest Regression (n_estim=20, max_depth= None, max_feat=4) - Predicted prices vs. True price (EUR)


Support Vector Regression (kernel='linear', gamma = 0.001, C= 10) - Predicted prices vs. True price (EUR)


Neural Network MLP Regression (layer=[3,3], alpha=5, solv=lbfgs) - Predicted prices vs. True price (EUR)


Contributors

mbkraus
