Realestate

This repository contains a set of Python scripts that scrape a real estate webpage, clean and analyze the data, plot visualizations, and perform a multiple linear regression fit.

See also the associated report featured on Medium - Towards Data Science

Files

Web scraping

  • scrapeweb.py: uses Requests to connect to mlslistings, BeautifulSoup to pull the verification token, html to get the web content, re to clean the results, and Pandas to store the scraped content as a DataFrame

  • getdata.py: reads zip codes from a .csv file, uses the webscrape function defined in scrapeweb.py to scrape content from the webpage into a Pandas DataFrame, and writes the scraped content to a .csv file
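As an illustration of the regex cleanup step mentioned above, the sketch below converts scraped price and area strings into integers using only the standard library. The function name `clean_number` and the sample strings are assumptions made for this example; the actual patterns used in scrapeweb.py may differ.

```python
import re

def clean_number(text):
    """Drop every non-digit character and return an int, or None if no digits remain.

    Hypothetical helper for illustration; it handles whole numbers only,
    so a value such as "2.5 baths" would need a different pattern.
    """
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None

# Made-up strings in the style of scraped listing fields
raw_listings = [
    {"price": "$1,250,000", "sqft": "1,840 sq ft"},
    {"price": "$899,000", "sqft": "N/A"},
]

# Apply the cleanup to every field of every listing
cleaned = [
    {key: clean_number(value) for key, value in row.items()}
    for row in raw_listings
]
print(cleaned)
```

Returning `None` for fields without digits keeps missing values explicit, so they can be dropped or imputed once the rows are loaded into a DataFrame.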

Map plotting

  • plotmaps.py: reads a .csv file with listing information, uses the price_quintiles function in calculatequintiles.py to place listings into five bins by price, and uses the cartoplot_x_price functions (x = bay, sf, eastbay, peninsula, southbay) defined in cartoplotfunctions.py to plot the data points on a map of the respective region. Also contains scripts to plot commute and school quality data using zip code shapefiles

  • cartoplotfunctions.py: reads data from a .csv file and city or zip code borders from a shapefile, and uses Matplotlib.pyplot and Cartopy to plot maps with a terrain background, bounded by a given set of latitude and longitude coordinates, for the full Bay Area as well as its sub-regions
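The binning of listings into five price groups can be sketched with Pandas as follows. This is a minimal example under assumptions: the function name `assign_price_quintiles`, the `price` column, and the sample prices are all hypothetical, and the real price_quintiles function in calculatequintiles.py may be implemented differently.

```python
import pandas as pd

def assign_price_quintiles(prices):
    """Label each price 1 (cheapest fifth) through 5 (most expensive fifth).

    Hypothetical stand-in for price_quintiles, built on pandas.qcut.
    """
    codes = pd.qcut(prices, q=5, labels=False)  # integer bin codes 0..4
    return codes + 1

# Made-up listing prices for illustration
listings = pd.DataFrame({
    "price": [450_000, 620_000, 780_000, 910_000, 1_050_000,
              1_200_000, 1_400_000, 1_750_000, 2_300_000, 3_900_000],
})
listings["quintile"] = assign_price_quintiles(listings["price"])
print(listings)
```

Each quintile can then be drawn in its own color on the map. Note that `pd.qcut` raises an error when many identical prices make the bin edges non-unique; passing `duplicates="drop"` is one way to handle that, at the cost of fewer than five bins.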

Boxplot plotting

  • plotboxplots.py: reads data from a .csv file and selects cities of interest, then plots their price information using Seaborn box + strip plots

Data fitting

  • fitdata.py: reads data from a .csv file, filters outliers, uses Statsmodels.formula.api to perform an ordinary least squares fit and summarize the result, uses Sklearn.linear_model to create price predictions from the fitted coefficients, and uses functions defined in plotfunctions.py to plot a histogram of the residuals
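The fitting workflow (filter outliers, fit by ordinary least squares, predict, inspect residuals) can be sketched as below. To keep the example to a single dependency it uses NumPy's least-squares solver in place of Statsmodels and Sklearn, runs on synthetic data, and assumes a 3-standard-deviation outlier rule. The features, coefficients, and threshold are illustrative assumptions, not values taken from fitdata.py.

```python
import numpy as np

# Synthetic listings (assumed features: living area and bedroom count)
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(600, 3500, n)
beds = rng.integers(1, 6, n).astype(float)
price = 50_000 + 600 * sqft + 20_000 * beds + rng.normal(0, 40_000, n)

# Outlier filter (assumed rule): keep prices within 3 standard deviations of the mean
keep = np.abs(price - price.mean()) < 3 * price.std()

# Design matrix with an intercept column, then an ordinary least squares fit
X = np.column_stack([np.ones(keep.sum()), sqft[keep], beds[keep]])
y = price[keep]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predictions, residuals, and the coefficient of determination
predicted = X @ coef
residuals = y - predicted
r_squared = 1 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Histogram counts of the residuals (what a residual plot would display)
counts, bin_edges = np.histogram(residuals, bins=20)
print("coefficients (intercept, sqft, beds):", coef)
print("R^2:", r_squared)
```

A roughly symmetric residual histogram centered on zero suggests the linear model is not systematically over- or under-predicting. Statsmodels additionally reports standard errors and p-values for each coefficient, which this sketch does not compute.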

Libraries

Acknowledgement

Written by Michael Boles in the summer of 2019 with help from the StackOverflow community.
