housing_market_prediction's Introduction

Housing Market Prediction

We use economic data to predict housing market trends, specifically crashes. We also run sentiment analysis on live tweets to track general feelings about the housing market and the conversation around it. We start with the sentiment analysis and then dive into the economic data.

Sentiment Analysis on Housing Market

The sentiment analysis depends largely on Twitter. We use VADER for the sentiment scoring and build a custom list of stopwords. Those custom words are added to the stopword list imported from the NLTK library, and the combined list is later used to create a word cloud.

Set Up API

Set up Tweepy with the required tokens and access keys. Using the API, we created a function that pulls Tweets from Twitter and runs a sentiment analysis on them.

Keyword search

We created a function that accepts any keyword (or hashtag) and a requested number of Tweets, then searches for Tweets related to that keyword. The output is a list of raw Tweets containing the keyword.

Sentiment Analysis

We created dataframes containing Tweets with "positive", "negative", and "neutral" sentiment, along with a function that returns the count of Tweets in each dataframe.
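The labelling and counting step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each tweet already has a VADER compound score, and it uses the conventional VADER cut-offs of 0.05 and -0.05, which the project may or may not have used. The function names and the sample scores are hypothetical.

```python
from collections import Counter

# Conventional VADER cut-offs for the compound score (an assumption here;
# the project may have used different thresholds).
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def label_from_compound(compound):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def count_by_sentiment(compound_scores):
    """Count how many scores fall under each sentiment label."""
    counts = Counter(label_from_compound(score) for score in compound_scores)
    # Report all three labels, even when a label has no tweets.
    return {label: counts.get(label, 0)
            for label in ("positive", "negative", "neutral")}


# Hypothetical compound scores, for illustration only (not project data).
example_scores = [0.62, -0.41, 0.0, 0.03, -0.05, 0.05]
print(count_by_sentiment(example_scores))
```

Keeping the thresholds as named constants makes it easy to see how sensitive the positive, negative, and neutral counts are to the chosen cut-offs.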

Raw Tweets

The variable "tweet_list" contains a list of the most recent Tweets matching the parameters passed to the keyword search.

Stopwords

Stopwords were imported from nltk.corpus. We also wrote a for loop that iterates through each tweet to find frequently mentioned words. These could be adverbs, hashtags, or verbs that don't add much meaning to the analysis, for example: "a, #housingmarket, realestate, isn't." The goal in finding frequently mentioned words was to build a custom list of stopwords, so that we could surface the more "valuable" words that appear alongside a specific keyword.
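One way to find candidate custom stopwords is to count tokens across all tweets and review the most frequent ones by hand. The sketch below assumes simple whitespace tokenization. `BASE_STOPWORDS` is a small hypothetical stand-in for the NLTK English stopword list, and the sample tweets are invented for illustration.

```python
from collections import Counter

# Small stand-in for the NLTK English stopword list (hypothetical subset).
BASE_STOPWORDS = {"a", "the", "is", "in", "to", "and", "of"}


def frequent_words(tweets, top_n=5):
    """Return the top_n most common lowercase tokens that are not base stopwords."""
    counts = Counter()
    for tweet in tweets:
        for token in tweet.lower().split():
            if token not in BASE_STOPWORDS:
                counts[token] += 1
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical tweets, for illustration only (not project data).
sample_tweets = [
    "The #housingmarket is hot",
    "Prices in the #housingmarket keep rising",
    "Is the #housingmarket going to crash",
]

# Frequent tokens are only candidates; a person decides which ones to keep.
candidates = frequent_words(sample_tweets, top_n=1)
custom_stopwords = BASE_STOPWORDS | set(candidates)
print(candidates)
```

Here the search hashtag itself dominates the counts, which is why it is a natural addition to the custom stopword list.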

Processing Tweets

We created a function that cleans tweets and removes stopwords.
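A cleaning function of this kind might look like the sketch below. The exact cleaning rules in the project are not shown, so the steps here (lowercasing, then dropping URLs, mentions, and non-letter characters) are assumptions, and `clean_tweet` and `STOPWORDS` are hypothetical names.

```python
import re

# Hypothetical combined stopword list (base words plus custom additions).
STOPWORDS = {"a", "the", "is", "in", "to", "and", "of",
             "housingmarket", "realestate"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text, stopwords=STOPWORDS):
    """Lowercase a tweet, strip URLs, mentions and non-letters, drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Replaces '#', digits and punctuation with spaces, so "#tag" becomes "tag".
    text = NON_LETTER_RE.sub(" ", text)
    tokens = [token for token in text.split() if token not in stopwords]
    return " ".join(tokens)


print(clean_tweet("The #HousingMarket is cooling, says @someone https://example.com/x"))
```

Because the hash sign is stripped before the stopword filter runs, custom stopwords are stored without it in this sketch.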

Wordcloud

After each tweet has been processed and cleaned of stopwords, a wordcloud is generated from the words that showed up most often. The goal of the wordcloud is to see what people say when a specific keyword is searched.

Wordcloud #2

This wordcloud was generated a week later to see whether there were common words that showed up.

Economic Data on Housing Market

Most of the data used is public data from Freddie Mac. The data contains fixed and adjustable mortgage rates for houses starting from 1971. It also contains 15-year and 30-year interest rates, as well as the margin of profit that banks make on those loans. We then run various regression models to create predictions and understand trends.

Median Home Price

image

image

Number of Homes Sold (in Millions)

image

image

Number of New Homes Sold

image

Mortgage Applications Submitted

image

Interest Rates

The interest rate data was later merged with another dataframe containing the number of houses purchased in each region of the US, starting from the 1970s. Merging made the data easier to view and allowed us to fit a linear regression model.

Random Forest Regressor

Using the previously mentioned dataframe, we ran a random forest regression to create a predictive model. A random forest regressor is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default); otherwise the whole dataset is used to build each tree. X contained interest rates and margin, and y contained houses bought in the US. According to the result, purchases were more strongly related to margin than to housing interest rates.

Linear Regression

We also ran a linear regression and concluded that the data explain about 48% of the variation in houses bought.
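Assuming the 48% figure refers to the coefficient of determination (R²), the sketch below shows how that number is computed for a one-variable least-squares fit. The helper names and the toy data are hypothetical and are not taken from the project.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy numbers, for illustration only (not project data).
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
```

An R² of 0.48 would mean the fitted line accounts for about 48% of the variance in the target, leaving the rest unexplained by the chosen inputs.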

Deep Learning Model

To understand the links between multiple economic indicators and their influence on mortgage rates, we used 8 datasets to create this model: inflation (CPI), changes in mortgage-backed securities prices, average wages, the Fed Funds rate, number of houses sold, unemployment rates, and average adjustable-rate and fixed-rate mortgages.

DL_Code DL_Code2 DL_Code3

Inflation

Inflation_df

Mortgage Backed Securities

MBS_df

Fed Funds Rate

fed_funds

All Dataframes Combined

Library_data Combined_df

Relationship between Fixed and Adjustable Rate Mortgages

FvsA_df FvsA

Price to Interest Rate Relationship

PricevsInterest

Results

DL_Results DL_df

Conclusion

Although sentiment may suggest that the US housing market is on the verge of a crash, the data says otherwise. With the Fed keeping interest rates extremely low, there is no reason to predict that prices will go down. Despite other economic indicators, including rising GDP, rising inflation, low unemployment, more government spending, and increasing wages, the Federal Reserve is intent on keeping interest rates low to keep both the stock and housing markets on the rise.

housing_market_prediction's People

Contributors

jrrameau2000, barney359, tavarisjones
