Congressional-Bill-Prediction

Overview

An analysis of 20 years of bills introduced across 10 Congresses (1997-2017). Only 3.4% of the 108,000 bills in the dataset became law, and only 18% received any action besides being referred to committee.

The project involved scraping and engineering 20 years of Congressional data in an attempt to predict whether a bill will make it out of committee and whether it will become law. I assigned each bill to one of five "final status" categories based on the final action recorded for it in the dataset. A Random Forest Classifier predicted whether a bill would make it out of committee with 77% recall. The repo also includes various other models with varying levels of success. Overall, most outcomes are hard to predict: recall rates are low while accuracy rates are high (deceptively so) because of the unbalanced classes inherent in the dataset.
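To see why accuracy is misleading here, consider a minimal sketch with toy labels that mirror the 3.4% "became law" rate. The labels are synthetic and the helper names are illustrative, not taken from the notebooks. A model that predicts "not law" for every bill scores high on accuracy while finding none of the laws.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that the model found."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


# Synthetic labels: 34 laws out of 1,000 bills, i.e. a 3.4% positive rate.
y_true = [1] * 34 + [0] * 966

# A "model" that always predicts the majority class (not law).
always_no = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_no)  # 0.966
baseline_recall = recall(y_true, always_no)      # 0.0
```

This is why recall on the minority class is the more informative number to compare across the models in this repo.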

Process

I used the ProPublica API to scrape congressional bills from 1997 to 2017, or Congresses 105-115. I combined this data with MIT researcher Charles Stewart's congressional committee data to create the final dataset.
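A rough sketch of what one paged request for bills might look like, using only the standard library. This is not the repo's scrape function. The base URL, the `X-API-Key` header, the `offset` paging parameter, and the `results[0]["bills"]` response shape are all assumptions based on the ProPublica Congress API docs linked under Resources and should be checked there. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; verify against the ProPublica API docs.
BASE_URL = "https://api.propublica.org/congress/v1"


def build_bills_url(congress, chamber="both", bill_type="introduced", offset=0):
    """Build the (assumed) URL for one page of bills in a given Congress."""
    if not 105 <= congress <= 115:
        raise ValueError("this project covers Congresses 105-115")
    url = f"{BASE_URL}/{congress}/{chamber}/bills/{bill_type}.json"
    return f"{url}?offset={offset}" if offset else url


def fetch_bills_page(api_key, congress, offset=0, urlopen=urllib.request.urlopen):
    """Fetch one page of bills; `urlopen` is injectable so it can be stubbed."""
    request = urllib.request.Request(
        build_bills_url(congress, offset=offset),
        headers={"X-API-Key": api_key},  # assumed auth header name
    )
    with urlopen(request) as response:
        payload = json.load(response)
    # Assumed response shape: {"results": [{"bills": [...]}]}
    return payload["results"][0]["bills"]
```

Looping `congress` over `range(105, 116)` and increasing `offset` until a page comes back empty would cover the full period, assuming the paging behaves as described.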

This blog post describes the data gathering, cleaning, and engineering process: (https://medium.com/@elizabethjafek/scraping-and-wrangling-congressional-bill-data-bddc53b5cc93)

This blog post describes the modeling process: (https://medium.com/@elizabethjafek/predicting-the-path-of-congressional-bills-cf1105f7c6c8)

Directory

Data Scrape Function

Data Combining Function

EDA and Data Wrangling

Statistics -- correlation/association scores and p-values for many variables

Adding Additional Features (majority/minority in each Congress analyzed)

Models -- Jupyter Notebooks with various models

  1. Modeling if bill becomes law
  2. Modeling if any action taken on bill besides being referred to committee -- includes second stage model
  3. Modeling which of the five categories a bill will end up in
  4. Modeling which of three categories a bill will end up in (referred to committee, action taken, law)

Presentation
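The three-category target listed under Models (referred to committee, action taken, law) could be derived from each bill's final action text along these lines. This is a sketch only: the keyword rules and sample action strings are hypothetical, and the notebooks may assign categories differently.

```python
def three_way_status(latest_action: str) -> str:
    """Map a bill's final action text to one of three outcome categories.

    The keyword rules are illustrative assumptions, not the repo's logic.
    """
    text = latest_action.strip().lower()
    if "became public law" in text or "became private law" in text:
        return "law"
    if text.startswith("referred to"):
        return "referred to committee"
    # Anything else means the bill moved beyond its initial referral.
    return "action taken"


# Hypothetical action strings for illustration.
examples = [
    "Referred to the House Committee on Ways and Means.",
    "Passed/agreed to in House: On passage Passed by voice vote.",
    "Became Public Law No: 115-31.",
]
labels = [three_way_status(action) for action in examples]
```

Checking for "law" first matters, since it is the rarest class and the one a looser rule is most likely to mislabel.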

Resources

Link to ProPublica API (https://projects.propublica.org/api-docs/congress-api/)

Link to MIT dataset (http://web.mit.edu/cstewart/www/data/data_page.html)

Contributors

ejafek13
