syntaxerror-mlpp2018

Contributors: Andrew Deng, Amir Kazi, Tianchu Shu and Jessica Song

Project goal

Syntax Error aims to identify which inmates are most at risk of recidivism within one or two years of their release, using personal data and mental health records from the Johnson County jail system, together with demographic data for neighborhood areas.
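As a rough illustration of the prediction target, the sketch below labels a release as recidivism if a new booking occurs within a given number of years. The function names and the rule "any booking after release counts" are assumptions made for this example, not the project's actual definition.

```python
from datetime import date
from typing import Iterable


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def recidivated(release_date: date, later_bookings: Iterable[date], years: int) -> bool:
    """True if any booking falls after the release and within `years` of it.

    Hypothetical helper: the real label definition may differ.
    """
    cutoff = add_years(release_date, years)
    return any(release_date < booking <= cutoff for booking in later_bookings)


release = date(2015, 3, 1)
bookings = [date(2016, 6, 15)]
print(recidivated(release, bookings, 1))  # False: rebooked after the one-year cutoff
print(recidivated(release, bookings, 2))  # True: rebooked within two years
```

Computing both labels from the same booking history keeps the one-year and two-year targets consistent with each other.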

Content

  • Data exploration
  • Data cleaning and preprocessing
  • Feature generation
  • Machine learning pipeline
  • Model evaluation and bias analysis

Packages to install

  1. pandas
  2. psycopg2
  3. numpy
  4. matplotlib
  5. seaborn
  6. graphviz
  7. sklearn (installed as scikit-learn)
  8. datetime (part of the Python standard library, no installation needed)
  9. requests

Data Sources

Johnson County Jail, Census Bureau

What We've Done

We have worked on

  1. Data Exploration
  2. Data Cleaning, Data Integration, and Pre-Processing
  3. Feature Generation
  4. Machine learning Classifiers & Evaluations (Pipeline)

Code

  1. Jupyter notebook

Final modeling results

  • Person_societal_var.ipynb
  • Bail_info.ipynb
  • mh_info.ipynb
  • All-var.ipynb

  2. Python code ("/code")

Code to retrieve census data to be used as demographic data

  • census.py
  • run_api.py
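For orientation, retrieving tract-level data from the Census Bureau API with requests might look roughly like the sketch below. The dataset year, the variable code (B01003_001E, total population) and the function names are assumptions chosen for illustration; they are not taken from census.py or run_api.py.

```python
# Assumption: the public Census Bureau API and the ACS 5-year dataset.
# The year and variable are placeholders, not the project's actual choices.
BASE_URL = "https://api.census.gov/data/2016/acs/acs5"


def rows_to_records(payload):
    """Convert the API's list-of-lists response (header row first) into dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def fetch_tract_population(state_fips, county_fips, api_key=None):
    """Hypothetical helper: fetch total population for every tract in a county."""
    # Imported here so the parsing helper above works without the dependency.
    import requests

    params = {
        "get": "NAME,B01003_001E",
        "for": "tract:*",
        "in": f"state:{state_fips} county:{county_fips}",
    }
    if api_key:
        params["key"] = api_key
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return rows_to_records(response.json())


# Parsing a response-shaped payload without touching the network:
sample = [["NAME", "B01003_001E", "tract"], ["Census Tract 1", "4210", "000100"]]
print(rows_to_records(sample))
```

The API returns every value as a string, so numeric columns need converting before they are joined to the jail data.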

Settings for the pipeline

  • final_default_grids.py
  • jocojims.py
  • indpv_lists.py

Code for the final pipeline

  • final_connection.py
  • final_load_dfs.py
  • final_explore_and_viz.py
  • final_preprocessing.py
  • final_temporal.py
  • final_classifier_final.py
  • final_plot.py
  • final_pipeline.py
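The file name final_temporal.py suggests time-based train/test splitting, although this README does not describe it. The sketch below shows one common way to generate such splits with an expanding training window; the function names, the month-based windows and the gap rule are assumptions, not the project's code.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by `months` and snap it to the first day of that month."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def temporal_splits(start, end, test_months, label_months):
    """Yield (train_end, test_start, test_end) tuples in chronological order.

    Training data always begins at `start`. A gap of `label_months` is left
    between train_end and test_start so that the outcome window of every
    training row has closed before the test period begins.
    """
    test_start = add_months(start, label_months + test_months)
    while add_months(test_start, test_months) <= end:
        train_end = add_months(test_start, -label_months)
        yield train_end, test_start, add_months(test_start, test_months)
        test_start = add_months(test_start, test_months)


for split in temporal_splits(date(2010, 1, 1), date(2015, 1, 1), 12, 12):
    print(split)
```

With these inputs the generator produces three splits, testing on 2012, 2013 and 2014 in turn. The gap matters for recidivism labels: a release cannot be labelled until its one-year or two-year follow-up period has fully elapsed.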

Script that brings together the functions from each part of the pipeline (except data exploration)

  • final_run.py

Running the code

  • Via Jupyter Notebook

  • Via Python file & Terminal

    python final_run.py 
    

Summary

In general, all work for the preliminary steps of the project is located in raw; all code for the pipeline can be found in code, with the prefix 'final' (e.g. final_plot).

Results

We compared the effectiveness of different models trained on “biased” and “unbiased” feature sets:

  1. All variables (Using all our variables for training data)
  2. Personal & Mental Health related Variables
  3. Personal & Bail related Variables
  4. Personal related Variables.
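The four feature sets above could be organized as named column groups, as in the sketch below. All column names here are hypothetical placeholders; the project's real variable lists are not shown in this README.

```python
# Hypothetical column names, grouped by the kind of variable they represent.
PERSONAL = ["age_at_release", "sex", "prior_bookings"]
MENTAL_HEALTH = ["mh_flag", "mh_visits"]
BAIL = ["bail_amount", "bail_type"]

# One entry per feature set compared in the results.
FEATURE_SETS = {
    "all": PERSONAL + MENTAL_HEALTH + BAIL,
    "personal_mh": PERSONAL + MENTAL_HEALTH,
    "personal_bail": PERSONAL + BAIL,
    "personal": PERSONAL,
}


def select_features(rows, feature_set):
    """Return each row (a dict) as a list of values in the feature set's column order."""
    columns = FEATURE_SETS[feature_set]
    return [[row[column] for column in columns] for row in rows]


rows = [
    {"age_at_release": 31, "sex": "M", "prior_bookings": 2,
     "mh_flag": 1, "mh_visits": 4, "bail_amount": 500, "bail_type": "cash"},
]
for name in FEATURE_SETS:
    print(name, select_features(rows, name))
```

Keeping the sets in one mapping lets the same training and evaluation loop run once per set, so any difference in results comes from the features rather than from the code path.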
