Disaster Response Pipeline Project

Introduction

This project is part of the Udacity Data Scientist Nanodegree program. It implements a natural language processing tool that analyzes and classifies disaster messages compiled by Figure Eight.
The project comprises the following sections:

  1. Building an ETL Pipeline that
    • Loads the messages and categories datasets
    • Merges the two datasets
    • Cleans the data
    • Stores it in a SQLite database
  2. Building an ML Pipeline that
    • Loads data from the SQLite database
    • Splits the dataset into training and test sets
    • Builds a text processing and machine learning pipeline
    • Trains and tunes a model using GridSearchCV
    • Outputs results on the test set
    • Exports the final model as a pickle file
  3. Building a Flask Web App that
    • Visualizes features of the training data
    • Classifies messages entered by the user according to the ML model built in step 2
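The ML pipeline steps listed above (split, build a text pipeline, tune with GridSearchCV, evaluate, export as pickle) can be sketched roughly as follows. This is a minimal illustration, not the project's actual train_classifier.py: the toy messages, the two label columns, the choice of TfidfVectorizer and RandomForestClassifier, and the parameter grid are all assumptions made for the example. Only GridSearchCV and the pickle export come from the description above.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the table written by the ETL step.
# Each label row is [water, medical]; the real dataset has its own categories.
MESSAGES = [
    "we need clean drinking water",
    "no water in our village since the storm",
    "please send bottled water",
    "water supply is contaminated",
    "my father is injured and needs a doctor",
    "we need medicine and bandages",
    "the hospital has no medical supplies",
    "people are sick and need a doctor",
    "injured people need water and medicine",
    "the clinic needs water and doctors",
    "thank you for the weather update",
    "the road to town is open again",
]
LABELS = [
    [1, 0], [1, 0], [1, 0], [1, 0],
    [0, 1], [0, 1], [0, 1], [0, 1],
    [1, 1], [1, 1],
    [0, 0], [0, 0],
]


def build_search():
    """Wrap a text-processing + classifier pipeline in a small grid search."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),                     # assumed vectorizer
        ("clf", RandomForestClassifier(random_state=0)),  # assumed classifier
    ])
    param_grid = {"clf__n_estimators": [10, 25]}          # assumed grid
    return GridSearchCV(pipeline, param_grid, cv=2)


def train_and_export(model_path):
    """Split the data, tune the pipeline, predict on the test set, pickle the model."""
    X_train, X_test, y_train, y_test = train_test_split(
        MESSAGES, LABELS, test_size=0.25, random_state=0
    )
    search = build_search()
    search.fit(X_train, y_train)
    predictions = search.predict(X_test)
    with open(model_path, "wb") as f:
        pickle.dump(search.best_estimator_, f)
    return search, predictions, y_test


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "classifier.pkl")
    search, predictions, y_test = train_and_export(path)
    print("best parameters:", search.best_params_)
    print("test predictions:", predictions.tolist())
```

Wrapping the vectorizer and the classifier in one Pipeline means the grid search refits the text features inside every cross-validation fold, so no vocabulary information leaks from the held-out fold into training.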

Instructions for Execution:

  1. Run the following commands in the project's root directory to set up the database and model.

    • To run the ETL pipeline that cleans the data and stores it in the database:
        python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
    • To run the ML pipeline that trains the classifier and saves it:
        python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
  2. Run the following command in the app's directory to start the web app:
        python run.py

  3. Go to http://0.0.0.0:3001/
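To make the ETL command in step 1 more concrete, here is a rough sketch of a script that takes the same three positional arguments (messages CSV, categories CSV, database path). It uses only the Python standard library and is not the project's process_data.py. The column name `id` used for the merge, the table name `messages`, and the duplicate-dropping rule are assumptions made for illustration.

```python
import csv
import sqlite3
import sys


def load_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def merge_and_clean(messages, categories):
    """Inner-join the two row lists on 'id' (assumed key) and drop repeated ids."""
    categories_by_id = {row["id"]: row for row in categories}
    merged, seen = [], set()
    for row in messages:
        key = row["id"]
        if key in seen or key not in categories_by_id:
            continue  # skip duplicates and messages without categories
        seen.add(key)
        merged.append({**row, **categories_by_id[key]})
    return merged


def save_rows(rows, db_path, table="messages"):
    """Write the merged rows to a SQLite table, replacing any previous copy."""
    if not rows:
        raise ValueError("nothing to store")
    columns = list(rows[0].keys())
    column_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})',
            [[row[c] for c in columns] for row in rows],
        )
        conn.commit()
    finally:
        conn.close()


def run_etl(messages_csv, categories_csv, db_path):
    """Load, merge, clean, and store; returns the number of rows written."""
    rows = merge_and_clean(load_rows(messages_csv), load_rows(categories_csv))
    save_rows(rows, db_path)
    return len(rows)


if __name__ == "__main__" and len(sys.argv) == 4:
    print(run_etl(*sys.argv[1:]), "rows stored")
```

Saved under a hypothetical name such as etl_sketch.py, it would be invoked with the same argument order as the command above: `python etl_sketch.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db`.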

File structure

│       ETL Pipeline Preparation.ipynb - Jupyter Notebook with ETL Pipeline
│       ML Pipeline Preparation.ipynb - Jupyter Notebook with Machine Learning Pipeline
│       README.md
│
+------app
│       │       run.py - Flask app
│       │
│       +------templates
│                       go.html - Template file
│                       master.html - Main template file
│
+------data
│               DisasterResponse.db - SQLite database with result of ETL Pipeline
│               disaster_categories.csv - CSV data file
│               disaster_messages.csv - CSV data file
│               process_data.py - ETL script file
│
+------models
                train_classifier.py - Machine Learning pipeline script

Contributors

  • normannexo
