
data_science_report's Introduction

README

Purposes

Analyse the household power consumption data set to extract useful information from it.

  1. Predict the daily power consumption for the coming years, which should be possible with regression
  • Find the trend of power consumption
  2. Find patterns in how the household consumes power and give some advice, which will need association rules
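
The trend-finding goal above could be approached with a one-variable least-squares fit. The following is a minimal sketch under stated assumptions: the function names `fit_trend` and `predict` are hypothetical, and the daily totals are made-up numbers for illustration, not values from the data set.

```python
from statistics import mean


def fit_trend(days, consumption):
    """Ordinary least squares fit of consumption = slope * day + intercept."""
    x_bar, y_bar = mean(days), mean(consumption)
    sxx = sum((x - x_bar) ** 2 for x in days)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(days, consumption))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(day, slope, intercept):
    """Extrapolate the fitted line to a given day index."""
    return slope * day + intercept


# Made-up daily totals, for illustration only.
days = [0, 1, 2, 3, 4]
daily_total = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_trend(days, daily_total)
next_day = predict(5, slope, intercept)
```

A straight line only captures a long-term trend; seasonal effects in household consumption would need extra terms or a different model.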

Tasks

  1. Load data
  • load into a dictionary
  • convert the values to proper data types
  2. Deal with missing data
  • find missing data
    a. missing data 2017.04.28-30
  • visualize the missing data as a graph
  • drop it
  • mark the data (data provenance)
  3. Explore the data
  • tasks when exploring
    • summarize the data
    • find outliers and check whether they make sense
    • find the data range
    • in two ways
      a. box plot
      b. scatter plot
  4. Preprocess
  • separate the time
  • calculate the rate and total power
  • build a list of groupings by time
  • deal with the data
    a. visualize rate, intensity and voltage
    b. proportion between the four uses of electricity
    c. total power against time
    d. voltage against intensity
  5. Build the model
    a. regression
      i. power consumption vs. time (one-variable regression)
      ii. efficiency rate = active / (reactive + active), with sub-metering and its relationship with intensity (multiple regression)
    b. association rules
      i. the user's pattern: translate it into different patterns and work with them
  6. Validate it
  7. Report
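
As a rough illustration of the loading, type-conversion, and rate steps above, here is a minimal Python sketch using only the standard library. The column names, the `;` delimiter, the `?` missing-value marker, the date format, and the sample values are all assumptions made for illustration; none of them come from this README. The rate is computed on the assumption that the formula in the task list means active / (reactive + active).

```python
import csv
import io
from datetime import datetime

# Made-up two-row sample; column names and the '?' marker are assumptions.
SAMPLE = """Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity
01/01/2007;00:00:00;4.0;1.0;240.0;18.0
01/01/2007;00:01:00;?;?;?;?
"""

NUMERIC = ("Global_active_power", "Global_reactive_power", "Voltage", "Global_intensity")


def to_float(text):
    """Return a float, or None when the field is empty or the assumed '?' marker."""
    text = text.strip()
    return None if text in ("", "?") else float(text)


def load_rows(handle):
    """Load each record into a dictionary with a parsed timestamp and numeric fields."""
    rows = []
    for raw in csv.DictReader(handle, delimiter=";"):
        stamp = datetime.strptime(raw["Date"] + " " + raw["Time"], "%d/%m/%Y %H:%M:%S")
        row = {"timestamp": stamp}
        for name in NUMERIC:
            row[name] = to_float(raw[name])
        rows.append(row)
    return rows


def efficiency_rate(row):
    """active / (reactive + active); None when a value is missing or the sum is zero."""
    active = row["Global_active_power"]
    reactive = row["Global_reactive_power"]
    if active is None or reactive is None or active + reactive == 0:
        return None
    return active / (active + reactive)


rows = load_rows(io.StringIO(SAMPLE))
rates = [efficiency_rate(row) for row in rows]
```

Keeping missing values as `None` rather than dropping them at load time leaves the later "find", "visualize", and "drop" steps free to decide how to handle them.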

Packages

  1. mice
  2. VIM

Record

  1. Deal with missing data
    a. Only Time and Sub_metering_3 have missing data.
    b. Time has only 120 missing values, which should be missing at random.
    c. Sub_metering_3 has 25979 missing values; the rate is low.
    d. However, the box plot shows differences from the original data, so we need to examine it.
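
The per-column counts recorded above can be reproduced with a simple tally. The packages listed earlier (mice, VIM) are R tools; the sketch below does not use them and only counts missing values in plain Python. The function name `missing_summary`, the use of `None` for a missing value, and the four sample rows are assumptions for illustration.

```python
def missing_summary(rows, columns):
    """Return {column: (missing_count, missing_rate)} for a list of dict rows."""
    total = len(rows)
    summary = {}
    for name in columns:
        missing = sum(1 for row in rows if row.get(name) is None)
        rate = missing / total if total else 0.0
        summary[name] = (missing, rate)
    return summary


# Made-up rows; None stands for a missing value.
sample_rows = [
    {"Time": "00:00:00", "Sub_metering_3": 17.0},
    {"Time": "00:01:00", "Sub_metering_3": None},
    {"Time": "00:02:00", "Sub_metering_3": 16.0},
    {"Time": "00:03:00", "Sub_metering_3": 18.0},
]

summary = missing_summary(sample_rows, ["Time", "Sub_metering_3"])
```

A low missing rate alone does not show that dropping rows is safe, which is why the box-plot comparison in the record is still worth examining.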

data_science_report's People

Contributors

pascalsun
