adityarc19 / aqi-india

This is a project that takes a time series AQI India dataset from Kaggle and performs EDA on it. Additionally, predictive classification is done to classify AQI levels based on the pollutant metrics.

Jupyter Notebook 100.00%
kaggle-dataset aqi pollution decision-tree-classifier machine-learning eda visualization time-series

Air Quality Index (AQI) across stations and cities in India from 2015 to 2020


I took an AQI dataset from Kaggle, performed some EDA on it, and implemented a decision tree classifier that assigns the air quality to one of six buckets:

  1. Good
  2. Satisfactory
  3. Moderate
  4. Poor
  5. Very Poor
  6. Severe
  • The dataset is taken from Kaggle.
  • It contains air quality data and the AQI (Air Quality Index) at hourly and daily resolution for various stations across multiple cities in India from 2015 to 2020.
  • For this project, I have used only part of the data provided on Kaggle: the file containing day-wise, city-level air pollution data.
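
The six buckets are ranges of the AQI value. Below is a minimal sketch of that mapping, assuming the breakpoints of India's National Air Quality Index (0-50 Good, 51-100 Satisfactory, 101-200 Moderate, 201-300 Poor, 301-400 Very Poor, above 400 Severe). The function name `aqi_bucket` is hypothetical and is not taken from the notebook; the classifier in this project predicts the bucket from pollutant metrics rather than from the AQI value itself.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five buckets (assumed National AQI breakpoints).
_UPPER_BOUNDS = [50, 100, 200, 300, 400]
_BUCKETS = ["Good", "Satisfactory", "Moderate", "Poor", "Very Poor", "Severe"]


def aqi_bucket(aqi):
    """Return the bucket name for a non-negative AQI value."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    # bisect_left finds the first upper bound that is >= aqi;
    # values above 400 fall past the end of the list and map to "Severe".
    return _BUCKETS[bisect_left(_UPPER_BOUNDS, aqi)]


if __name__ == "__main__":
    for value in (42, 100, 101, 350, 480):
        print(value, aqi_bucket(value))
```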

Libraries used:

1. NumPy
2. Pandas
3. Seaborn
4. Chart Studio
5. Plotly
6. Pandas Profiling
7. PyCaret

I have used Pandas Profiling for performing exploratory data analysis and PyCaret for performing the machine learning classification task. Below are their installation commands:

For Pandas Profiling:

pip install pandas-profiling[notebook]

or

pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

or

conda install -c conda-forge pandas-profiling

For PyCaret:

# create a conda environment
conda create --name yourenvname python=3.6

# activate the environment
conda activate yourenvname

# install PyCaret
pip install pycaret

# create a notebook kernel connected to the conda environment
python -m ipykernel install --user --name yourenvname --display-name "display-name"
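
Once PyCaret is installed, a classification experiment can be set up in a few lines. The sketch below is not the notebook's actual code: the function name `compare_and_train`, the target column name `AQI_Bucket`, and the `session_id` value are assumptions chosen for illustration. It uses `setup`, `compare_models`, and `create_model` from `pycaret.classification`; note that some older PyCaret releases pause in `setup` to ask for confirmation of the inferred column types.

```python
def compare_and_train(df, target="AQI_Bucket"):
    """Compare PyCaret classifiers on df and train a decision tree.

    df is a pandas DataFrame holding the pollutant columns plus the
    target column (the column name "AQI_Bucket" is an assumption).
    """
    # Imported here so this module can be loaded without PyCaret installed.
    from pycaret.classification import setup, compare_models, create_model

    # Initialise the experiment; session_id fixes the random seed.
    setup(data=df, target=target, session_id=42)

    # Cross-validate the available classifiers and return the best one.
    best = compare_models()

    # Train a decision tree specifically ("dt" is PyCaret's ID for it).
    decision_tree = create_model("dt")
    return best, decision_tree
```

Calling `compare_and_train(df)` inside a notebook prints a comparison table of the candidate models followed by the cross-validation scores of the decision tree.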

* Some EDA

1. Dataframe

head

2. AQI bucket chart

bucket

3. Pearson's correlations

corr

4. Most polluted cities

pol

5. Least polluted cities

poll

6. City wise pollutants analysis

city

where BTX = Benzene + Toluene + Xylene

7. Yearly analysis

yearly
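
The city rankings and the BTX column from the analyses above can be reproduced with a few pandas operations. This is a minimal sketch, not the notebook's code: the helper names `add_btx` and `rank_cities` are hypothetical, the column names (`City`, `AQI`, `Benzene`, `Toluene`, `Xylene`) are assumed to match the day-wise city file, and the sample values are made up purely for illustration.

```python
import pandas as pd


def add_btx(df):
    """Return a copy of df with BTX = Benzene + Toluene + Xylene."""
    out = df.copy()
    # Plain addition: a missing value in any of the three columns yields NaN.
    out["BTX"] = out["Benzene"] + out["Toluene"] + out["Xylene"]
    return out


def rank_cities(df, n=5, most_polluted=True):
    """Return the n cities with the highest (or lowest) mean AQI."""
    mean_aqi = df.groupby("City")["AQI"].mean()
    return mean_aqi.sort_values(ascending=not most_polluted).head(n)


# Toy data with placeholder city names; not real measurements.
sample = pd.DataFrame(
    {
        "City": ["A", "A", "B", "B", "C"],
        "AQI": [300, 260, 30, 40, 240],
        "Benzene": [3.0, 2.0, 0.1, 0.2, 1.0],
        "Toluene": [10.0, 8.0, 0.3, 0.1, 2.0],
        "Xylene": [1.0, 2.0, 0.0, 0.1, 0.5],
    }
)

if __name__ == "__main__":
    print(add_btx(sample))
    print(rank_cities(sample, n=2))                       # most polluted
    print(rank_cities(sample, n=2, most_polluted=False))  # least polluted
```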


Credits-

I would like to thank Parul Pandey and Naresh Bhat, whose notebooks provided some of the data exploration techniques used here.

  1. Parul Pandey's notebook: https://www.kaggle.com/parulpandey/breathe-india-covid-19-effect-on-pollution
  2. Naresh Bhat's notebook: https://www.kaggle.com/nareshbhat/air-quality-analysis-eda-and-classification

* Classification model used: Decision tree

I chose a decision tree as the classification model for this prediction problem based on the following results:

mod

Confusion matrix of the decision tree classifier on the validation data:

cm
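
For readers unfamiliar with the plot, a confusion matrix counts how often each true bucket was predicted as each bucket. The sketch below builds one in plain Python; the function names and the toy labels are illustrative assumptions and do not come from the project's results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    labels = ["Good", "Moderate", "Poor"]
    y_true = ["Good", "Good", "Moderate", "Poor"]      # toy ground truth
    y_pred = ["Good", "Moderate", "Moderate", "Poor"]  # toy predictions
    matrix = confusion_matrix(y_true, y_pred, labels)
    for label, row in zip(labels, matrix):
        print(label, row)
    print("accuracy:", accuracy(matrix))
```

Off-diagonal entries show which buckets the model confuses with each other, which is more informative than accuracy alone when the buckets are imbalanced.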

