crashes-data-project's Introduction

Data Science Project - car accidents

Motivation

This data science project analyzes car accident data and builds a predictive model on top of it.

The goal is to find the patterns and underlying factors that contribute to vehicular accidents.

By exploring extensive accident data, the project seeks insights that can inform accident prevention strategies, with the ultimate aim of safer roads and better well-being for individuals and communities.

Research questions

This section lists the questions that define the value of the project. Answering them lets us identify appropriate methods and strategies for reducing car accidents. Our objective is to use the insights gained from this research to implement proactive measures and promote road safety.

  1. What are the primary causes of car accidents in a specific region?
  2. How do weather conditions contribute to car accidents?
  3. How does the time of day or day of the week influence the likelihood of car accidents?
  4. Do more accidents occur during holidays?
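
Questions 3 and 4 come down to counting accidents per time bucket. The following is a minimal sketch of that idea, not the project's own analysis. It assumes a table with one row per accident and a timestamp column; the column name `accident_time`, the sample rows, and the holiday date are all hypothetical.

```python
import pandas as pd

# Hypothetical sample: one row per accident (not real project data)
accidents = pd.DataFrame({
    "accident_time": pd.to_datetime([
        "2021-01-01 08:15", "2021-01-01 17:40", "2021-01-02 08:05",
        "2021-01-04 23:30", "2021-01-08 08:50",
    ])
})

def count_by_hour(df, time_col="accident_time"):
    """Number of accidents for each hour of the day (0-23)."""
    return df.groupby(df[time_col].dt.hour).size()

def count_by_weekday(df, time_col="accident_time"):
    """Number of accidents for each weekday name."""
    return df.groupby(df[time_col].dt.day_name()).size()

def holiday_share(df, holidays, time_col="accident_time"):
    """Fraction of accidents that fall on the given holiday dates."""
    on_holiday = df[time_col].dt.normalize().isin(pd.to_datetime(holidays))
    return on_holiday.mean()

by_hour = count_by_hour(accidents)
by_weekday = count_by_weekday(accidents)
share = holiday_share(accidents, ["2021-01-01"])  # hypothetical holiday date
```

Raw counts alone do not settle question 4: holidays cover only a few days of the year, so a fair comparison would divide each count by the number of holiday and non-holiday days in the period.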

Data acquisition

All data comes from official databases that record real events. The data acquisition code is stored in the Connectors folder.

API Gov: https://data.gov.il/api/3/action/datastore_search?

# Sample code
import requests

def get_data_from_gov():
    # Endpoint of the data.gov.il datastore search API
    uri = "https://data.gov.il/api/3/action/datastore_search"

    params = {
        "resource_id": "5c78e9fa-c2e2-4771-93ff-7f400a12f7ba",  # query by resource_id
        "limit": 99999,  # set a records limit (the server applies a smaller default)
    }

    # requests builds the query string ("?resource_id=...&limit=...") from params
    response = requests.get(uri, params=params)

    # Raise an error on a failed request instead of returning an undefined value
    response.raise_for_status()

    return response.json()['result']['records']
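
A single request with a very large `limit` may be capped by the server. One possible alternative is to page through the resource using the `limit` and `offset` parameters of the `datastore_search` action. The sketch below is an assumption about how that could look, not the code in the Connectors folder; `fetch_page` and `iter_records` are hypothetical names, and the network call is passed in as a function so the paging logic can be tried without network access.

```python
import requests

URI = "https://data.gov.il/api/3/action/datastore_search"

def fetch_page(resource_id, limit, offset):
    """Fetch one page of records from the datastore_search endpoint."""
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    response = requests.get(URI, params=params, timeout=30)
    response.raise_for_status()
    return response.json()["result"]["records"]

def iter_records(resource_id, page_size=1000, fetch=fetch_page):
    """Yield all records, requesting page_size rows at a time."""
    offset = 0
    while True:
        page = fetch(resource_id, page_size, offset)
        if not page:
            # An empty page means there is nothing left to read
            break
        yield from page
        # Advance by what was actually returned, in case the server
        # sends fewer rows than requested
        offset += len(page)
```

With the resource used above, `list(iter_records("5c78e9fa-c2e2-4771-93ff-7f400a12f7ba"))` would collect every record into one list.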

ZIP files via CBS: https://www.cbs.gov.il/he/publications/Pages/2015/%D7%AA%D7%95%D7%A6%D7%A8%D7%99-Public-Use-Files-PUF-%D7%A0%D7%AA%D7%95%D7%A0%D7%99-%D7%A4%D7%A8%D7%98-%D7%91%D7%9C%D7%AA%D7%99-%D7%9E%D7%96%D7%95%D7%94%D7%99%D7%9D-%D7%9C%D7%9E%D7%97%D7%A7%D7%A8.aspx

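import os
import zipfile

import pandas as pd
import requests

def extract_data_zip(zip_path, zip_name, csv_file, selected_params=None):
    # Download the ZIP archive; zip_path is the URL of the archive
    response = requests.get(zip_path)
    response.raise_for_status()

    # Save the archive locally under zip_name
    with open(zip_name, 'wb') as f:
        f.write(response.content)

    # Extract only the requested CSV file into the working directory
    with zipfile.ZipFile(zip_name, 'r') as zip_ref:
        zip_ref.extract(csv_file)

    data = pd.read_csv(csv_file)

    # Optionally keep only the selected columns
    if selected_params:
        data = data[selected_params]

    # Clean up the temporary files
    os.remove(csv_file)
    os.remove(zip_name)

    return data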
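
The helper above writes the archive and the CSV to disk and then deletes them. If temporary files are not wanted, the archive can be read in memory instead. The following is a sketch under that assumption; `read_csv_from_zip_bytes` is a hypothetical name, and the demo archive is built on the spot with made-up content rather than downloaded from CBS.

```python
import io
import zipfile

import pandas as pd

def read_csv_from_zip_bytes(zip_bytes, csv_file, selected_params=None):
    """Load one CSV member of a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zip_ref:
        with zip_ref.open(csv_file) as f:
            # usecols=None reads every column
            return pd.read_csv(f, usecols=selected_params)

# Build a tiny archive to stand in for a downloaded file (made-up content)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("demo.csv", "id,severity,city\n1,2,A\n2,1,B\n")

demo = read_csv_from_zip_bytes(buffer.getvalue(), "demo.csv", ["id", "severity"])
```

With a real download, `zip_bytes` would be the `response.content` returned by `requests.get(zip_path)`.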

EDA

Exploratory data analysis (EDA) is crucial in any data science project: it helps us understand the data, identify patterns, and uncover insights that guide modeling decisions.

All related files are in the EDA folder.

We used the following libraries:

import seaborn as sns
import matplotlib.pyplot as plt
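
As an illustration of how these two libraries are typically combined, here is a minimal sketch of a count plot. It is not taken from the EDA folder; the `weekday` and `weather` columns and their values are hypothetical sample data.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample; real column names in the project may differ
df = pd.DataFrame({
    "weekday": ["Sun", "Mon", "Mon", "Fri", "Fri", "Fri"],
    "weather": ["clear", "rain", "clear", "rain", "rain", "clear"],
})

fig, ax = plt.subplots(figsize=(6, 4))
# One group of bars per weekday, split by weather condition
sns.countplot(data=df, x="weekday", hue="weather", ax=ax)
ax.set_title("Accidents by weekday and weather (sample data)")
ax.set_ylabel("number of accidents")
fig.tight_layout()
# fig.savefig("accidents_by_weekday.png")  # uncomment to write the figure to disk
```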

Machine learning

In this section, we used logistic regression in an attempt to answer our main research question: Can we predict a car accident? We used the following libraries:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn import metrics
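
The imports above suggest a workflow of splitting the data, tuning a logistic regression with a grid search, and scoring it by accuracy. The sketch below shows one way those pieces fit together. It is an assumption, not the project's actual pipeline: the data is synthetic (generated with `make_classification`), and the parameter grid over `C` is a hypothetical choice.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the accident features and a binary target
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold out 20% of the rows for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Search over the inverse regularization strength C with 5-fold cross-validation
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

# Evaluate the best model found by the search on the held-out rows
y_pred = grid.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("best C:", grid.best_params_["C"])
print("test accuracy:", round(test_accuracy, 3))
```

If accidents are rare in the real data, accuracy alone can look good for a model that never predicts an accident; `sklearn.metrics` also provides precision, recall, and a confusion matrix for that case.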
