bike_service_project's Introduction

About

Data project about the public bike service traffic in New York.

Objective

This is a data project I made while learning the seaborn and matplotlib libraries.
The objective was to rely entirely on those two libraries to create the data visualizations.
See notebook here.

Tools

Material

The material used is a CSV file with data from Citi Bike, a bike-sharing service in New York.
The data used for this project is from January 2023 (source).
The file consists of several columns with information such as:

  • ride IDs
  • station names
  • date and time for departures and arrivals
  • latitude/longitude coordinates
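As a rough sketch of how such a file could be loaded, the snippet below reads the CSV with pandas and derives a ride duration from the two timestamps. The column names (`started_at`, `ended_at`) follow the public Citi Bike trip-data layout and are assumptions here, as are the helper name `load_rides` and the `duration_min` column; adjust them to the actual file.

```python
import pandas as pd

# Assumed timestamp columns (public Citi Bike trip-data layout).
DATE_COLUMNS = ["started_at", "ended_at"]


def load_rides(csv_source):
    """Read a trip-data CSV and parse departure/arrival timestamps."""
    rides = pd.read_csv(csv_source, parse_dates=DATE_COLUMNS)
    # Ride duration in minutes, derived from the two timestamps.
    rides["duration_min"] = (
        rides["ended_at"] - rides["started_at"]
    ).dt.total_seconds() / 60
    return rides
```

For example, `rides = load_rides("citibike_january_2023.csv")`, where the file name is a placeholder for wherever the download was saved.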

Context

Managing traffic flows is one of the main challenges of civil engineering, which is why data, and especially real-time data, is essential to understanding traffic patterns.
In a metropolis like New York, road traffic can change heavily with weather conditions, national holidays, seasons, events, and public works.
Over the last decade, public bike services have grown popular as a means of commuting, changing the way road networks are designed.
By leveraging public bike service data, we can better understand how people use the service and what they expect from it. Analyzing this data is crucial to adapting the capacity and density of the bike service so that it suits users' needs and habits.

Data

Data was prepared to enable the creation of the visualizations below:

  • duration distribution
  • hour frame distribution
  • weekday distribution
  • usage/distance relation

Duration distribution

Bar chart displaying the distribution of rides by duration categories.

Three categories were created:

  • 0 to 5 minutes, representing "short" rides
  • 5 to 15 minutes, representing "medium" rides
  • 15 minutes and over, representing "long" rides

This visualization gives a high-level view of usage habits and highlights the prevalence of medium-length rides.
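The binning can be sketched with `pandas.cut` followed by a seaborn count plot. This is a minimal illustration, not the notebook's code: it assumes a `duration_min` column holding ride durations in minutes, and it assumes the medium category runs up to 15 minutes so that the three bins are contiguous. The function and column names are hypothetical.

```python
import pandas as pd
import seaborn as sns


def add_duration_category(rides):
    """Label each ride as short, medium, or long from its duration in minutes."""
    out = rides.copy()
    out["duration_category"] = pd.cut(
        out["duration_min"],
        bins=[0, 5, 15, float("inf")],  # assumed bin edges, in minutes
        labels=["short", "medium", "long"],
        right=False,  # intervals are [0, 5), [5, 15), [15, inf)
    )
    return out


def plot_duration_distribution(rides):
    """Bar chart of ride counts per duration category."""
    ax = sns.countplot(data=add_duration_category(rides), x="duration_category")
    ax.set(xlabel="Ride duration", ylabel="Number of rides")
    return ax
```

Rides with a negative duration (for instance from inconsistent timestamps) fall outside every bin and are left uncategorized, so they do not appear in the chart.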

Hour frame distribution

Line chart displaying the mean number of ride departures per hour frame.

The goal was to gain insight into how traffic evolves throughout the day, showing which hour frames had the highest traffic and which had the lowest.
This line chart uses the aggregated values of the dataset to display the mean value for each hour frame. Additionally, a confidence interval is drawn around the line, showing the range within which the mean for each hour frame is estimated to lie.
This type of visualization can be crucial for assessing the capacity each station needs throughout the day.
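One way to obtain such a chart, sketched under assumptions: count departures for every (date, hour) pair, then let `seaborn.lineplot` aggregate the daily counts per hour. The `started_at` column name and both function names are assumptions, and the notebook may aggregate differently.

```python
import pandas as pd
import seaborn as sns


def departures_per_day_and_hour(rides):
    """Count departures for every (date, hour) pair."""
    started = pd.to_datetime(rides["started_at"])
    return (
        rides.assign(date=started.dt.date, hour=started.dt.hour)
        .groupby(["date", "hour"])
        .size()
        .reset_index(name="departures")
    )


def plot_hourly_departures(rides):
    """Line chart of mean departures per hour with a confidence band."""
    counts = departures_per_day_and_hour(rides)
    # With several observations per hour (one per day), lineplot draws their
    # mean and, by default, a bootstrapped 95% confidence interval around it.
    ax = sns.lineplot(data=counts, x="hour", y="departures")
    ax.set(xlabel="Hour of departure", ylabel="Mean departures per day")
    return ax
```

Note that an hour with no departures on a given day produces no row at all, so the mean for very quiet hours is slightly overestimated unless the missing pairs are filled with zeros first.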

Weekday distribution

Bar chart displaying the distribution of rides by day of the week, with a confidence interval showing an estimation range.
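A comparable sketch for the weekday chart: count rides per calendar date, tag each date with its weekday, and let `seaborn.barplot` show the mean per weekday with its default confidence interval. The `started_at` column and the helper names are assumptions rather than the notebook's actual code.

```python
import pandas as pd
import seaborn as sns

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def rides_per_date(rides):
    """Return one row per calendar date with its ride count and weekday name."""
    started = pd.to_datetime(rides["started_at"])
    daily = rides.groupby(started.dt.normalize()).size()
    out = daily.rename_axis("date").reset_index(name="rides")
    out["weekday"] = out["date"].dt.day_name()
    return out


def plot_weekday_distribution(rides):
    """Bar chart of mean daily rides per weekday, Monday first."""
    # barplot's default estimator is the mean; the error bars show a
    # bootstrapped 95% confidence interval over the dates of each weekday.
    ax = sns.barplot(data=rides_per_date(rides), x="weekday", y="rides",
                     order=WEEKDAYS)
    ax.set(xlabel="Day of the week", ylabel="Mean rides per day")
    return ax
```

With a single month of data, each weekday has only four or five dates behind it, so the intervals are expected to be fairly wide.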

Usage/distance relation

Scatter plot displaying the 10 ride routes with the most rides.
A ride route is defined as a ride starting from a station A and ending at a station B.

Below is a dataframe with the 10 most used ride routes.
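Such a ranking can be derived with a group-by on the two station columns. The sketch below assumes the columns are named `start_station_name` and `end_station_name` (the public Citi Bike layout); `top_routes` is a hypothetical helper name.

```python
import pandas as pd


def top_routes(rides, n=10):
    """Return the n most frequent (start station, end station) pairs."""
    return (
        rides.groupby(["start_station_name", "end_station_name"])
        .size()             # number of rides per directed route
        .nlargest(n)        # keep the n busiest routes, largest first
        .reset_index(name="rides")
    )
```

Routes are directional under this definition, so A to B and B to A are counted separately, and round trips that start and end at the same station count as a route of their own.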

Since the dataset only contains latitude and longitude coordinates, the distance of each route was calculated with the Google Maps Routes API. This returns the actual travel distance along the road network, which is more precise than a straight-line distance between the two stations.
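A request to the Routes API could look roughly like the sketch below, which uses only the standard library. It targets the `computeRoutes` REST method and asks for `routes.distanceMeters` through the field mask header. The `BICYCLE` travel mode is an assumption (the project does not state which mode was used), the helper names are hypothetical, and a valid API key is required for the call to succeed.

```python
import json
import urllib.request

ROUTES_URL = "https://routes.googleapis.com/directions/v2:computeRoutes"


def build_route_request(start, end, travel_mode="BICYCLE"):
    """Build the JSON body for one route between two (lat, lng) pairs."""
    def waypoint(lat_lng):
        lat, lng = lat_lng
        return {"location": {"latLng": {"latitude": lat, "longitude": lng}}}

    return {
        "origin": waypoint(start),
        "destination": waypoint(end),
        "travelMode": travel_mode,  # assumed mode, not confirmed by the project
    }


def route_distance_m(start, end, api_key):
    """Return the road-network distance in meters, or None if no route is found."""
    request = urllib.request.Request(
        ROUTES_URL,
        data=json.dumps(build_route_request(start, end)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Goog-Api-Key": api_key,
            # Only request the field that is actually needed.
            "X-Goog-FieldMask": "routes.distanceMeters",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    routes = body.get("routes") or []
    # A zero-length route may omit the field, hence the default of 0.
    return routes[0].get("distanceMeters", 0) if routes else None
```

With only ten routes to resolve, one request per route is enough; the results can then be stored in a distance column next to the ride counts.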

Once the distance values are retrieved, data can be plotted on a scatterplot to display the 10 most popular routes.
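The plotting step might be sketched as follows, assuming a dataframe with one row per route, a `rides` column with the ride count, and a `distance_m` column with the retrieved distance in meters. Both column names and the function name are hypothetical.

```python
import seaborn as sns


def plot_usage_vs_distance(routes):
    """Scatter plot of ride count against route distance, one point per route."""
    ax = sns.scatterplot(data=routes, x="distance_m", y="rides")
    ax.set(xlabel="Route distance (m)", ylabel="Number of rides")
    return ax
```

In a notebook the returned axes render automatically; in a script, call `matplotlib.pyplot.show()` afterwards.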

Scaled to a full year and to all ride routes, this type of visualization can help us understand the overall usage habits of the bike service's users. This data would also be valuable for managing the existing stations and for identifying the best areas in which to create new ones.

Contributors

florianld
