View the entire NYC Citibikes Story on Tableau Public here for a more expanded explanation of the findings.
- Data: from the CitiBike System Data page, the selected set is 201908-citibike-tripdata.csv.zip
- Tableau Public 2022.2
- Jupyter Notebook 6.4.8
- Pandas library - trip duration data from the 201908-citibike-tripdata.csv file was converted to a datetime datatype (here) before uploading to Tableau.
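The conversion step mentioned in the last bullet could look roughly like the sketch below. It assumes the raw file stores trip duration as whole seconds in a column named `tripduration`; the tiny in-memory frame is a hypothetical stand-in for the real CSV, and this is not necessarily the exact code used in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for 201908-citibike-tripdata.csv.
# Assumption: the raw column is named "tripduration" and holds seconds.
trips = pd.DataFrame({"tripduration": [393, 627, 1132, 3723]})

# Treat the seconds as an offset from the Unix epoch so the column
# becomes a datetime type that Tableau can read as hours/minutes/seconds.
trips["tripduration"] = pd.to_datetime(trips["tripduration"], unit="s")

# Show only the time-of-day part, which is the actual trip length.
formatted = trips["tripduration"].dt.strftime("%H:%M:%S").tolist()
print(formatted)
```

With the real data, the frame would come from `pd.read_csv` on the unzipped file and be written back out with `to_csv` before uploading to Tableau.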
Visualizations using the NYC CitiBikes dataset were prepared for an investor meeting to demonstrate proof of concept for a similar bike-sharing venture in Des Moines, Iowa. The selected dataset covers August 2019, which was determined to be an optimal month for bike rentals.
Total ridership for August: 2,344,224 total rides
Breakdown of subscribers vs. short-term customers: The primary client type is male subscribers age 25-35
Daily peak times for trips: Usage patterns differ between weekdays and weekends.
Weekdays: Usage spikes at 8 am and 5 pm, the typical commuter periods
Weekends: There is a more gradual midday build between 10 am and 7 pm, with a steady peak from noon to 5 pm
Trip duration for all riders and genders: The majority of trips last less than an hour, with a significant spike at 5-6 minutes.
Number of trips taken by the hour for each day of the week for all riders:
This heatmap can be used to approximate the number of bikes needed to cover peak usages.
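The counts behind an hour-by-weekday heatmap can also be reproduced in pandas. The sketch below is illustrative only: it assumes a `starttime` column with trip start timestamps and uses a few made-up rows rather than the real data.

```python
import pandas as pd

# Made-up sample rows; "starttime" is an assumed column name.
trips = pd.DataFrame({"starttime": pd.to_datetime([
    "2019-08-01 08:05:00",  # Thursday
    "2019-08-01 08:40:00",  # Thursday
    "2019-08-01 17:15:00",  # Thursday
    "2019-08-03 13:30:00",  # Saturday
])})

# Derive the two heatmap axes from the start timestamp.
trips["weekday"] = trips["starttime"].dt.day_name()
trips["hour"] = trips["starttime"].dt.hour

# Rows are hours of the day, columns are weekdays, cells are trip counts.
heatmap = pd.crosstab(trips["hour"], trips["weekday"])
print(heatmap)
```

The largest cell in the resulting table corresponds to the busiest hour of the week.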
With breakout for gender:
Days of the week on which a user is most likely to rent a bike:
Individual bike usage (as measured by total hours in service and total number of trips)
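Per-bike totals of this kind can be computed with a simple group-by. The sketch below assumes columns named `bikeid` and `tripduration` (in seconds) and uses invented values, so treat it as an illustration rather than the analysis behind the Tableau view.

```python
import pandas as pd

# Invented sample; "bikeid" and "tripduration" (seconds) are assumed names.
trips = pd.DataFrame({
    "bikeid": [101, 101, 202, 101, 202],
    "tripduration": [600, 1200, 3600, 1800, 7200],
})

# One row per bike: number of trips and total seconds ridden.
usage = trips.groupby("bikeid").agg(
    trip_count=("tripduration", "size"),
    total_seconds=("tripduration", "sum"),
)

# Convert to hours and list the most heavily used bikes first.
usage["hours_in_service"] = usage["total_seconds"] / 3600
usage = usage.sort_values("hours_in_service", ascending=False)
print(usage)
```

Bikes at the top of the sorted table are the first candidates for a maintenance check.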
The client base is primarily male subscribers, most likely making commuter trips during the week (8 am and 5 pm) with a typical commute time of 5-6 minutes. The non-subscriber customer base is most active in a more spread-out pattern across midday.
The area of highest activity is in lower Manhattan and around Central Park, with fewer trips in Brooklyn and Queens.
Individual bikes requiring maintenance can be identified either by their total hours in service or number of trips made.
Given that the most frequent ride duration is 5 minutes and the majority of rides are completed in under an hour, we can use the heatmaps to estimate that about 45,000 bikes are needed to accommodate peak ridership.
There are some datapoints which may be skewing certain analyses. Birth year has many points outside the expected age range of a user. There is also a concentration of datapoints at age 50 and unknown gender. These are likely due to false or omitted demographic inputs by the user.
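One way to screen out such records before analysis is sketched below. The column names `birth year` and `gender`, the code `0` for unknown gender, and the 16-90 age window are all assumptions made for illustration; the sample rows are invented.

```python
import pandas as pd

# Invented sample; column names and the gender code 0 = unknown are assumptions.
trips = pd.DataFrame({
    "birth year": [1990, 1969, 1885, 1969, 2001],
    "gender": [1, 0, 2, 1, 2],
})

# Age at the time of the August 2019 data.
trips["age"] = 2019 - trips["birth year"]

# Hypothetical plausibility window for a rider's age.
plausible_age = trips["age"].between(16, 90)

# Age 50 combined with unknown gender looks like an unfilled default profile.
likely_default = (trips["age"] == 50) & (trips["gender"] == 0)

clean = trips[plausible_age & ~likely_default]
print(clean)
```

Running the demographic charts on both the full and the filtered data would show how much these records affect the results.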
View the story on Tableau to see additional visualizations showing geographic utilization and checkout durations by age (and more!)