Git Product home page Git Product logo

nyc-taxi-price-prediction's Introduction

nyc-taxi-price-prediction

image

  1. Introduction

    In this project centered around New York City's significant traffic challenges and the popularity of taxis as a mode of transport, we aimed to predict taxi trip durations using the "NYC taxi trip duration dataset". With NYC's enormous population of approximately 18 million people, traffic congestion is a pressing issue, making accurate trip duration predictions essential for travelers to plan their journeys effectively. My project encompasses insightful visualizations illustrating the relationship between trip duration and various variables. Additionally, I explored how mean passenger count, total trips, distance, and duration fluctuate across different weekdays. To enhance the predictive power of the models, we employed extensive hyperparameter tuning techniques alongside machine learning models, including linear regression and decision tree. My project's journey begins with a thorough dive into data preprocessing, paving the way for enhanced insights and predictions.

  2. Data Preprocessing

    During the data preparation phase, I conducted a series of crucial operations to refine the dataset. The distance is calculated between pickup and dropoff locations by computing the geographical distance based on latitude and longitude coordinates. Additionally, I transformed the pickup and dropoff date-time information into a datetime format. To further enhance the dataset's suitability for machine learning, I binned the time data into six distinct bins, allowing for a more granular analysis of temporal patterns. Additionally, several preprocessing steps were meticulously applied to the data, ensuring its readiness for utilization in my machine learning models. These preparatory measures laid the foundation for more accurate and insightful predictions.

  3. Exploratory Data Analysis

    The exploratory data analysis unveiled compelling insights within the dataset. It became evident that the majority of cab rides involved one or two passengers, indicating a prevalent usage pattern. Surprisingly, Fridays saw more cab usage than Mondays, hinting at shifting travel behaviors toward the end of the week. Furthermore, the analysis revealed that evening pickups and nighttime drop-offs were most frequent, with relatively lighter traffic in the early morning hours. The histogram emphasized the preference for evening and nighttime rides. Vendor id: 2 attracted a larger customer base compared to Vendor id: 1, suggesting a preference among passengers. Trip durations peaked on Thursdays and Fridays, particularly during the afternoon and evening hours, likely due to increased traffic. These findings offer valuable insights into travel patterns and customer preferences, providing a solid foundation for informed decision-making and predictive modeling in the project.

  4. Model Building

    In the model-building phase, I conducted a series of essential steps to construct accurate predictive models. Firstly, I applied standard scaling to the dataset, ensuring that features were appropriately scaled for modeling. Subsequently, I partitioned the dataset into training and testing subsets, facilitating robust model evaluation. The approach involved employing both linear regression and decision tree algorithms. I meticulously fine-tuned model parameters using hyperparameter tuning techniques, optimizing their performance. These steps showcase the dedication to rigorous model development and data-driven decision-making in this project.

  5. Model Evaluation

    In the model evaluation phase, I employed a comprehensive approach to fine-tune the decision tree algorithm. Specifically, I conducted hyperparameter tuning by meticulously optimizing the tree's depth, setting it to 11. This meticulous tuning process aimed to enhance the model's performance and predictive accuracy. As a result of the efforts, I successfully trained the model and achieved a commendable accuracy of 62% on the test dataset. These results underscore my commitment to precision and excellence in model development and evaluation within this project.

nyc-taxi-price-prediction's People

Contributors

akanksha1195 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.