Flight Price Prediction

Transportation by ship and airplane brings many benefits. For example, it allows trade goods to be shipped from one country to another, which supports national economies, and it encourages the growth of the tourism sector. Airfare is often an important factor for travellers on a low budget.

Objective & Research Questions 🤔

  • To analyse the flight booking dataset obtained from the “Ease My Trip” website. Possible research questions include:
    • Does price vary with airline?
    • How is the price affected when tickets are bought just 1 or 2 days before departure?
    • Does the ticket price change with the departure time and arrival time?
    • How does the price change with the source and destination?
    • How does the ticket price vary between Economy and Business class?
  • To build a regression model that can predict the airfare price.
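The research questions above mostly reduce to grouped comparisons of price. The following is a minimal sketch of that idea using pandas, not the project's actual analysis code. The rows are made-up toy values, and the column names (`airline`, `class`, `days_left`, `price`) and the `booking_window` helper column are assumptions chosen to mirror the feature list.

```python
import pandas as pd

# Toy rows with invented values; column names are assumed, not taken from the repo.
flights = pd.DataFrame(
    {
        "airline": ["Vistara", "Vistara", "Indigo", "Indigo", "Air_India", "Air_India"],
        "class": ["Business", "Economy", "Economy", "Economy", "Business", "Economy"],
        "days_left": [1, 20, 2, 30, 15, 1],
        "price": [60000, 6000, 9000, 3000, 45000, 11000],
    }
)

# Does price vary with airline? Compare summary statistics per airline.
price_by_airline = flights.groupby("airline")["price"].agg(["min", "mean", "max"])

# How does price differ between Economy and Business class?
mean_price_by_class = flights.groupby("class")["price"].mean()

# How is price affected when booking 1 or 2 days before departure?
# Hypothetical helper column that splits bookings into two windows.
flights["booking_window"] = flights["days_left"].apply(
    lambda d: "last_minute" if d <= 2 else "earlier"
)
# Group by class as well, so Business fares do not distort the comparison.
mean_price_by_window = flights.groupby(["class", "booking_window"])["price"].mean()

print(price_by_airline)
print(mean_price_by_class)
print(mean_price_by_window)
```

On the real dataset the same `groupby` calls would apply once the CSV is loaded, provided the column names match.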

Source of Dataset 📅

The dataset was provided by Shubham Bathwal on Kaggle. It contains flight booking options from the Easemytrip website for travel between India's top 6 metro cities. The cleaned dataset has 300,153 data points and 10 features, plus the target variable (Price):

  1. Airline: The name of the airline company is stored in the airline column. It is a categorical feature having 6 different airlines.
  2. Flight: Flight stores information regarding the plane's flight code. It is a categorical feature.
  3. Source City: City from which the flight takes off. It is a categorical feature having 6 unique cities.
  4. Departure Time: This is a derived categorical feature created by grouping time periods into bins. It stores information about the departure time and has 6 unique time labels.
  5. Stops: A categorical feature with 3 distinct values that stores the number of stops between the source and destination cities.
  6. Arrival Time: This is a derived categorical feature created by grouping time intervals into bins. It has six distinct time labels and keeps information about the arrival time.
  7. Destination City: City where the flight will land. It is a categorical feature having 6 unique cities.
  8. Class: A categorical feature that contains information on seat class; it has two distinct values: Business and Economy.
  9. Duration: A continuous feature that displays the overall amount of time it takes to travel between cities in hours.
  10. Days Left: This is a derived feature calculated by subtracting the booking date from the trip date.
  11. Price: The target variable, which stores the ticket price.
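The two kinds of derived features in the list (Days Left, and the binned departure and arrival times) can be illustrated with a short sketch. This is not the dataset author's code: the function names are hypothetical, and the bin edges and the "Morning" and "Afternoon" labels are assumptions, since the source only says that there are six time labels.

```python
from datetime import date, time


def days_left(booking_date: date, trip_date: date) -> int:
    """Days between booking and departure: trip date minus booking date."""
    return (trip_date - booking_date).days


# Assumed bin edges (start hour, inclusive) and labels; the source does not state them.
TIME_BINS = [
    (0, "Late Night"),
    (4, "Early Morning"),
    (8, "Morning"),
    (12, "Afternoon"),
    (16, "Evening"),
    (20, "Night"),
]


def time_label(t: time) -> str:
    """Return the label of the last bin whose start hour is <= the given hour."""
    label = TIME_BINS[0][1]
    for start_hour, name in TIME_BINS:
        if t.hour >= start_hour:
            label = name
    return label


print(days_left(date(2022, 2, 10), date(2022, 2, 11)))
print(time_label(time(18, 30)))
```

Binning turns a continuous clock time into a small categorical feature, which is how the Departure Time and Arrival Time columns end up with six distinct values each.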

Result 🔎

  • Only Vistara and Air India offer business class flights, and the maximum business class price is higher for Vistara than for Air India.
  • Comparing economy fares across airlines, AirAsia, GO First, and Indigo share the same lowest price, $1,105.00.
  • The maximum duration for a non-stop flight is only 3.58 hours. That flight was AI-773 from Air_India, which flew from Kolkata in the evening and arrived in Mumbai at night.
  • Business class flights with two or more stops and an early morning departure have the highest maximum price of all flight groups. No business class flight with two or more stops departs late at night.
  • Prices rise as the number of days left before departure decreases.
  • The final model chosen is a random forest regressor. The screencast below shows the Streamlit app running on a local PC.
streamlit-app-streamlit-google-chrome-2022-10-28-00-17-48_nKZIibsh.mp4
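For readers who want to see what fitting a random forest regressor to this kind of data can look like, here is a minimal sketch with scikit-learn. It is not the project's training code: the data is synthetic, the price formula is invented, and the column names and hyperparameters are assumptions. Categorical features are one-hot encoded, which is one common choice; the source does not say which encoding was used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the real dataset; values are invented for illustration.
X = pd.DataFrame(
    {
        "airline": rng.choice(["Vistara", "Air_India", "Indigo"], size=n),
        "class": rng.choice(["Economy", "Business"], size=n),
        "duration": rng.uniform(1.0, 12.0, size=n),
        "days_left": rng.integers(1, 50, size=n),
    }
)
# Invented price formula: business class costs more, late bookings cost more.
y = (
    3000
    + 40000 * (X["class"] == "Business")
    + 300 * X["duration"]
    - 50 * X["days_left"]
    + rng.normal(0, 500, size=n)
)

# One-hot encode the categorical columns and pass numeric columns through unchanged.
categorical = ["airline", "class"]
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",
)
model = Pipeline(
    [
        ("preprocess", preprocess),
        ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
    ]
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model.fit(X_train, y_train)
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"Test R^2 on synthetic data: {test_r2:.3f}")
```

Wrapping the encoder and the forest in one `Pipeline` means the same preprocessing is applied at prediction time, which is convenient when the model is served from an app.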

Recommendation 📥

  • Other ML algorithms could be tried, such as KNeighborsRegressor, a decision tree regressor, or an XGB regressor.
  • Ensemble methods such as a stacking regressor or a voting regressor could be used to check whether model performance can be improved further.
  • Randomized search could be used to tune the hyperparameters and to check whether it selects the same hyperparameter combination.
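Two of the recommendations above, stacking and randomized search, can be sketched with scikit-learn as follows. This is an illustration under assumptions, not a tested extension of the project: the data comes from `make_regression` as a placeholder for the encoded flight features, and the base learners, search ranges, and `n_iter` value are arbitrary choices.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic numeric data as a placeholder for the encoded flight features.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Stacking: cross-validated predictions of the base learners feed a final estimator.
stack = StackingRegressor(
    estimators=[
        ("knn", KNeighborsRegressor(n_neighbors=5)),
        ("tree", DecisionTreeRegressor(max_depth=6, random_state=0)),
        ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=RidgeCV(),
    cv=3,
)
stack.fit(X_train, y_train)
stack_r2 = stack.score(X_test, y_test)

# Randomized search: sample a fixed number of hyperparameter combinations
# from the given distributions instead of trying every combination.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 100),
        "max_depth": randint(3, 12),
        "min_samples_leaf": randint(1, 5),
    },
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)
best_params = search.best_params_
search_r2 = search.score(X_test, y_test)

print(f"Stacking test R^2: {stack_r2:.3f}")
print(f"Randomized search best params: {best_params}")
print(f"Tuned forest test R^2: {search_r2:.3f}")
```

Comparing `best_params` against the combination found by another tuning method is one way to check whether both searches agree.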

Contributors

  • jadanpl
