- Can you predict which flights will be cancelled or delayed?
- Can you predict the delay time?
- Can you explore how different airlines compare?
This dataset makes all of these possible. Perfect for a school project, research project or resume builder.
This dataset contains all flight information, including cancellations and delays by airline, dating back to January 2018.
For your convenience, you can use the Combined_Flights_XXXX.csv or Combined_Flights_XXXX.parquet files to access the combined data for an entire year. These files also omit columns that are mostly null in the original dataset.
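As a minimal sketch of loading one of the combined files, the snippet below assumes that XXXX stands for a four-digit year, that the files sit in a local directory, and that pandas (plus pyarrow or fastparquet for Parquet) is installed. The helper names `combined_flights_path` and `load_combined_flights` are hypothetical and not part of the dataset.

```python
from pathlib import Path


def combined_flights_path(year: int, data_dir: str = ".", fmt: str = "parquet") -> Path:
    """Build the path to a yearly combined file, e.g. Combined_Flights_2018.parquet.

    Assumes XXXX in the file name is the four-digit year.
    """
    if fmt not in ("csv", "parquet"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return Path(data_dir) / f"Combined_Flights_{year}.{fmt}"


def load_combined_flights(year: int, data_dir: str = ".", fmt: str = "parquet"):
    """Load one year of combined flight data into a pandas DataFrame."""
    # Imported here so the path helper above still works without pandas installed.
    import pandas as pd

    path = combined_flights_path(year, data_dir, fmt)
    if fmt == "parquet":
        # Reading Parquet needs an engine such as pyarrow or fastparquet.
        return pd.read_parquet(path)
    return pd.read_csv(path)
```

For example, `load_combined_flights(2018, data_dir="data")` would read data/Combined_Flights_2018.parquet if that file exists. The Parquet file is usually the faster choice when both formats are available.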
The raw data, including all columns, is split by month in the files named Flights_XXXX_X.csv.
The data contained in the compressed file has been extracted from the Marketing Carrier On-Time Performance (Beginning January 2018) data table of the "On-Time" database from the TranStats data library. The time period is indicated in the name of the compressed file; for example, XXX_XXXXX_2001_1 contains data for the first month of the year 2001.
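Since the time period is encoded at the end of each file name, it can be recovered programmatically. The sketch below assumes every name ends in `_<year>_<month>` with an optional extension, as in the XXX_XXXXX_2001_1 example; the function name `period_from_name` is hypothetical.

```python
import re

# Matches a trailing "_<4-digit year>_<1-2 digit month>" with an optional extension.
_PERIOD_SUFFIX = re.compile(r"_(\d{4})_(\d{1,2})(?:\.\w+)?$")


def period_from_name(name: str) -> tuple[int, int]:
    """Return (year, month) parsed from a file name such as Flights_2018_1.csv."""
    match = _PERIOD_SUFFIX.search(name)
    if match is None:
        raise ValueError(f"no year/month suffix found in {name!r}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {name!r}")
    return year, month
```

This makes it straightforward to sort the monthly files chronologically or to select a date range before loading anything.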
Insert the Hypothesis
- Installed Kaggle CLI tool
- Used kaggle-api Docs