Airline Flights

The United States Department of Transportation publishes flight statistics through the Bureau of Transportation Statistics:

https://transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time

The fields we use are specified in detail further below, but here is a sample record:

"FL_DATE" "OP_CARRIER" "TAIL_NUM" "OP_CARRIER_FL_NUM" "ORIGIN" "DEST" "CRS_DEP_TIME" "DEP_TIME" "DEP_DELAY" "TAXI_OUT" "TAXI_IN" "CRS_ARR_TIME" "ARR_TIME" "ARR_DELAY" "CANCELLED" "CANCELLATION_CODE" "DIVERTED" "CRS_ELAPSED_TIME" "ACTUAL_ELAPSED_TIME" "AIR_TIME" "FLIGHTS" "DISTANCE" "CARRIER_DELAY" "WEATHER_DELAY" "NAS_DELAY" "SECURITY_DELAY" "LATE_AIRCRAFT_DELAY"
2017-01-01 "AA" "N787AA" "1" "JFK" "LAX" "0800" "0831" 31.00 25.00 26.00 "1142" "1209" 27.00 0.00 "" 0.00 402.00 398.00 347.00 1.00 2475.00 27.00 0.00 0.00 0.00 0.00
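To make the record layout concrete, here is a minimal Go sketch that pairs each column name with its value. It assumes the space-separated layout exactly as displayed in the sample above; a downloaded file may use a comma instead, which is why the delimiter is a parameter. The function name `parseRecord` is hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// sampleHeader and sampleRow are copied from the sample record above.
const sampleHeader = `"FL_DATE" "OP_CARRIER" "TAIL_NUM" "OP_CARRIER_FL_NUM" "ORIGIN" "DEST" "CRS_DEP_TIME" "DEP_TIME" "DEP_DELAY" "TAXI_OUT" "TAXI_IN" "CRS_ARR_TIME" "ARR_TIME" "ARR_DELAY" "CANCELLED" "CANCELLATION_CODE" "DIVERTED" "CRS_ELAPSED_TIME" "ACTUAL_ELAPSED_TIME" "AIR_TIME" "FLIGHTS" "DISTANCE" "CARRIER_DELAY" "WEATHER_DELAY" "NAS_DELAY" "SECURITY_DELAY" "LATE_AIRCRAFT_DELAY"`
const sampleRow = `2017-01-01 "AA" "N787AA" "1" "JFK" "LAX" "0800" "0831" 31.00 25.00 26.00 "1142" "1209" 27.00 0.00 "" 0.00 402.00 398.00 347.00 1.00 2475.00 27.00 0.00 0.00 0.00 0.00`

// parseRecord pairs each header name with the value in the same column.
// delim is an assumption: ' ' matches the sample as displayed, while a
// downloaded file may need ','.
func parseRecord(header, row string, delim rune) (map[string]string, error) {
	r := csv.NewReader(strings.NewReader(header + "\n" + row + "\n"))
	r.Comma = delim
	// ReadAll returns an error if the two rows differ in field count.
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) != 2 {
		return nil, fmt.Errorf("expected 2 rows, got %d", len(rows))
	}
	rec := make(map[string]string, len(rows[0]))
	for i, name := range rows[0] {
		rec[name] = rows[1][i]
	}
	return rec, nil
}

func main() {
	rec, err := parseRecord(sampleHeader, sampleRow, ' ')
	if err != nil {
		panic(err)
	}
	fmt.Println(rec["ORIGIN"], rec["DEST"], rec["DEP_DELAY"])
}
```

All values stay as strings here; converting delays to numbers or dates to timestamps is left to the ingest step.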

Objective:

  1. Ask questions that can be answered by the data.
  2. Answer the questions in Kibana.
  3. Find surprises in the data.

For example:

  • What's the busiest airport?
  • How many flights were there in 2017?
  • What was the most popular holiday to fly?
  • What aircraft made the most flights?
  • What airport has the most delays (or the least)?

Write down other questions you have so we can answer them with Elastic.
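A question such as "What's the busiest airport?" can be phrased as a terms aggregation, which counts documents per distinct value of a field. The sketch below only builds the search body in Go. It assumes the documents are indexed with the column names from the sample record and that `ORIGIN` is mapped as a keyword field; the function name and the aggregation name `busiest_origin` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// busiestOriginQuery builds a search body that counts documents per value of
// field and asks for the top n buckets. "size": 0 at the top level suppresses
// the individual hits so only the aggregation is returned.
func busiestOriginQuery(field string, n int) ([]byte, error) {
	body := map[string]interface{}{
		"size": 0,
		"aggs": map[string]interface{}{
			"busiest_origin": map[string]interface{}{
				"terms": map[string]interface{}{"field": field, "size": n},
			},
		},
	}
	return json.Marshal(body)
}

func main() {
	b, err := busiestOriginQuery("ORIGIN", 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Counting by `ORIGIN` alone measures departures only; whether "busiest" should also include arrivals is one of the decisions to make when refining the question.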

Ingest

Getting this data into Elastic can be accomplished using:

  • Logstash
  • Beats
  • A programming language

At its core, data is ingested via the Document APIs, a set of RESTful APIs that all of the methods above use. Use whichever tool (or language) you are most comfortable with. Logstash and Beats provide a configuration-driven approach to ingesting data, while a programming language gives you more flexibility at the cost of verbosity. Each approach has tradeoffs, but the choice is yours.

Though Go is not part of the official Elasticsearch Clients supported by Elastic, there is a popular Elastic Go library that wraps the REST APIs. We will be using that library to ingest data.
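To show what the Document APIs expect underneath any client, here is a sketch that builds a Bulk API request body using only the Go standard library, not the client library mentioned above. The bulk format is newline-delimited JSON: an action line followed by the document itself. The index name `flights` and the function name `bulkBody` are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// bulkBody renders docs as a bulk request body: for every document, one
// "index" action line and then the document, each terminated by a newline.
func bulkBody(index string, docs []map[string]interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends a newline after each value
	for _, doc := range docs {
		action := map[string]interface{}{
			"index": map[string]string{"_index": index},
		}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	docs := []map[string]interface{}{
		{"FL_DATE": "2017-01-01", "ORIGIN": "JFK", "DEST": "LAX", "DEP_DELAY": 31.0},
	}
	body, err := bulkBody("flights", docs)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

The resulting body would be sent with an HTTP POST to the cluster's `_bulk` endpoint with the content type `application/x-ndjson`. Sending is omitted here so the sketch runs without a cluster.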

Data Sources

To get the data used for this exercise, select these data fields from the download form linked to above:

  1. FlightDate
  2. IATA_CODE_Reporting_Airline
  3. Tail_Number
  4. Flight_Number_Reporting_Airline
  5. Origin
  6. Dest
  7. CRSDepTime (CRS Departure Time (local time: hhmm))
  8. DepTime (Actual Departure Time (local time: hhmm))
  9. DepDelay (Difference in minutes between scheduled and actual departure time. Early departures show negative numbers.)
  10. TaxiOut (Taxi Out Time, in Minutes)
  11. TaxiIn (Taxi In Time, in Minutes)
  12. CRSArrTime (CRS Arrival Time (local time: hhmm))
  13. ArrTime (Actual Arrival Time (local time: hhmm))
  14. ArrDelay (Difference in minutes between scheduled and actual arrival time. Early arrivals show negative numbers.)
  15. Cancelled (Cancelled Flight Indicator, 1=Yes, 0=No)
  16. CancellationCode (Specifies the reason for cancellation: A = Carrier, B = Weather, C = National Air System, D = Security)
  17. Diverted (Diverted Flight Indicator, 1=Yes, 0=No)
  18. CRSElapsedTime (CRS Elapsed Time of Flight, in Minutes)
  19. ActualElapsedTime (Elapsed Time of Flight, in Minutes)
  20. AirTime (Flight Time, in Minutes)
  21. Flights (Number of Flights)
  22. Distance (Distance between airports (miles))
  23. CarrierDelay (Carrier Delay, in Minutes)
  24. WeatherDelay (Weather Delay, in Minutes)
  25. NASDelay (National Air System Delay, in Minutes)
  26. SecurityDelay (Security Delay, in Minutes)
  27. LateAircraftDelay (Late Aircraft Delay, in Minutes)

Then select each month & year you want data for and click download. Unzip the file and rename it to "YEAR-MONTH.csv" (e.g., 2017-02.csv). Repeat this until you have all the months you want data for.
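Since the download is repeated once per month, it can help to check which files are still missing. The following Go sketch lists the "YEAR-MONTH.csv" names that are absent for a given year; the function name `missingMonths` is hypothetical, and the list of existing file names is passed in so the example runs without touching the disk.

```go
package main

import "fmt"

// missingMonths returns the "YEAR-MONTH.csv" names for year that do not
// appear in have, in calendar order.
func missingMonths(year int, have []string) []string {
	present := make(map[string]bool, len(have))
	for _, name := range have {
		present[name] = true
	}
	var missing []string
	for month := 1; month <= 12; month++ {
		name := fmt.Sprintf("%04d-%02d.csv", year, month)
		if !present[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	have := []string{"2017-01.csv", "2017-02.csv"}
	fmt.Println(missingMonths(2017, have))
}
```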

Download the airport data to get each airport's latitude and longitude:

https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat

  • Rename that file to airports.csv.
  • Open vim to search and replace \" with nothing using: :%s/\\"//g
  • Add the following lines to the top of that file:

0,"Williston Basin International Airport","Williston","United States","XWA","KXWA",48.2608639,-103.7511389,2353,-6,"N","America/Chicago","airport","OurAirports"
0,"Kearney Regional Airport","Kearney","United States","EAR","KEAR",35.156111,-114.559444,2131,-6,"N","America/Chicago","airport","OurAirports"
0,"Laughlin/Bullhead International Airport","Bullhead City","United States","IFP","KIFP",35.156111,-114.559444,707,-7,"N","America/Phoenix","airport","OurAirports"
0,"Stillwater Regional Airport","Stillwater","United States","SWO","KSWO",36.161111,-97.085556,1295,-6,"A","America/Chicago","airport","OurAirports"
0,"Concord Regional Airport","Concord","United States","USA","KJQF",35.387778,-80.709167,705,-5,"A","America/New_York","airport","OurAirports"
0,"Branson Airport","Branson","United States","BKG","KBBG",36.531944,-93.200556,1302,-6,"A","America/Chicago","airport","OurAirports"
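The same cleanup and lookup can be sketched in Go. The code below removes the `\"` sequences (mirroring the vim substitution), parses the rows as CSV, and builds a map from IATA code to coordinates. The column positions (name at index 1, IATA code at 4, latitude at 6, longitude at 7) are read off the rows shown above; the type and function names are hypothetical, and treating `\N` as "no code" is an assumption about how the file marks missing values.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// Airport holds the fields needed to place an airport on a map.
type Airport struct {
	Name string
	Lat  float64
	Lon  float64
}

// parseAirports strips \" sequences, then indexes airports by IATA code.
// Rows that are too short or have no IATA code are skipped.
func parseAirports(data string) (map[string]Airport, error) {
	cleaned := strings.ReplaceAll(data, `\"`, "")
	r := csv.NewReader(strings.NewReader(cleaned))
	r.FieldsPerRecord = -1 // tolerate rows of differing length
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	airports := make(map[string]Airport)
	for _, row := range rows {
		if len(row) < 8 || row[4] == "" || row[4] == `\N` {
			continue
		}
		lat, err := strconv.ParseFloat(row[6], 64)
		if err != nil {
			return nil, fmt.Errorf("airport %s: bad latitude %q: %v", row[4], row[6], err)
		}
		lon, err := strconv.ParseFloat(row[7], 64)
		if err != nil {
			return nil, fmt.Errorf("airport %s: bad longitude %q: %v", row[4], row[7], err)
		}
		airports[row[4]] = Airport{Name: row[1], Lat: lat, Lon: lon}
	}
	return airports, nil
}

func main() {
	// One of the rows listed above.
	data := `0,"Williston Basin International Airport","Williston","United States","XWA","KXWA",48.2608639,-103.7511389,2353,-6,"N","America/Chicago","airport","OurAirports"` + "\n"
	airports, err := parseAirports(data)
	if err != nil {
		panic(err)
	}
	a := airports["XWA"]
	fmt.Println(a.Name, a.Lat, a.Lon)
}
```

When two rows share an IATA code, the later row overwrites the earlier one in this sketch.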

Download the airline data to get each airline's full name:

https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat

Rename that file to airlines.csv.
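A carrier code such as "AA" can then be resolved to a full name. The Go sketch below assumes the airline file's columns are ID, name, alias, IATA code, ICAO code, callsign, country, and an active flag ("Y" or "N"), in that order; verify this against the downloaded file before relying on it. The function name is hypothetical and the rows in `main` are made-up examples, not real airlines.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// airlineNames maps IATA carrier codes to airline names. When several rows
// share a code, a row flagged active ("Y") wins over inactive ones.
func airlineNames(data string) (map[string]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.FieldsPerRecord = -1 // tolerate rows of differing length
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	names := make(map[string]string)
	active := make(map[string]bool)
	for _, row := range rows {
		if len(row) < 8 {
			continue
		}
		name, code, isActive := row[1], row[3], row[7] == "Y"
		if code == "" || code == `\N` {
			continue
		}
		// Keep the existing entry unless this row is active and it is not.
		if _, seen := names[code]; seen && (active[code] || !isActive) {
			continue
		}
		names[code] = name
		active[code] = isActive
	}
	return names, nil
}

func main() {
	// Hypothetical rows in the assumed column layout.
	data := `1,"Example Air (defunct)",\N,"ZZ","EXD","EXAMPLE","Nowhere","N"
2,"Example Air",\N,"ZZ","EXA","EXAMPLE","Nowhere","Y"
`
	names, err := airlineNames(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(names["ZZ"])
}
```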

Put all these data files in the directory ~/data/flights.
