pppsdavid / orie-4741-project Goto Github PK

Jupyter Notebook 99.48% Python 0.52%

orie-4741-project's Issues

Peer Review

The project is about predicting flight delays. The team takes into consideration different factors such as airport’s traffic volume, airline policies, weather conditions, etc., from two different datasets, US Weather Events and Airline reporting carrier on-time performance. The objective of the project is to build a model that will accurately predict which flights will be delayed in order to help airlines and travelers to prepare for the future.

Things I like:

I like how they are using two different datasets – one for the weather conditions and another for flight information.
I like that they have provided some background information to back up the importance of predicting flight delays (ex., the US economy).
Because there are many different features (some related to flight information, some to weather conditions), the team will be able to understand better which variables are most important for predicting flight delays.

Areas for Improvement:

How will you compare the two models? As I understood, the first model will be based on airline name, flight start, destination, etc., while the second model will consider weather conditions.
Will the model take into consideration cancelled flights or only delayed flights?
How do you plan on cleaning the data, dealing with missing values?

Project Feedback

The submitted proposal aims to aggregate and interpret publicly available data about flight statistics by airline and date and weather events by date to train a model to estimate delay times for a given flight. Ultimately, the numbers would be served to customers or corporations, to provide better estimates and peace of mind for travelers' schedules. This is a convincing problem, with potentially billions of dollars per year on the line--though it is not clear whether the model would result in actual recuperation of the loss so much as better anticipation of when it may occur.

The proposal contains a good idea of the consumer's model of an airline, and aims to mirror that information in a novel two-stage model: an initial estimate simply aggregating the airline's statistics per destination and travel period, and an updated estimate with more detailed weather information. While there is no clear dependence between the models except as a point of comparison, there is more rigor in the proposal stemming from the distinct models predicting the same results. Finally, there is a clear and valuable statement of intent in the project, which will be invaluable as time moves on.

The project is likely to encounter challenges distinguishing between the two models. It may turn out that the simpler model has sufficient data for a reasonable estimate—to within an hour, perhaps—where the more complex model overfits the more complex dataset. Furthermore, regression is unlikely to be useful, because flight delays are "technically" unbounded, though bounded by cancellations in practice. Speaking of cancellations, is the model likely to predict a cancelled flight differently from an indefinitely delayed flight? Will it provide error bounds, or simply a result? Will the first model feed into the second?

While training chained models is decidedly more complex than training two separate models, the aggregated information in the first model's output may prove more useful than training an entirely new model, and reduce overfitting.

final peer review

Summary:
The project is about investigating factors behind flight delay problems. Their objective is to develop a model to predict the expected delay time for U.S. air traffic from 2016 to 2020. They combine two main datasets: the Airline reporting carrier on-time performance dataset, which contains flight information, and the US Weather Events dataset, which contains weather warnings issued by weather stations at different airports. They chose random forests as the final model to predict whether a flight will be delayed.

Things I liked:
I like the formatting and outline of the report because it is easy to read and follow. And I like how you divide each part into little sections with a lot of detail.
I also like the feature engineering part and how you analyze different features and provide some detailed explanation of what each of them represents and its values.
And I really like how you try different methods to improve the F1 scores and include a table to compare the performance of different algorithms you used.
Suggestions:
It would be better if you include both training and validation accuracy and how the accuracy changes with different sets of hyperparameters.
You can try to remove some of the less important features. It might help improve the model.
You may also try LightGBM, a fast gradient boosting framework based on decision trees that is used for classification and many other machine learning tasks. Since the dataset is really large, XGBoost would take a long time to train, and LightGBM typically trains faster and more efficiently.

Peer Review

The project is about investigating factors behind flight delay problems. Their objective is to develop a model to predict the expected delay time for U.S. air traffic. And they are using two main datasets: Airline reporting carrier on-time performance dataset that contains flight information and US Weather Events dataset which contains weather warnings issued by weather stations in different airports.

Three things I like:
I like the problem formulation. You seem to have a good idea and a clear approach for tackling this problem, listing a few possible modeling approaches.
I like the significance of the project. Many people have experienced flight delays. Predicting which flights will be delayed benefits not only consumers but also airlines and air traffic controllers: consumers can plan their trips around the expected delay time, and airlines can optimize connecting flights and notify passengers if flights are likely to be delayed.
I also like how the data used is a combination of two different datasets, since many factors can cause flight delays. These two datasets would provide a sufficient amount of information to investigate delay trends and how weather may affect delay times.

Three things to improve upon:
There could be some missing values in the dataset. What do you plan to do to deal with them? It may be helpful to explain your ideas in the proposal.
Are you going to combine the two datasets you chose? And how are you going to match them? This could be something to consider in the data cleaning process.
And how are you going to deal with overfitting or underfitting?
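The matching and missing-value questions above can be made concrete with a small sketch. It assumes the two datasets can be keyed on airport code and calendar date; the field names, the sample records, and the policy of dropping flights with no recorded delay are all hypothetical and not taken from the proposal.

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions, not the project's schema.
flights = [
    {"flight": "AA10", "origin": "JFK", "date": "2019-01-05", "dep_delay": 42.0},
    {"flight": "DL22", "origin": "ATL", "date": "2019-01-05", "dep_delay": None},
    {"flight": "UA31", "origin": "ORD", "date": "2019-01-06", "dep_delay": 3.0},
]
weather = [
    {"airport": "JFK", "date": "2019-01-05", "event": "Snow", "severity": "Severe"},
    {"airport": "ORD", "date": "2019-01-07", "event": "Fog", "severity": "Light"},
]


def join_flights_weather(flights, weather):
    """Left-join flights to weather events on (airport, date)."""
    events = defaultdict(list)
    for w in weather:
        events[(w["airport"], w["date"])].append(w)

    joined = []
    for f in flights:
        if f["dep_delay"] is None:
            # One possible policy: drop rows where the label itself is missing.
            continue
        matches = events.get((f["origin"], f["date"]), [])
        row = dict(f)
        # No matching event is read as "no weather warning", not as missing data.
        row["weather_event"] = matches[0]["event"] if matches else "None"
        row["weather_severity"] = matches[0]["severity"] if matches else "None"
        joined.append(row)
    return joined


rows = join_flights_weather(flights, weather)
```

Keeping the join as a left join means every flight with a label survives, and the absence of a warning becomes its own category rather than a null to impute.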

Final Review

What's it about?
The FlightDelayPrediction project is about investigating the delays of flights and the factors that may or may not influence it.

What data are they using?
They are using a flight delay dataset and a weather dataset.

What's their objective?
Their objective is to classify a flight as one that will be delayed or not given a few factors (airline, weather, time of day, time of year, etc.)

Three things I like:
I reviewed this project in the past, so I will reiterate the things I liked before first. I like the nature of the problem -- it feels more unique than a lot of the other projects (mine included) and also seems like one that would create a lot of value for airlines, airports, travellers, travel agencies, etc. I like that this group merged two datasets because weather is certainly a very important factor when it comes to flight delays, and even though their dataset didn’t originally contain that information, this was a creative solution to capture a lot of additional informational value.
I also really liked the formatting of the report. I said previously that I liked the flow of the report, but this felt like a separate thing altogether. The report was very clean, had clear sections and subsections, and made for a very easy read.
Lastly, I liked the transparency of this report. I didn’t feel like there was anything remotely misleading. Flaws were acknowledged and well-countered which makes for a much more convincing read than other reports that shy away from providing certain insights or information. Furthermore, the discussion on the limitations of the model was very realistic, and it paired well with the future analyses mentioned in the conclusion.

Three things to improve upon:
I reviewed this project for the midterm report and had two concrete suggestions, and one that was more wishy-washy so that I could get to three. Now, however, it’s difficult for me to write because the group really took my suggestions to heart and clarified the points that I had raised in the past (so much so that I forgot what I initially wrote and had to look back on them). With that said, I’ll try to find some more areas of improvement (although after my initial read through there was nothing too glaring).
One thing that did confuse me was within the feature engineering section, the following was stated: “Then for the arrival airport, there are too many values which may possibly lead to overfitting.Thus, we decided to use Dest (arrival airport) as destination category.”. To me, this seems like at first, the arrival airport will be discarded, then it will be used in the form of Dest, so perhaps some further clarification could help.
Next, also within your feature engineering, I saw that you decided to convert time of day into four categories. I understand why you wouldn't want to use a one-hot, but perhaps a longer discussion would help on why you were discouraged from using time as a continuous variable, or why you chose four categories instead of, say, 24.
Finally, I appreciate the transparency throughout your report, but then when looking at the feature importances of your random forest model, there seemed to be a bit of a difference in the scales of your features and their importances. Why is Year so high? Why are certain days of the week multiple times bigger than others? Perhaps something like a lasso regression could make these coefficients smaller and more interpretable. Or investigate the trend of cancellations per year within the report so that we can understand why something as large as the year plays such a role.
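To illustrate the time-of-day question, here is a minimal sketch comparing a four-category bucketing with a cyclical encoding of the hour. The bucket boundaries are assumed for illustration and are not the ones used in the report; the sine/cosine encoding is offered as one alternative the authors could discuss, not as their method.

```python
import math


def time_bucket(hour):
    """Four coarse categories; the cut points here are assumed, not the report's."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 21:
        return "evening"
    return "night"


def cyclical_hour(hour):
    """Map an hour in [0, 24) to a point on the unit circle."""
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)


def circle_distance(h1, h2):
    """Euclidean distance between two hours after cyclical encoding."""
    (s1, c1), (s2, c2) = cyclical_hour(h1), cyclical_hour(h2)
    return math.hypot(s1 - s2, c1 - c2)
```

With the raw hour as a continuous variable, 23:00 and 00:00 look 23 units apart; under the cyclical encoding they are as close as any other pair of adjacent hours, which is one argument a longer discussion could weigh against the four buckets.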

Final Review

I think that this is a really interesting topic to try to predict given its benefit for both the airlines and the passengers. I enjoyed the analysis that was done on the delay rates across both different airlines and different airports. It is clear that a lot of effort has been put in to try to understand the data: how it’s distributed, how much noise there is, etc. Additionally, the feature engineering techniques made a lot of sense and were thoroughly justified. I think some extra commentary could have been added about which features you expected to be the most important in predicting whether the flight will be delayed. I also think that the decision of using an F1 score is an important one given the imbalance of labels: I feel like this is an easy thing to overlook at times.
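The point about F1 under label imbalance can be seen in a tiny example. This is a sketch with made-up labels (20% delayed), and it computes the plain binary F1 for the delayed class rather than whatever weighted variant the report uses.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means delayed."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Made-up labels: 2 delayed flights out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial model that always predicts "on time".
always_on_time = [0] * 10

acc = accuracy(y_true, always_on_time)
f1 = f1_score(y_true, always_on_time)
```

The trivial model reaches 0.8 accuracy while its F1 for the delayed class is 0.0, which is why accuracy alone would be easy to misread on this kind of data.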
I do share similar concerns in the feature importance rank of the year feature. I was surprised by how much larger its importance was relative to everything else. I would be curious to see how some of these models would perform if this feature was removed, as well as what the feature importance ranking would look like after this change. I also think some visualizations might be good on how flight delay rates have changed over time.
The results of the models seem quite promising given the difficulty of the task. I also thought that the discussion of fairness and whether the model produces a WMD was thorough and sensible. I can see why this was posed as a classification problem, but I’d be curious to see how it would do if instead you predicted the amount of time the flight would be delayed by, or maybe another classification problem about whether a flight would get cancelled. Overall, I think this is really good work and that the report provides a lot of interesting discussion.

Peer Review 3

This project is about predicting whether a given flight will be delayed. The data being used come from both a Kaggle dataset and an IBM airline performance dataset. The objective is to build a model to predict flight delays in order to help customers and airlines over time.

Things I like

I like that significant background information into the importance of the prediction of flight delays was provided, so that the reader is made aware of the relevant concerns addressed by the project
It was nice to see that multiple different datasets were being combined and used in different contexts but for the same purpose -- using one dataset for conditions related to flight information and another for the takeoff conditions seems like a good idea
I like how a plan of action was effectively provided for the use of the results of the analysis (that ATC would be able to plan ahead for flights that tended to be delayed)

Areas for improvement

It would be nice to have some examples of commonly-delayed flights so that the objective becomes more understandable and relatable; maybe flights around Christmas break tend to be delayed more? Vacation flight delays are something the reader is likely to understand and be able to commiserate with
This is kind of nitpicky, but have flight delays really caused that much trouble for the economy? Yes, $31-40 billion is a lot of money, but in what way have these "troubles" been caused? What does missing a flight really impact? (This is another thing that will make the reader/employer more easily understand the need for new research here.)
You say that "...In the first dataset, we can investigate delay trends that differ by airport/ dates/ time in day/ airline, and in the second dataset, we expect to see how severe weather warnings at start and destination airports affect delay times." Is there a specific way that you feel you can combine these trends into a single figure (i.e., the expected delay time that you claim to be predicting for a given flight)?

Final Report Peer Review (kxc4)

This project is about predicting which commercial airline flights will experience delayed departures. It uses data from airlines on flights combined with the US Weather Events dataset which contains information on weather warnings. The goal is to predict which flights will be delayed ahead of time in order to save billions of dollars annually.

I like the description of findings from the preliminary data analysis, as well as your descriptions for features you added to the dataset and the feature encodings you chose. In the data analysis section, I think it would have been helpful to have a legend for the airline keys used in the middle graph on page 3. I like your brief description of XG Boost, as well as your decision to use it as it tends to be a very well performing model for a variety of supervised learning tasks. I like your choice of using equality of opportunity as the fairness measure to be considered and your explanation for why you chose it makes sense in the context of this problem.
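For readers unfamiliar with equality of opportunity, a minimal sketch follows. It compares true positive rates across groups; using the airline as the group, and the labels and predictions shown, are assumptions made only for illustration and are not taken from the report.

```python
def true_positive_rate(y_true, y_pred):
    """Fraction of truly delayed flights (label 1) that were predicted delayed."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    if not positives:
        return None  # undefined when a group has no delayed flights
    return sum(p for _, p in positives) / len(positives)


def tpr_by_group(y_true, y_pred, groups):
    """Compute the true positive rate separately for each group."""
    split = {}
    for t, p, g in zip(y_true, y_pred, groups):
        labels, preds = split.setdefault(g, ([], []))
        labels.append(t)
        preds.append(p)
    return {g: true_positive_rate(ts, ps) for g, (ts, ps) in split.items()}


# Hypothetical data: two airlines, "A" and "B".
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "B", "B", "B"]

rates = tpr_by_group(y_true, y_pred, groups)
gap = max(rates.values()) - min(rates.values())
```

Equality of opportunity asks for this gap to be close to zero, meaning delayed flights are caught at a similar rate regardless of group.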

One thing not mentioned in discussing the weather data was when the data was recorded. This may be an important consideration since weather patterns change rapidly, and if the goal is for the model to predict whether a flight will be delayed a certain amount of time before it is scheduled to depart, it would be important to only include weather predictions made that far in advance of each flight’s scheduled departure. I think it would have been beneficial to include more reasoning for why you chose to use weighted F-1 score as opposed to another measure like weighted accuracy in the context of this problem, as well as why you chose to use a weighted measure, beyond the fact that the dataset is unbalanced. It may have been interesting to consider the costs of false positives and false negatives in this section. Applying a model like Control Burn may have yielded interesting results in analyzing feature importance since, with so many features in your dataset, it is likely that many are correlated in some way which could affect their perceived ‘importance’ in your random forest model.
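The concern about when weather data was recorded amounts to a lead-time filter. A minimal sketch, assuming each weather record carries an `issued_at` timestamp (a hypothetical field name) and that the model is meant to predict a fixed number of hours before scheduled departure:

```python
from datetime import datetime, timedelta


def usable_weather(records, scheduled_departure, lead_time):
    """Keep only weather records issued at least `lead_time` before departure."""
    cutoff = scheduled_departure - lead_time
    return [r for r in records if r["issued_at"] <= cutoff]


# Hypothetical example: predict 6 hours ahead of an 18:00 departure.
departure = datetime(2019, 1, 5, 18, 0)
records = [
    {"event": "Snow", "issued_at": datetime(2019, 1, 5, 8, 0)},
    {"event": "Fog", "issued_at": datetime(2019, 1, 5, 15, 0)},
]

kept = usable_weather(records, departure, timedelta(hours=6))
```

Here only the 08:00 snow warning survives, because the 15:00 fog warning would not have been available at prediction time.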

Overall I thought this was a very well done project. Your models, analysis, and visualizations made it easy to understand and interesting to read. This would definitely be an interesting model to deploy to help airlines foresee flight delays and adjust for them accordingly. Great job!

Peer Review

This project aims to take various different factors into consideration (airport traffic volume, differences between airline policies, weather conditions, etc) to predict flight delays when traveling. If implemented and accurate in its predictions, such a model could save the tens of billions of dollars that are lost each year due to delays, and make the traveling experience much easier so flyers could have a much clearer understanding of how long their waits will take.

Things I Like:
I like the idea of building two different models to take information from the flight information and from the weather information (and other information from outside the airport and the plane).
The clear objective. Knowing exactly what they want to find makes the implementation of the model easier to make.
Having so many different variables can make it easier to understand which variables have the highest effect on the resulting predictions, or are highly correlated with each other, etc.

Areas for Improvement:
How are you going to properly match the data from the two different models? Understanding how you plan to do this was not very clear (though I understand that you might not have a concrete plan here yet, so this is just something to be careful of).
What kinds of errors in the predictions would still be considered accurate? Would being off by a margin of 15 minutes be considered “on time?” What about an hour? Defining a window for the take off times after the delay could be helpful here.
How do you plan to deal with null values in columns?
How will you counteract any overfitting or underfitting?
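The question about an acceptable error window can be pinned down with two small definitions. This is a sketch only: the 15-minute default is taken from the example in the question above, and both function names are hypothetical.

```python
def is_delayed(delay_minutes, threshold=15):
    """Label a flight as delayed once its delay reaches the threshold."""
    return delay_minutes >= threshold


def within_tolerance(predicted_minutes, actual_minutes, tolerance=15):
    """Count a predicted delay as accurate if it is off by at most `tolerance`."""
    return abs(predicted_minutes - actual_minutes) <= tolerance


def hit_rate(predicted, actual, tolerance=15):
    """Share of predictions that fall inside the tolerance window."""
    hits = sum(within_tolerance(p, a, tolerance) for p, a in zip(predicted, actual))
    return hits / len(actual)
```

Stating the threshold and tolerance as explicit parameters would let the team report results for a 15-minute window and an hour-long window side by side.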

Midterm Peer Review

Summary:
This project aims to predict whether a flight will be delayed by analyzing records from the top 10 largest airports in the United States between 2016 and 2020. Since passengers only get information about departure delays and a more accurate weather estimate a few hours before the flight, the group decided to treat these as short-term information and trained two sets of models: a short-term forecast model that includes this information and a long-term forecast model that does not.

Things I liked:
It is a great idea to train two sets of models as the goal of the project is to help passengers make better travel plans. The long-term model aims to warn passengers days in advance while the short-term model will give more accurate predictions.
You have clearly explained how you choose certain features and dealt with missing values and gave reasonable explanations to support them.
It is great that they compared train and test errors and gave concrete next steps on how they will solve the problem of underfitting

Suggestions:
I would love to see a heat map to explore the correlation between features and the label.
I would like to see the coefficients from your logistic regression model and some interpretations of them. It would be interesting to see how adding the short-term features changes the weight vector.
I would also explore how the number of flights changes every year to see if there exists any seasonality.
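The seasonality suggestion could start from simple counts. A minimal sketch, assuming flight dates are available as ISO `YYYY-MM-DD` strings; the sample dates are made up.

```python
from collections import Counter


def flights_per_year_month(dates):
    """Count flights for each (year, month), keyed as 'YYYY-MM'."""
    return Counter(d[:7] for d in dates)


def flights_per_calendar_month(dates):
    """Pool all years together and count flights by calendar month (1-12)."""
    return Counter(int(d[5:7]) for d in dates)


# Made-up sample of flight dates.
flight_dates = [
    "2016-01-03", "2016-01-17", "2016-07-04",
    "2017-01-09", "2017-07-21", "2017-07-22",
]

by_year_month = flights_per_year_month(flight_dates)
by_month = flights_per_calendar_month(flight_dates)
```

Plotting the first series shows year-over-year changes, while the second pools the years to expose any recurring monthly pattern.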

Midterm peer review

What's it about?
This project is about examining which factors are correlated with flight delays and understanding the reasons for those delays.

What data are they using?
They used two datasets that contain flight delay data and the corresponding weather data. The datasets cover 2016-2020.

What's their objective?
Their objective is to classify a flight as one that will be delayed or not given a few factors (airline, weather, time of day, time of year, etc.)

Three things I like:
First, this is an everyday situation that many people can relate to, so I think the topic itself is very interesting and makes the report easier to read. It relates to my daily life, since I used to travel a lot and flight delays were common, and I think this project can be made useful for airlines and airports as well as travel agencies.
I like that this group made the outline of the report really clear and structured it in a way that's easy to follow: describe the data, pre-examine the data to figure out if there are null values to be dealt with, then pre-process the data to fill in the blank columns / remove the rows.
I also like the further improvement part, especially that they mention not only what models or methods will be used but also why they are being used. This is a better way to convey how the project can be moved forward.

Three things to improve upon:
I think the first thing is to maybe put more visualizations in the sections other than the appendix; they are a tool that helps people reading your report better understand what is being said.
Secondly, the graphs have no numbers / no axes on them, so it's hard for readers to gauge them. Put on titles, be more specific about what those graphs represent, and show the percentages / numbers.
Third, maybe be clearer about exactly what you would do as future plans. There is a lot the group is planning to do, but I don't know if all of it is feasible as there are other aspects to work on as well. So I think maybe look at the workload and examine how you can prioritize the sequence of work. But in general a really great job!

Final Review

Summary:
This project aims to predict whether a flight will be delayed by analyzing records from the top 10 largest airports in the United States between 2016 and 2020. The group only uses information available prior to departure so that the model could easily be deployed to predict flight delays before departure. This is a typical supervised learning setup.

Things I liked:

The feature engineering part is great. The one-hot encodings and data cleaning procedures were well thought out and implemented.
It is great that you only used features available before departure. This makes the model deployable in real-world settings.
Your train-validation-test split is great. It definitely helps with model selection and avoids bias toward certain models.

Suggestions:

I am a bit concerned about the high feature importance of Year. While you did demonstrate that this does not affect overall model performance, why is Year the most significant feature for your model?
Your exploratory data analysis could include more graphics to show the correlation between different variables. That would make the project more intuitive and understandable.
It would be better if you could describe your hyperparameter tuning process in greater detail. For example, you could include how the accuracies change when you change a specific hyperparameter.

Midterm Peer Review

What's it about?
The FlightDelayPrediction project is about investigating the delays of flights and the factors that may or may not influence it.

What data are they using?
They are using a flight delay dataset and a weather dataset.

What's their objective?
Their objective is to classify a flight as one that will be delayed or not given a few factors (airline, weather, time of day, time of year, etc.)

Three things I like:

I like the nature of the problem -- it feels more unique than a lot of the other projects (mine included) and also seems like one that would create a lot of value for airlines, airports, travelers, travel agencies, etc.
I like that this group merged two datasets because weather is certainly a very important factor when it comes to flight delays, and even though their dataset didn’t originally contain that information, this was a creative solution to capture a lot of additional informational value.
Finally, I liked the flow of their report and its organization. Each section went in the order that I anticipated, and if I had a question about something they brought up, they usually immediately addressed it or referenced where it could be found. This was certainly a very readable and comprehensible report which is quite helpful.

Three things to improve upon:

My first thought for improvement is to explain the difference between Long Term and Short Term features. They did a brief explanation and that confused me a bit more because I think I may have labeled them as the opposite, but I think I am just confused so a bit more clarification on what your two models are would be helpful.
A second thought for improvement would be to either post this in a different format or modify your visualizations. I personally use GitHub on their dark theme, and since the graphs didn’t have a declared background, they blended in with the site and at first I didn’t see the titles, scales, labels, etc. so perhaps a new method of exporting either the report or graphs would solve that issue.
Lastly, there seem to be a lot of plans for future next steps. While I think all their steps seem appropriate, my report received feedback to make sure that our report stays focused even if we explore these other steps, so I would advise to either trim some of these steps, or be careful to link their reasoning and explanations together so that readers can follow one (or two) main streams of thought and don’t get inundated. P.S. your project was really solid so I added this for a third critique but it’s definitely more of a stretch.
