cornell_orie5741_project's Introduction

Cornell_ORIE5741_Project

Project name: How to find a well worth airbnb room in New York?

Project member: Boyuan Cui (bc594)/Sicheng Zhao (sz629)

Presentation Link: https://youtu.be/Atmz86UfuFQ

cornell_orie5741_project's People

Contributors

sichengzhao, bc36

cornell_orie5741_project's Issues

Midterm Report Peer Review

This project looks at Airbnb prices in Manhattan and seeks to find out which features have the most to do with overall price, then fits models to predict future prices. The midterm report is straightforward and clearly written, so it is easy to understand. I think the progression of the data analysis through the paper fits the project goals well.

To keep the report's end product from being too narrowly focused, one suggestion would be to check whether these features are just as important, and whether the final model still works, on data from another city, possibly another metropolitan area or a less urban environment. This might yield some interesting results, showing whether there are patterns in prices regardless of the city the model is trained on.

Final peer review

The project is about identifying which features can be used to accurately predict Airbnb prices in New York. The host of a new listing can in turn use this to come up with a price that is profitable for them and affordable for their guests. The group used a dataset with an ample number of data points for price and location, as well as the calendar dataset, which contains the calendar information of the Airbnb listings.
Things I like about the project:

  1. The group has analysed how price changes over calendar days (weekdays vs. weekends, and across months), which makes sense.
  2. The project report and the presentation are easy to read and understand, with a lot of visualisations (esp. feature importance, since they consider nonlinear models) and tables to aid understanding.
  3. They have also explored the sensitivity of the model. The model has been able to predict prices within an $11 range.
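The weekday versus weekend comparison praised in point 1 can be sketched roughly as follows. This is a minimal illustration in plain Python, not the group's actual code: the function name `weekday_weekend_means`, the choice of Friday and Saturday nights as the "weekend", and the sample prices are all assumptions made for the example.

```python
from datetime import date
from statistics import mean


def weekday_weekend_means(records):
    """Return (weekday_mean, weekend_mean) for an iterable of (date, price) pairs.

    "Weekend" here means Friday and Saturday nights. That split is an
    assumption of this sketch; the report may have used a different one.
    """
    weekday_prices, weekend_prices = [], []
    for day, price in records:
        # date.weekday(): Monday is 0, Friday is 4, Saturday is 5
        if day.weekday() in (4, 5):
            weekend_prices.append(price)
        else:
            weekday_prices.append(price)
    return mean(weekday_prices), mean(weekend_prices)


# Made-up calendar rows for a single listing (not taken from the project data)
sample = [
    (date(2021, 3, 1), 100.0),  # Monday
    (date(2021, 3, 2), 100.0),  # Tuesday
    (date(2021, 3, 5), 130.0),  # Friday
    (date(2021, 3, 6), 150.0),  # Saturday
]
weekday_avg, weekend_avg = weekday_weekend_means(sample)
print(f"weekday mean: {weekday_avg}, weekend mean: {weekend_avg}")
```

The same grouping idea extends to months by keying on `day.month` instead of `day.weekday()`.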

Areas of improvement:

  1. The location features should be given more importance, as the report itself mentions, because an Airbnb in Queens will be priced differently from one in Manhattan.
  2. They could also introduce a feature for how urban or rural a location is and see whether it affects their predictions. The model could then be generalised and run on different cities or areas as well.
  3. The presentation could have been given by both group members.

Overall, a very nice project.

Final Report Peer Review -- fw249

This project predicts the price of Airbnb rooms based on characteristics such as bedrooms and beds. Two datasets are used: the listing data, with 96 features describing the characteristics of each room, and the calendar dataset, with 7 features mainly containing price information. The conclusion answers one question: how can characteristics be used to predict the price of an Airbnb room?

There are several things I like about the report and project. Firstly, many diagrams are used to illustrate each step of the exploratory data analysis section. The visualization helps the reader follow the arguments made in the report. Secondly, the logic of choosing models is clearly explained, which helps the reader understand the reasons for choosing each model step by step. Thirdly, in every section, the processes and techniques are applied and explained for the two datasets separately. This makes the logic and reasoning behind each technique easier to understand in the context of each dataset's characteristics.

There are also some things to improve. Firstly, although the exploration of the data was well conducted and explained, the connection between those steps and the rest of the modeling process was not clearly indicated. Those steps could be good resources for feature selection and for the coefficients in modelling. Secondly, more numerical explanation of the correlations could be included; it is not clear how "irrelevant" was defined. Thirdly, a check of the linearity between the features and the dependent variable could be conducted to improve the feature selection.
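One way to make "irrelevant" numerically explicit, as the second point asks, is to report the Pearson correlation of each feature with price and state the cutoff used. The sketch below is only an illustration under assumptions: the helper names, the 0.1 threshold, and the toy columns are hypothetical and do not come from the report. Pearson correlation measures linear association only, so a low value does not rule out a nonlinear relationship, which is why the linearity check in the third point is a separate step.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def weakly_correlated(features, target, threshold=0.1):
    """Names of features whose |correlation| with the target is below threshold.

    The threshold is a hypothetical cutoff chosen for this example.
    """
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) < threshold
    ]


# Toy data (made up): price rises with bedrooms, coin_flip is unrelated
price = [100, 150, 200, 250]
bedrooms = [1, 2, 3, 4]
coin_flip = [1, -1, -1, 1]

weak = weakly_correlated({"bedrooms": bedrooms, "coin_flip": coin_flip}, price)
print(weak)
```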

Peer Review for Midterm Report

What's it about?

The project aims at developing regression models that can predict the prices of airbnb rooms in NYC based on the characteristics of the room. They have two datasets: listing data and calendar data. Their listing dataset has 29142 rows and 96 features. Their calendar dataset has 13469123 rows and 7 features. They want to use their models to help airbnb owners better price their rooms.

What do you like about the report?

First of all, the objective of the project is very clear and well-defined. Data cleaning was very well done. They explained why they did certain processing to the dataset. Also, the models they proposed to fit the data seem reasonable.

What concerns you?

Clearly, the prices of the listings vary with the time of year, but the report did not discuss how to take seasonality into account. The report also did not present preliminary models or show their performance on the training, validation and test sets.

Do you think you could use the results of this study?

Since this is a huge dataset and the scope of the project is limited to NYC, I think the models generated in this project can generalize well into reality.

What other aspects of the question do you think the group should consider?

I think overall, they have done a great job. They should probably try to incorporate time as a factor into their models.
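As one possible way to bring time into a regression model, the month of a calendar row can be encoded as a point on a circle so that December and January end up close together. This is a generic technique offered as a sketch, not something the report is known to use; the function name `month_to_cyclical` is hypothetical.

```python
from math import cos, dist, pi, sin


def month_to_cyclical(month):
    """Map a month number (1..12) to a (sin, cos) pair on the unit circle."""
    angle = 2 * pi * (month - 1) / 12
    return sin(angle), cos(angle)


january = month_to_cyclical(1)
july = month_to_cyclical(7)
december = month_to_cyclical(12)

# December is adjacent to January on the circle; July is on the opposite side
print(dist(december, january) < dist(july, january))
```

The two resulting columns can be added to the feature matrix alongside the room characteristics, which lets even a linear model pick up a smooth yearly pattern.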

final report peer review

The project aims to help Airbnb users and the hotel system set the price of a room based on its characteristics by predicting the price, and it focuses on New York Airbnb data. Two datasets are included: listing data about the rooms and the calendar dataset about prices.

The paper is well structured and puts a lot of effort into data exploration, with ample and effective data visualization that provides a clear and informative overview of the dataset and the business problem. I also like that the paper does not just discuss each model separately; it has a dedicated section for model comparison, which makes the differences explicit for readers. Besides, I like that they continued to make improvements after obtaining good models, through the feature correlation check and the feature influence check.

As for shortcomings, firstly, it might be better to include outlier detection in the data exploration stage rather than only checking for missing values. Both datasets have a very large number of rows, and the modeling might be more precise after outlier detection. Besides, before the first modeling step, it might be better to select features based on their correlations in order to prevent high collinearity.

Overall, this is a great project!
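The outlier detection suggested in this review could start from something as simple as the interquartile range (IQR) rule. The sketch below is an assumption-laden illustration: the function name `iqr_outliers`, the conventional 1.5 multiplier, and the sample prices are not taken from the report.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional default for the IQR rule; it is a choice
    made for this sketch, not a value from the report.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up nightly prices with one implausible entry
prices = [80, 90, 100, 110, 120, 130, 5000]
outliers = iqr_outliers(prices)
print(outliers)
```

Flagged rows would still need a manual look before removal, since a very expensive listing can be genuine rather than a data error.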

Final Peer Review

This group is doing a project on finding a convenient way for a new Airbnb host to predict the price of his or her listing. The team analyzes the features of an Airbnb room to decide on its price. They built a model to predict the price of an Airbnb room based on bedrooms, location and house type.

This is an interesting topic and the team introduced the dataset clearly. The report is also presented in a clear way and is easy to read. The team tried both regression and classification methods and built several models, such as linear regression and decision trees. They also stated their conclusion that "Entire room/apt" has the greatest positive effect on the prediction. What could be improved is that the team could explain the decision tree in more detail and give a better explanation of the model comparison. The project looks great overall.

Peer Review

This project aims to assist potential short-term renters by identifying Airbnb listings that are priced correctly for their quality or amenities. The team has identified three data sets, two of which have data for individual listings, which will presumably be used for making price predictions, and one with summary statistics, which they plan to use for visualizations. The project will help consumers by finding relations between prices and features, so that consumers will know which accommodations to rent, using linear and nonlinear models.
I like that there is a clear purpose and conclusions could be very useful in such a variable market. I also like that the group found multiple datasets that may help them pursue their goals, and are also thinking about data visualizations. Additionally, I like how the project is specific to New York City, as it is a major tourist destination, yet still focused enough that geographical data won’t also need to be considered.
Areas for improvement:
(1) I would like to see more specifics about the types of methods that will be used in the project, such as time series techniques if the calendar data is included. (2) Additionally, some consideration should be given to what widespread usage of the results might entail: would the market become static as prices are set at historical predicted averages? (3) Finally, some of the wording could be worked on to make a little more sense, and the second-to-last sentence could be removed, as it is a generally accepted fact in data science but doesn’t hold much credibility without a model.

Proposal Peer Review

This team proposes using data on Airbnb listings in New York City, both detailed information per listing and calendar information on when the listings are occupied, to estimate the most efficient price for a listing. The use case of this model would be either for the host of a listing to price their listing, or for renters to find the most effective listing given their budget. This model would be constrained to New York City listings only.

Things I like

  1. This project seems to be feasible. You would have the tools to produce your desired result as stated in the proposal.
  2. It is a good idea to limit the scope of the project to only one city or area. It constrains the problem and prevents a lot of issues that would arise from applying this model to multiple cities or countries with different housing markets and economies.
  3. This project seems to be very useful. Everyone cares about getting the best value for their money when traveling, so there would be a lot of demand for this model.

Areas of improvement:

  1. You mention in the introduction that the only way for Airbnb hosts to price their listings is by looking at comparable listings and pricing accordingly. Yet isn't this what your model would be doing? You are only looking at Airbnb listing prices for a given area and pricing according to similar qualities, just in a more comprehensive way. What if the prices of hotel rooms decrease and become more competitive: does this model capture this? What about inflation, or changes in legislation?
  2. This model assumes that there is only one efficient price for a listing, but fails to capture the host's objective. Would a new host who is eager to build their reputation price their listing the same as a host who is more experienced and has an established reputation? Does a host who lives in their apartment part time want to price their apartment the same way as if they weren't living there? I imagine they would be willing to price a bit higher, risking the loss of potential customers in order to get a more tame crowd.
  3. The proposal switches between the perspectives of the renter and the host very often, and they each have very different goals. The renter doesn't want an efficiently priced room; they actually want an inefficiently priced room in their favor. I mentioned the goals of the host in the previous point. I would be very clear about which side you want to represent in this project. Also, "well worth" doesn't work in this context and is vague.

Midterm Report Review

The objective of this project is to figure out a way to assess the fairness of the pricing of Airbnb apartments in NYC. The team has obtained appropriate data and made reasonable transformations. The report is beautifully written, and it has some awesome visualizations.
One concern I have is that the group is working with many features and, as they state in their future plan, some features are going to be more valuable to them than others. I have already done operations like PCA in the Airbnb homework. Considering that, it may be harder for this group to come up with something different that might even work better.

Final report peer review

This project aims to help Airbnb users and the hotel system find the best price for a room by predicting the price. Two datasets are included, one for the rooms and one for the prices.

Things that I like:
The format is decent and very clear to read.
The graphs are vivid and helpful.
Each algorithm is developed very clearly, and the results are sound.

Things that may need improvement:
An overall graph or table comparing the algorithms would make the results easier for the reader to understand.
Feature selection could be discussed before the algorithms are applied.
