Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized way. As part of the Airbnb Inside initiative, this dataset describes the listing activity of homestays in Boston, MA.
I chose this dataset to understand which areas of Boston are better for tourists and what makes a highly rated Airbnb host. Using Python (numpy, pandas, sklearn, and seaborn), I produced results for these questions and published them in a Medium article, "This data will make you rethink how you Airbnb in Boston".
The following Airbnb activity is included in this Boston dataset:
- Listings, including full descriptions and average review score
- Reviews, including unique id for each reviewer and detailed comments
- Calendar, including listing id and the price and availability for that day
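As a rough illustration of how the calendar data could be used, the sketch below estimates how busy each month is from the per-day availability. The column names `date` and `available`, and the `t`/`f` encoding of availability, are assumptions about the file layout rather than something stated here, and the toy rows are made up for the example.

```python
import pandas as pd


def monthly_occupancy(calendar, date_col="date", available_col="available"):
    """Share of listing-days marked unavailable, per calendar month.

    Assumes `available_col` holds "t"/"f" strings (an assumption about
    the calendar file, not confirmed by this README).
    """
    dates = pd.to_datetime(calendar[date_col])
    unavailable = calendar[available_col].astype(str).str.lower().eq("f")
    # Group the boolean flags by month number (1-12) and average them.
    return unavailable.groupby(dates.dt.month).mean().rename("share_unavailable")


# Made-up rows, for illustration only.
toy_calendar = pd.DataFrame({
    "listing_id": [1, 1, 2, 2],
    "date": ["2017-01-05", "2017-07-04", "2017-01-05", "2017-07-04"],
    "available": ["t", "f", "f", "f"],
})
occupancy = monthly_occupancy(toy_calendar)
print(occupancy)
```

An unavailable day is only a proxy for a booking, since hosts can also block dates themselves, so the result is best read as an upper bound on occupancy.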
Business understanding focuses on the project objectives and requirements from a business perspective, then converts this knowledge into a data mining problem definition and a preliminary plan.
I chose the Boston Airbnb Dataset to understand:
Which areas of Boston are the most expensive and the busiest to stay in?
- Are certain neighborhoods more expensive than others, and by how much?
- Are certain neighborhoods busier at different times of the year?
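One possible way to approach the price question is to compare median nightly prices per neighborhood. This is a minimal sketch, not the analysis from the article: the column names `neighbourhood_cleansed` and `price`, and the `$1,100.00`-style price strings, are assumptions about the listings file, and the rows are invented.

```python
import pandas as pd


def parse_price(series):
    """Convert price strings such as "$1,100.00" to floats (NaN if unparseable)."""
    cleaned = series.astype(str).str.replace(r"[$,]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


def median_price_by_neighborhood(
    listings, area_col="neighbourhood_cleansed", price_col="price"
):
    """Median price per neighborhood, plus its ratio to the cheapest one."""
    summary = (
        listings.assign(price_num=parse_price(listings[price_col]))
        .groupby(area_col)["price_num"]
        .median()
        .sort_values(ascending=False)
        .to_frame("median_price")
    )
    # "By how much": express each median as a multiple of the lowest median.
    summary["vs_cheapest"] = summary["median_price"] / summary["median_price"].min()
    return summary


# Made-up rows, for illustration only.
toy_listings = pd.DataFrame({
    "neighbourhood_cleansed": ["Back Bay", "Back Bay", "Dorchester", "Dorchester"],
    "price": ["$300.00", "$1,100.00", "$80.00", "$120.00"],
})
price_summary = median_price_by_neighborhood(toy_listings)
print(price_summary)
```

The median is used here instead of the mean so that a few very expensive listings do not dominate a neighborhood's figure.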
What makes a great host?
- How do comments vary across rating quartiles?
- What rating features make a super host?
- What features correlate with higher ratings?
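For the last question, a simple starting point is the Pearson correlation between each numeric listing feature and the overall rating. The sketch below assumes a rating column named `review_scores_rating`; that name, the other column names, and the toy values are illustrative assumptions only.

```python
import pandas as pd


def rating_correlations(listings, target="review_scores_rating"):
    """Pearson correlation of every numeric column with the rating column."""
    numeric = listings.select_dtypes(include="number")
    return (
        numeric.drop(columns=[target])
        .corrwith(numeric[target])
        .sort_values(ascending=False)
    )


# Made-up rows, for illustration only.
toy_ratings = pd.DataFrame({
    "review_scores_rating": [80, 90, 95, 100],
    "review_scores_cleanliness": [7, 8, 9, 10],
    "host_listings_count": [12, 5, 3, 1],
})
correlations = rating_correlations(toy_ratings)
print(correlations)
```

A correlation like this only shows which features move together with ratings; it does not show that changing a feature would change a host's rating.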
This dataset is part of Airbnb Inside, and the original source can be found here.