Airbnb-New-User-Booking

Summary

In this project, we predict the booking destination of a newly onboarded user on the Airbnb platform. We evaluate model performance with NDCG (normalized discounted cumulative gain), a metric that rewards both the relevance of the predicted destinations and how highly the correct one is ranked. With such predictions, Airbnb could potentially:

  • Provide personalized experience
  • Better forecast demand
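
To make the NDCG metric mentioned in the summary concrete, here is a minimal sketch for the case where each user has exactly one true destination and the model returns a ranked list of countries. The cutoff `k=5`, the function names and the example country codes are illustrative assumptions, not taken from the project code.

```python
import math

def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the first k ranked items."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(predicted, actual, k=5):
    """NDCG for one user; assumes `predicted` holds distinct countries."""
    relevances = [1 if country == actual else 0 for country in predicted]
    ideal = dcg_at_k([1], k)  # best case: the true country is ranked first
    return dcg_at_k(relevances, k) / ideal

def mean_ndcg(all_predicted, all_actual, k=5):
    """Average NDCG over all users."""
    scores = [ndcg_at_k(p, a, k) for p, a in zip(all_predicted, all_actual)]
    return sum(scores) / len(scores)

# Hypothetical ranked predictions for two users
preds = [["US", "FR", "NDF"], ["NDF", "US", "IT"]]
truth = ["FR", "NDF"]
print(round(mean_ndcg(preds, truth), 4))
```

A correct country in first position scores 1, in second position 1/log2(3), and a country missing from the top k scores 0, so the metric rewards placing the right destination as high as possible.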

Dataset

The problem statement and dataset are derived from Kaggle's Airbnb New User Bookings challenge. The predictor variables include information about the users (user_id, age, gender, etc.) and preliminary session data (actions, action types, session time, etc.). Our objective is to predict the country (the dependent variable) that the user is most likely to visit. Note that only a limited set of users have associated session data; we therefore merge the datasets with an inner join and proceed to modelling with about 5.5 million session observations for close to 73k users.
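
The inner join described above can be sketched with pandas as follows. The tiny frames and the column names other than `user_id` (`country_destination`, `action`, `secs_elapsed`) are assumptions made for illustration; the project's actual files and columns may differ.

```python
import pandas as pd

# Hypothetical user table: one row per user
users = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "age": [34, 27, 45],
    "gender": ["FEMALE", "MALE", "FEMALE"],
    "country_destination": ["US", "NDF", "FR"],
})

# Hypothetical session log: many rows per user, and not every user appears
sessions = pd.DataFrame({
    "user_id": ["u1", "u1", "u3", "u4"],
    "action": ["search", "view", "search", "view"],
    "secs_elapsed": [120.0, 45.0, 300.0, 10.0],
})

# An inner join keeps only users present in both tables
merged = users.merge(sessions, on="user_id", how="inner")
print(len(merged), "session rows for", merged["user_id"].nunique(), "users")
```

Users without sessions (`u2`) and sessions without a user record (`u4`) are dropped, which is why the modelling set is smaller than the full user table.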

Methodology

To produce the results, we perform data preparation, data preprocessing, feature engineering, model building, evaluation and hyperparameter tuning. Some EDA was done to understand the dataset, which can be found here.

Two challenges with this dataset are the class imbalance and the limited information available.
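
One common way to handle an imbalanced target is to weight each class inversely to its frequency. The sketch below shows that heuristic only as an illustration; the README does not say which technique the project used, and the label counts are made up.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical, heavily skewed destination labels
labels = ["NDF"] * 6 + ["US"] * 3 + ["FR"] * 1
weights = balanced_class_weights(labels)
for cls, w in sorted(weights.items()):
    print(cls, round(w, 3))
```

Rare classes receive larger weights, so misclassifying them costs more during training.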

Peek into feature engineering and selection that improved the results to a great extent

Features that indicated higher likelihood of even making a booking in the first place

Features that indicated lower likelihood of making a booking

Modelling

The models we try out are as follows:

Multinomial regression - uses the softmax function with L2 regularization to classify a target variable with more than two categories, extending the binary case handled by logistic regression.
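
A minimal sketch of a softmax (multinomial) classifier with L2 regularization, using scikit-learn on synthetic binary features. The data, the value `C=1.0` and the top-5 cutoff are assumptions for illustration, not the project's settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary features and a three-class target (illustrative only)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 6)).astype(float)
y = np.where(X[:, 0] == 1, "US", np.where(X[:, 1] == 1, "FR", "NDF"))

# The default penalty is L2; C is the inverse regularization strength.
# With the default lbfgs solver, a multiclass target is fit with a softmax loss.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(X, y)

# Rank the classes by predicted probability, keeping up to five per user
proba = clf.predict_proba(X[:3])
ranked = clf.classes_[np.argsort(-proba, axis=1)[:, :5]]
print(ranked)
```

Ranking by `predict_proba` rather than calling `predict` yields an ordered list of destinations, which is the form an NDCG-style metric needs.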

Bernoulli Naive Bayes - well suited to discrete data with binary features, which was the case after we completed feature engineering.
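
As an illustration of why binary features suit this model, the sketch below fits scikit-learn's `BernoulliNB` on a toy 0/1 matrix. The features and labels are invented; `alpha=1.0` is the library's default Laplace smoothing.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Toy binary feature matrix (e.g. one-hot flags after feature engineering)
X = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
])
y = np.array(["US", "US", "NDF", "NDF", "US", "NDF"])

model = BernoulliNB(alpha=1.0)  # alpha is the additive smoothing term
model.fit(X, y)

# A new user with only the first flag set
pred = model.predict(np.array([[1, 0, 0]]))
print(pred[0])
```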

Decision Trees - highly predictive due to their capability of mapping non-linear relationships well. Results are also easily interpretable within the business context.
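
The sketch below illustrates both points on a made-up example: a target that depends non-linearly on a single `age` feature (only the middle age band books), and the printed rules that make the fitted tree easy to read. The data and `max_depth=2` are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical pattern: only users aged 25 to 39 book a trip to the US
ages = np.array([18, 20, 22, 27, 30, 35, 38, 45, 50, 60]).reshape(-1, 1)
labels = np.array(["NDF", "NDF", "NDF", "US", "US", "US", "US",
                   "NDF", "NDF", "NDF"])

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(ages, labels)

# Human-readable split rules, useful when explaining the model to stakeholders
rules = export_text(tree, feature_names=["age"])
print(rules)
```

A linear model on raw `age` cannot separate the middle band from both ends, whereas two threshold splits are enough for the tree.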

XGBoost - allows us to leverage its regularization (both L1 and L2), sparsity awareness (robust learning from missing values) and built-in cross-validation.
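
For reference, the L1 and L2 regularization mentioned above enters XGBoost's training objective as a penalty on each tree. In the notation below, $T$ is the number of leaves of a tree $f$ and $w_j$ are its leaf weights; $\lambda$, $\alpha$ and $\gamma$ correspond to the library's `reg_lambda`, `reg_alpha` and `gamma` parameters. The values used in this project are not stated in the README.

```latex
\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
```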

Overall, XGBoost gave the best performance, with an NDCG score of 88.323.

For further improvements

  • Airbnb can consider data on detailed user demographics, as well as richer session data (e.g., session time and data, search queries, etc.)
  • Work with relevant stakeholders to further refine feature selection.
  • We can consider novelty as a metric for recommending new travel destinations to users.
