
beta_distribution_adprediction's Introduction

Beta Distribution for Ad Prediction -- a typical Thompson sampling bandit solution

Overall introduction

This program solves a multi-armed bandit problem using Thompson sampling.

The multi-armed bandit problem is one of the simplest settings in reinforcement learning. Suppose a gambler faces a row of slot machines (bandits) in a casino. Each of the K machines has a probability θk, unknown to the player, of paying out a reward. The player has to decide which machines to play, how many times to play each one, and in which order, so as to maximize the long-term cumulative reward. At each round, the player receives a binary reward drawn from a Bernoulli experiment with parameter θk, so each bandit behaves like a random variable Yk∼Bernoulli(θk). This version of the multi-armed bandit is also called the Binomial bandit.
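To make the setting concrete, here is a minimal simulation sketch of such a bandit. The three probabilities in `thetas` are made-up illustrative values, and `pull` is a hypothetical helper name, not part of this repository.

```python
import random

def pull(theta_k, rng):
    """Play one machine once: return 1 with probability theta_k, else 0."""
    return 1 if rng.random() < theta_k else 0

# Hypothetical reward probabilities for K = 3 machines (illustrative only).
thetas = [0.2, 0.5, 0.7]
rng = random.Random(0)

# Play every machine 1000 times and estimate each theta_k by its empirical mean.
n_rounds = 1000
estimates = [sum(pull(t, rng) for _ in range(n_rounds)) / n_rounds for t in thetas]
print(estimates)
```

Playing every machine equally often, as done here, wastes many rounds on poor machines; a bandit algorithm tries to reach good estimates while spending most rounds on the best one.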

In ad prediction, a bandit is a typical module for choosing the ad with the highest probability of being clicked.

A Beta distribution, implemented in Python, is used to maintain the parameter sets for each customer group. Given a customer group id, we can then recommend the ad that is most likely to be clicked by users in that group.
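A rough sketch of what such a per-group lookup might look like is shown below. The `params` layout, the segment and ad ids, and the `recommend` function are all assumptions made for illustration; the repository's actual data structures may differ.

```python
import random

# Hypothetical parameter table: segment id -> {ad id: [alpha, beta]}.
# Beta(1, 1) is the uniform prior; alpha grows with clicks, beta with non-clicks.
params = {
    "seg_a": {"ad_1": [1, 1], "ad_2": [30, 70], "ad_3": [5, 95]},
    "seg_b": {"ad_1": [80, 20], "ad_2": [10, 90]},
}

def recommend(segment_id, params, rng=random):
    """Sample a click probability for each ad of the segment; return the best ad id."""
    ads = params[segment_id]
    samples = {ad: rng.betavariate(a, b) for ad, (a, b) in ads.items()}
    return max(samples, key=samples.get)

print(recommend("seg_b", params))
```

Because the choice is based on a random draw rather than on the posterior mean, an ad with little data (such as `ad_1` in `seg_a`) still gets shown occasionally, which is how the method keeps exploring.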

Thompson multi-armed bandit algorithm explanation

A Thompson sampling tutorial from Stanford University summarizes the setting as follows. There are K actions whose mean rewards $\theta = \left(\theta_1, \dots, \theta_K\right)$ are unknown. When action $x_1$ is applied, a reward $r_1 \in \{0, 1\}$ is generated with success probability $$\mathbb P\left(r_1 = 1 \mid x_1, \theta\right) = \theta_{x_1}$$ After observing $r_1$, the system updates its parameters accordingly.
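The reason a Beta distribution fits this update can be sketched in one step, assuming each $\theta_k$ has an independent $\mathrm{Beta}(\alpha_k, \beta_k)$ prior (the standard Beta-Bernoulli setup; the prior is an assumption here, not something the text above states). For the chosen action $k = x_t$ and observed reward $r_t \in \{0, 1\}$, Bayes' rule gives

$$p\left(\theta_k \mid r_t\right) \propto \underbrace{\theta_k^{r_t}\left(1 - \theta_k\right)^{1 - r_t}}_{\text{Bernoulli likelihood}} \cdot \underbrace{\theta_k^{\alpha_k - 1}\left(1 - \theta_k\right)^{\beta_k - 1}}_{\text{Beta prior}} = \theta_k^{\alpha_k + r_t - 1}\left(1 - \theta_k\right)^{\beta_k + (1 - r_t) - 1}$$

which is again a Beta density, namely $\mathrm{Beta}\left(\alpha_k + r_t, \beta_k + 1 - r_t\right)$. A success therefore adds one to $\alpha_k$ and a failure adds one to $\beta_k$, while the parameters of actions that were not played stay unchanged.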

Algorithm

for t = 1, 2, $\dots$ do

# sample model:
for k = 1, $\dots$ , K do

sample $\hat \theta_k \sim \mathrm{Beta}\left(\alpha_k, \beta_k\right)$

end for

# select and apply action:
$x_t \leftarrow \arg\max_k \hat \theta_k$
apply $x_t$ and observe $r_t$

# update distribution:
$(\alpha_{x_t}, \beta_{x_t})\leftarrow(\alpha_{x_t} + r_t, \beta_{x_t} + 1 - r_t)$

end for
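The pseudocode above can be sketched in Python roughly as follows. This is a simulation under stated assumptions, not the repository's implementation: the function name `thompson_sampling`, the uniform $\mathrm{Beta}(1, 1)$ starting parameters, and the simulated rewards drawn from `true_thetas` are all illustrative choices.

```python
import random

def thompson_sampling(true_thetas, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling against simulated arms."""
    rng = random.Random(seed)
    k = len(true_thetas)
    alpha = [1.0] * k  # assumed Beta(1, 1) starting point for every arm
    beta = [1.0] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        # sample model: one draw per arm from its current Beta distribution
        theta_hat = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        # select and apply action: play the arm with the largest sample
        x_t = max(range(k), key=lambda i: theta_hat[i])
        r_t = 1 if rng.random() < true_thetas[x_t] else 0  # simulated reward
        # update distribution of the played arm only
        alpha[x_t] += r_t
        beta[x_t] += 1 - r_t
        pulls[x_t] += 1
    return alpha, beta, pulls

alpha, beta, pulls = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
print(pulls)
```

In a run like this, most pulls should concentrate on the arm with the highest true probability as its Beta distribution sharpens, while the weaker arms are tried only occasionally.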

Input data explanation

In the data folder, two sample CSV files are provided.

  1. Segments.csv contains
  • user id
  • segment id of the user
  2. Summary.csv contains
  • user id
  • ad id
  • total number of times the ad was displayed
  • number of clicks on the ad
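One possible way to turn these two files into Beta parameters is sketched below. The column names (`user_id`, `segment_id`, `ad_id`, `displays`, `clicks`), the inline sample rows, and the `build_beta_params` function are hypothetical; the real files in the data folder may use different headers, and in-memory strings stand in for the files so the sketch is self-contained.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file contents; the real column names in data/ may differ.
SEGMENTS_CSV = """user_id,segment_id
u1,s1
u2,s1
u3,s2
"""
SUMMARY_CSV = """user_id,ad_id,displays,clicks
u1,a1,10,2
u2,a1,5,1
u2,a2,8,4
u3,a1,20,1
"""

def build_beta_params(segments_file, summary_file):
    """Aggregate clicks and non-clicks per (segment, ad) on top of Beta(1, 1)."""
    user_segment = {row["user_id"]: row["segment_id"]
                    for row in csv.DictReader(segments_file)}
    params = defaultdict(lambda: [1, 1])  # (segment_id, ad_id) -> [alpha, beta]
    for row in csv.DictReader(summary_file):
        key = (user_segment[row["user_id"]], row["ad_id"])
        clicks, displays = int(row["clicks"]), int(row["displays"])
        params[key][0] += clicks             # alpha counts clicks
        params[key][1] += displays - clicks  # beta counts displays without a click
    return dict(params)

params = build_beta_params(io.StringIO(SEGMENTS_CSV), io.StringIO(SUMMARY_CSV))
print(params)
```

Adding all clicks and non-clicks at once gives the same parameters as applying the per-round update from the algorithm once for every display.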

Contributors

guilongaaron
