Git Product home page Git Product logo

bert-as-language-model's Introduction

BERT as Language Model

For a sentence S = w_1, w_2,..., w_k , we have

p(S) = \prod_{i=1}^{k} p(w_i | context)

In traditional language model, such as RNN, context = w_1, ..., w_{i-1} ,

p(S) = \prod_{i=1}^{k} p(w_i | w_1, ..., w_{i-1})

In bidirectional language model, it has larger context, context = w_1, ..., w_{i-1},w_{i+1},...,w_k.

In this implementation, we simply adopt the following approximation,

p(S) \approx \prod_{i=1}^{k} p(w_i | w_1, ..., w_{i-1},w_{i+1}, ...,w_k).

test-case

more cases: δΈ­ζ–‡

export BERT_BASE_DIR=model/uncased_L-12_H-768_A-12
export INPUT_FILE=data/lm/test.en.tsv
python run_lm_predict.py \
  --input_file=$INPUT_FILE \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --output_dir=/tmp/lm_output/

for the following test case

$ cat data/lm/test.en.tsv 
there is a book on the desk
there is a plane on the desk
there is a book in the desk

$ cat /tmp/lm/output/test_result.json

output:

# prob: probability
# ppl:  perplexity
[
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.9988962411880493
      },
      {
        "token": "is",
        "prob": 0.013578361831605434
      },
      {
        "token": "a",
        "prob": 0.9420605897903442
      },
      {
        "token": "book",
        "prob": 0.07452250272035599
      },
      {
        "token": "on",
        "prob": 0.9607976675033569
      },
      {
        "token": "the",
        "prob": 0.4983428418636322
      },
      {
        "token": "desk",
        "prob": 4.040586190967588e-06
      }
    ],
    "ppl": 17.69329728285426
  },
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.996775209903717
      },
      {
        "token": "is",
        "prob": 0.03194097802042961
      },
      {
        "token": "a",
        "prob": 0.8877727389335632
      },
      {
        "token": "plane",
        "prob": 3.4907534427475184e-05   # low probability
      },
      {
        "token": "on",
        "prob": 0.1902322769165039
      },
      {
        "token": "the",
        "prob": 0.5981084704399109
      },
      {
        "token": "desk",
        "prob": 3.3164762953674654e-06
      }
    ],
    "ppl": 59.646456254851806
  },
  {
    "tokens": [
      {
        "token": "there",
        "prob": 0.9969795942306519
      },
      {
        "token": "is",
        "prob": 0.03379646688699722
      },
      {
        "token": "a",
        "prob": 0.9095568060874939
      },
      {
        "token": "book",
        "prob": 0.013939591124653816
      },
      {
        "token": "in",
        "prob": 0.000823647016659379  # low probability
      },
      {
        "token": "the",
        "prob": 0.5844194293022156
      },
      {
        "token": "desk",
        "prob": 3.3361218356731115e-06
      }
    ],
    "ppl": 54.65941516205144
  }
]

bert-as-language-model's People

Contributors

jacobdevlin-google avatar xu-song avatar cbockman avatar abhishekraok avatar bogdandidenko avatar eric-haibin-lin avatar 0xflotus avatar aijunbai avatar ammarasmro avatar bitmindlab avatar craigcitro avatar georgefeng avatar rodgzilla avatar qwfy avatar jasonjpu avatar pengli09 avatar stefan-it avatar zhaoyongke avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.