This codebase contains a set of simple postprocessing transformations that improve the performance of word embeddings. Prior work has shown that mean subtraction and removal of the top principal components can enhance performance on lexical similarity tasks. We further demonstrate that, simply by performing these transformations only on a strategic subset of the vocabulary, we can consistently achieve even further gains (up to 20% overall) while using less compute and memory. Not only does this behavior offer insights into the linguistic properties of these word representations, but the gains are considerable and hold for both static word embeddings (word2vec and GloVe) and contextual word embeddings (BERT and GPT-2) across a broad range of lexical similarity tasks.
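The core transformation can be sketched as follows. This is an illustrative NumPy implementation, not the repo's actual code: it mean-centers a caller-supplied subset of the embedding rows, removes their top principal components (found via SVD), and leaves the rest of the vocabulary untouched. The function name `partial_postprocess`, the choice of which rows go in `subset_idx`, and the default `n_components=2` are all assumptions for the sake of the example.

```python
import numpy as np

def partial_postprocess(emb, subset_idx, n_components=2):
    """Apply mean subtraction and top-component removal only to the
    rows in `subset_idx`; all other rows are returned unchanged.

    Illustrative sketch only: the subset-selection strategy and the
    number of removed components are placeholders, not the repo's
    actual heuristics.
    """
    out = emb.copy()
    X = emb[subset_idx]                      # rows to transform
    mu = X.mean(axis=0)                      # mean of the subset only
    Xc = X - mu                              # mean subtraction
    # Top principal directions of the centered subset via SVD.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    top = Vt[:n_components]                  # (n_components, dim)
    # Project out the top components from the centered rows.
    out[subset_idx] = Xc - (Xc @ top.T) @ top
    return out
```

Because the mean and principal components are estimated from only the subset, the SVD runs on a smaller matrix, which is where the compute and memory savings come from.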
Repository: google/one-weird-trick
License: Apache License 2.0