GloVe (Global Vectors) is an NLP approach for learning word embeddings. Unlike word2vec, which relies primarily on local context windows, GloVe incorporates global corpus statistics into its embeddings, by:

  • Computing co-occurrence frequency counts for each word pair across the entire corpus. These counts are stored as a matrix X, where each element X_ij denotes the number of times word j appears in the context of word i.
  • Optimisation: the inner product of two word vectors should be a good predictor of (the logarithm of) their co-occurrence count (see the sketch after this list).
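
To make both steps concrete, here is a minimal sketch using a hypothetical toy corpus and plain NumPy (not any GloVe library); the loss in the comments is the weighted least-squares objective from the GloVe paper:

import numpy as np

# Hypothetical toy corpus; real GloVe is trained on billions of tokens.
corpus = ["the cat sat on the mat".split(),
          "the dog sat on the rug".split()]
window = 2  # symmetric context window size

# Build the vocabulary and the co-occurrence matrix X, where X[i, j]
# counts how often word j appears within `window` positions of word i.
vocab = sorted({w for sent in corpus for w in sent})
stoi = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for pos, word in enumerate(sent):
        lo, hi = max(0, pos - window), min(len(sent), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                X[stoi[word], stoi[sent[ctx]]] += 1

# Weighting function from the GloVe paper: it caps the influence of very
# frequent pairs (x_max=100, alpha=0.75 are the published defaults).
def f(x, x_max=100, alpha=0.75):
    return (x / x_max) ** alpha if x < x_max else 1.0

# GloVe then learns word vectors w_i, context vectors w~_j and biases
# b_i, b~_j by minimising
#   J = sum over pairs with X[i, j] > 0 of
#       f(X[i, j]) * (w_i . w~_j + b_i + b~_j - log X[i, j])**2
# i.e. the inner product should predict the log co-occurrence count.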

GloVe embeddings do encode word relationships quite well! But they also pick up biased relationships from the data they're trained on.

In code

torchtext gives us the ability to load pre-trained GloVe embeddings. The line below loads 50-dimensional embeddings trained on a corpus of 6 billion tokens (the 2014 Wikipedia dump plus Gigaword 5).

import torchtext

glove = torchtext.vocab.GloVe(name='6B', dim=50)
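
As a quick check that the loaded vectors encode relationships (and to probe for bias yourself), a small nearest-neighbour helper can be built on the standard torchtext Vectors attributes (glove.vectors, glove.itos, and indexing by token); the exact neighbours depend on the vectors:

import torch

# Return the n vocabulary words whose embeddings have the highest
# cosine similarity with a query vector.
def nearest(vec, n=5):
    sims = torch.nn.functional.cosine_similarity(glove.vectors, vec.unsqueeze(0))
    return [glove.itos[i] for i in sims.topk(n).indices.tolist()]

print(nearest(glove['frog']))   # close neighbours of 'frog'
# The classic analogy: king - man + woman is typically near 'queen'.
print(nearest(glove['king'] - glove['man'] + glove['woman']))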