Represent words with embeddings
The Bag of Words model operates on high-dimensional BoW vectors and does not capture any semantic similarity between words. Thus we'll turn to the embedding model.
The embedding layer takes a word as input and produces an output vector of the specified embedding_size.
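As a minimal sketch of this layer (assuming a PyTorch setting with nn.Embedding; vocab_size, embedding_size, and the word index below are illustrative values, not from the original text):

```python
import torch
import torch.nn as nn

vocab_size = 10000      # illustrative vocabulary size
embedding_size = 30     # illustrative embedding dimension

# Each row of the embedding matrix is the vector for one word index.
embedding = nn.Embedding(vocab_size, embedding_size)

word_index = torch.tensor([42])   # a word, given as its index in the vocab
vector = embedding(word_index)    # shape: (1, embedding_size)
print(vector.shape)               # torch.Size([1, 30])
```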
In the embedding bag model, we:
- convert each word in our text into its corresponding embedding (a vector of length embedding_size);
- compute some aggregate function over all those embeddings, such as sum, average, or max (see the sketch after the note below).
Note: the input is not the literal word, but the word's number, i.e. its index in the vocabulary.
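Here is a minimal sketch of the embedding bag idea, assuming PyTorch's nn.EmbeddingBag, which fuses the per-word lookup and the aggregation into a single layer; the word indices and sizes below are illustrative:

```python
import torch
import torch.nn as nn

vocab_size = 10000
embedding_size = 30

# EmbeddingBag looks up each word's embedding and aggregates them;
# mode can be 'sum', 'mean', or 'max'.
bag = nn.EmbeddingBag(vocab_size, embedding_size, mode='mean')

# A text is a flat tensor of word indices; offsets mark where each
# sample in the batch starts (here: a single text of five words).
text = torch.tensor([4, 251, 17, 3, 942])
offsets = torch.tensor([0])

out = bag(text, offsets)   # shape: (1, embedding_size) — one vector per text
print(out.shape)           # torch.Size([1, 30])
```

Because nn.EmbeddingBag never materializes the intermediate per-word embeddings, it is more efficient than running nn.Embedding and averaging the results by hand.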
An Example
