Learned in Translation: Contextualized Word Vectors
Authors: Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher (Salesforce)
Hypothesis
Adding context vectors (CoVe) learned from a machine translation task improves performance on a variety of NLP tasks compared to using only unsupervised word and character vectors.
Model
- Encode the input words with the encoder of an LSTM-based English-to-German machine translation (MT) model to obtain contextual representations.
- The hidden states of the MT model's encoder are the CoVe (context) vectors; for downstream tasks they are concatenated with the GloVe embeddings.
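The idea above can be sketched in plain numpy. This is a hedged toy version: the paper's MT encoder is a trained two-layer biLSTM over GloVe inputs, whereas here a single-layer biLSTM with random weights stands in for it, just to show that the per-timestep hidden states play the role of CoVe and get concatenated with GloVe.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate pre-activations stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def bilstm_encode(X, params_fwd, params_bwd, hidden):
    """Run a biLSTM over a (T, D) sequence; return (T, 2*hidden) hidden states."""
    T = X.shape[0]
    outs = []
    for params, order in ((params_fwd, range(T)), (params_bwd, range(T - 1, -1, -1))):
        h, c, hs = np.zeros(hidden), np.zeros(hidden), {}
        for t in order:
            h, c = lstm_step(X[t], h, c, *params)
            hs[t] = h
        outs.append(np.stack([hs[t] for t in range(T)]))
    return np.concatenate(outs, axis=1)

rng = np.random.default_rng(0)
T, D, H = 5, 300, 300                  # sentence length, GloVe dim, hidden size
glove = rng.standard_normal((T, D))    # stand-in for GloVe(w) embeddings
make = lambda: (rng.standard_normal((4 * H, D)) * 0.01,
                rng.standard_normal((4 * H, H)) * 0.01,
                np.zeros(4 * H))
# Random weights here; in the paper these come from the trained MT encoder.
cove = bilstm_encode(glove, make(), make(), H)    # (T, 2H) "CoVe" vectors
w_tilde = np.concatenate([glove, cove], axis=1)   # [GloVe(w); CoVe(w)]
print(cove.shape, w_tilde.shape)                  # (5, 600) (5, 900)
```

With 300-d GloVe and a 300-unit biLSTM, each word ends up with a 900-d representation, matching the paper's setup.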
Classification:
Use a biattentive classification network (BCN) on top of the combined GloVe + CoVe representations.
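The biattention step at the heart of the BCN can be sketched as follows. This is an assumption-laden toy version operating on random matrices: it computes the affinity matrix A = XYᵀ between two encoded sequences, normalizes it in each direction, and produces attended context summaries, omitting the encoder/integrator layers around it.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def biattention(X, Y):
    """Biattention sketch: affinity matrix, direction-wise softmax,
    then attended context summaries of each sequence for the other."""
    A = X @ Y.T                  # (Tx, Ty) affinity matrix
    Ax = softmax(A, axis=0)      # for each position in Y, weights over X
    Ay = softmax(A.T, axis=0)    # for each position in X, weights over Y
    Cx = Ax.T @ X                # (Ty, d): X-context summaries for Y
    Cy = Ay.T @ Y                # (Tx, d): Y-context summaries for X
    return Cx, Cy

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 8))  # toy encoded sequence 1 (Tx=4, d=8)
Y = rng.standard_normal((6, 8))  # toy encoded sequence 2 (Ty=6, d=8)
Cx, Cy = biattention(X, Y)
print(Cx.shape, Cy.shape)        # (6, 8) (4, 8)
```

For single-sentence tasks the same sequence is fed in as both X and Y, so the network attends the sentence to itself.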