Authors: Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher (Salesforce)
Adding context vectors learned from a machine translation (MT) task improves performance over using only unsupervised word and character vectors on a variety of NLP tasks.
- Use the encoder of an LSTM English-to-German MT model to encode the words and obtain a contextual representation.
- The hidden states of the MT model's encoder are called the CoVe (context) vectors.
- Use a biattentive classification network (BCN) together with the CoVe vectors for downstream tasks.
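The pipeline above can be sketched in a few lines: run pretrained GloVe embeddings through the MT encoder, treat the per-timestep hidden states as CoVe vectors, and feed the concatenation [GloVe; CoVe] to the downstream model. This is a minimal toy sketch, assuming a single-layer forward LSTM with random matrices standing in for the pretrained MT weights (the actual model is a two-layer bidirectional LSTM trained on English-German translation).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_encode(embeddings, W, U, b, hidden):
    """Run a forward LSTM over a sequence of word embeddings and
    return the per-timestep hidden states (the CoVe-style vectors)."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    states = []
    for x in embeddings:
        z = W @ x + U @ h + b  # all four gate pre-activations stacked
        i, f, o = (sigmoid(z[k * hidden:(k + 1) * hidden]) for k in range(3))
        g = np.tanh(z[3 * hidden:])
        c = f * c + i * g
        h = o * np.tanh(c)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
emb_dim, hidden = 4, 3
seq = rng.normal(size=(5, emb_dim))        # stand-in GloVe embeddings (5 words)
W = rng.normal(size=(4 * hidden, emb_dim)) # stand-in for pretrained MT weights
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

cove = lstm_encode(seq, W, U, b, hidden)
# The downstream task consumes [GloVe; CoVe]: each word's embedding
# concatenated with its contextual vector.
augmented = np.concatenate([seq, cove], axis=1)
print(augmented.shape)  # → (5, 7), i.e. (seq_len, emb_dim + hidden)
```

In the paper the augmented vectors are then fed into the biattentive classification network; here the sketch stops at the concatenation step, which is the part CoVe itself defines.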