Abstract:
Current Neural Machine Translation (NMT) methods take individual sentences as the input unit, so context information cannot be effectively utilized during translation, which limits translation performance. To address this problem, this study proposes a document-level neural machine translation method that integrates topic information. The method first feeds the source sentence and its context sentences into a source encoder and a context encoder independently, and then uses an attention mechanism to map the outputs of the two encoders into a context representation. The context representation is combined with the source encoder output through a gating mechanism to obtain a fused representation. At the same time, the word-embedded source sentence is mapped to a topic representation by a topic encoder based on a Bi-GRU and a Convolutional Neural Network. Finally, the fused representation and the topic representation are fed into the decoder through two serial attention mechanisms. Experiments show that this method improves the performance of document-level neural machine translation, achieving a gain of up to 0.55 BLEU points over the baseline system.
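The gated fusion of the source and context representations can be sketched as follows. This is a minimal illustration with numpy, assuming an elementwise sigmoid gate over the concatenated representations; the variable names, shapes, and weights are hypothetical and not the paper's actual parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_src, h_ctx, W, b):
    # Gate: g = sigmoid([h_src; h_ctx] @ W + b), elementwise in (0, 1).
    # Fused representation: convex combination of source and context vectors.
    concat = np.concatenate([h_src, h_ctx], axis=-1)
    g = sigmoid(concat @ W + b)
    return g * h_src + (1.0 - g) * h_ctx

rng = np.random.default_rng(0)
d = 8                                   # hypothetical hidden size
h_src = rng.standard_normal(d)          # source encoder output (one position)
h_ctx = rng.standard_normal(d)          # attention-pooled context representation
W = rng.standard_normal((2 * d, d)) * 0.1
b = np.zeros(d)
fused = gated_fusion(h_src, h_ctx, W, b)
print(fused.shape)  # (8,)
```

Because the gate lies in (0, 1), each fused component stays between the corresponding source and context values, so the gate interpolates rather than overwrites either representation.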