1 - Sequence to Sequence Learning with Neural Networks
In this series we'll be building a machine learning model to go from one sequence to another, using PyTorch and TorchText. This will be done on German to English translations, but the models can be applied to any problem that involves going from one sequence to another, such as summarization, i.e. going from a sequence to a shorter sequence in the same language.
In this first notebook, we'll start simple to understand the general concepts by implementing the model from the Sequence to Sequence Learning with Neural Networks paper.
Introduction
The most common sequence-to-sequence (seq2seq) models are encoder-decoder models, which commonly use a recurrent neural network (RNN) to encode the source (input) sentence into a single vector. In this notebook, we'll refer to this single vector as a context vector. We can think of the context vector as being an abstract representation of the entire input sentence. This vector is then decoded by a second RNN which learns to output the target (output) sentence by generating it one word at a time.
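To make the encoder-decoder idea concrete before we build the full model, here is a minimal sketch in PyTorch. The class and parameter names (Encoder, Decoder, input_dim, emb_dim, hid_dim) are illustrative placeholders, and a single-layer GRU is used to keep the sketch short; the paper's actual model, which we implement later, uses a multi-layer LSTM.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes the source sentence into a single context vector."""
    def __init__(self, input_dim, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim)

    def forward(self, src):
        # src: [src_len, batch_size]
        embedded = self.embedding(src)        # [src_len, batch_size, emb_dim]
        _, hidden = self.rnn(embedded)        # hidden: [1, batch_size, hid_dim]
        return hidden                         # the context vector

class Decoder(nn.Module):
    """Generates the target sentence one token at a time,
    conditioned on the context vector."""
    def __init__(self, output_dim, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(output_dim, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim)
        self.fc_out = nn.Linear(hid_dim, output_dim)

    def forward(self, token, hidden):
        # token: [batch_size] -> add a sequence dimension of 1
        embedded = self.embedding(token.unsqueeze(0))
        output, hidden = self.rnn(embedded, hidden)
        prediction = self.fc_out(output.squeeze(0))  # [batch_size, output_dim]
        return prediction, hidden
```

In use, the decoder's initial hidden state is the encoder's context vector, and at each step we feed in the previously generated token (starting from a start-of-sentence token) until an end-of-sentence token is produced. The detailed implementation below follows this same pattern.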