LSTM: Principles, Source Code, Demo, and Exercises

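A comprehensive walkthrough of the principles behind LSTM and its source code, together with a development demo and a set of exercises. Please credit the source when reposting.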

LSTM architecture

lstm.png

An LSTM uses three gates plus a candidate cell value $\tilde{C}_t$ for the current step to control what enters and leaves the cell.

Forget gate - how much of the previous cell state $c_{t-1}$ is forgotten. $f_t = \sigma(W_{if}x_t + b_{if} + W_{hf}h_{(t-1)} + b_{hf}) = \sigma(W_{f}[x_t,h_{(t-1)}] + b_f)$
Input gate - how much of the current step's candidate is stored. $i_t = \sigma(W_{ii}x_t + b_{ii} + W_{hi}h_{(t-1)} + b_{hi}) = \sigma(W_{i}[x_t,h_{(t-1)}] + b_i)$
Output gate - how much information is passed on to the hidden state. $o_t = \sigma(W_{io}x_t + b_{io} + W_{ho}h_{(t-1)} + b_{ho}) = \sigma(W_{o}[x_t,h_{(t-1)}] + b_o)$
Candidate cell value $\tilde{C}_t$, written $g_t$ - the cell content proposed at step t. $\tilde{C}_t = g_t = \tanh(W_{ig}x_t + b_{ig} + W_{hg}h_{(t-1)} + b_{hg}) = \tanh(W_{g}[x_t,h_{(t-1)}] + b_g)$

The three gates and $\tilde{C}_t = g_t$ are combined as follows:

$c_t$ - the cell state at step t. $c_t = f_{t}*c_{t-1} + i_{t}*g_t$
$h_t$ - the hidden state (output) at step t. $h_t = o_{t}*\tanh(c_t)$
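The equations above can be checked numerically against PyTorch. The following is a minimal sketch, assuming `torch.nn.LSTMCell` with its documented weight layout (the four gate blocks stacked in the order i, f, g, o); the tensor sizes and variable names are illustrative choices, not part of the original article.

```python
import torch
from torch import nn

torch.manual_seed(0)
input_size, hidden_size, batch = 3, 4, 2  # illustrative sizes

cell = nn.LSTMCell(input_size, hidden_size)
x_t = torch.randn(batch, input_size)
h_prev = torch.randn(batch, hidden_size)
c_prev = torch.randn(batch, hidden_size)

# All four pre-activations at once: W_i* x_t + b_i* + W_h* h_{t-1} + b_h*
gates = x_t @ cell.weight_ih.T + cell.bias_ih + h_prev @ cell.weight_hh.T + cell.bias_hh
i_t, f_t, g_t, o_t = gates.chunk(4, dim=1)  # PyTorch stacks the blocks as i, f, g, o

i_t, f_t, o_t = torch.sigmoid(i_t), torch.sigmoid(f_t), torch.sigmoid(o_t)
g_t = torch.tanh(g_t)

c_t = f_t * c_prev + i_t * g_t   # new cell state
h_t = o_t * torch.tanh(c_t)      # new hidden state

# Reference result from the library implementation
h_ref, c_ref = cell(x_t, (h_prev, c_prev))
print(torch.allclose(h_t, h_ref, atol=1e-5), torch.allclose(c_t, c_ref, atol=1e-5))
```

Computing the four gates with one matrix product is only a convenience; it is mathematically the same as evaluating the four equations separately.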

Parameter count

There are three gates plus $g_t$, so there are four parameter sets: $(W_f, b_f), (W_i, b_i), (W_o, b_o), (W_g, b_g)$. The concatenated vector $[x_t,h_{(t-1)}]$ has size input_size+hidden_size, and $W * [x_t,h_{(t-1)}]$ must have the same dimension as $h_t$, namely hidden_size, so each $W$ has shape (hidden_size, input_size+hidden_size) and each $b$ has size hidden_size. The total number of parameters is therefore:

 4 * ((input_size+hidden_size) * hidden_size + hidden_size)
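As a quick sanity check, the formula can be written as a small function. This is a sketch; the function names are made up for illustration. The second function reflects the fact that PyTorch's `nn.LSTM` keeps two bias vectors per gate (`b_ih` and `b_hh`) when `bias=True`, so a single-layer, unidirectional `nn.LSTM` reports slightly more parameters than the concatenated form.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Parameter count for the concatenated form: 4 sets of (W, b)."""
    return 4 * ((input_size + hidden_size) * hidden_size + hidden_size)


def lstm_param_count_two_biases(input_size: int, hidden_size: int) -> int:
    """Same count, but with separate b_ih and b_hh per gate (PyTorch layout)."""
    return 4 * ((input_size + hidden_size) * hidden_size + 2 * hidden_size)


print(lstm_param_count(200, 300))             # 601200
print(lstm_param_count_two_biases(200, 300))  # 602400
```

Note that the sequence length does not appear anywhere: the same weights are reused at every time step.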

GRU

gru.png

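A GRU uses two gates plus a candidate hidden value $\tilde{h}_t$, also written $n_t$, to control the final $h_t$.

$r_t$ - reset gate. $r_t = \sigma(W_{ir}x_t + b_{ir} + W_{hr}h_{(t-1)} + b_{hr}) = \sigma(W_r[x_t, h_{(t-1)}] + b_r)$
$z_t$ - update gate. $z_t = \sigma(W_{iz}x_t + b_{iz} + W_{hz}h_{(t-1)} + b_{hz}) = \sigma(W_z[x_t, h_{(t-1)}] + b_z)$
$\tilde{h}_t$ - candidate hidden value. $\tilde{h}_t = n_t = \tanh(W_{in}x_t + b_{in} + r_t*(W_{hn}h_{(t-1)} + b_{hn}))$. The original GRU formulation applies the reset gate before the matrix product instead, $\tanh(W_{n}[x_t, r_t*h_{(t-1)}] + b_n)$; the two variants play the same role but are not numerically identical.
$h_t$ - final output. $h_t = (1-z_{t})*h_{t-1} + z_{t}*\tilde{h}_t = (1-z_{t})*h_{t-1} + z_{t}*n_t$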
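The GRU step can be reproduced the same way. This is a minimal sketch assuming `torch.nn.GRUCell`, whose weights are stacked in the order r, z, n. One caveat: PyTorch combines the update gate as $h_t = (1-z_t)*n_t + z_t*h_{t-1}$, which mirrors the role of $z_t$ relative to the formula in the text, so the comparison below uses PyTorch's convention and computes the text's convention alongside it. Sizes and names are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)
input_size, hidden_size, batch = 3, 4, 2  # illustrative sizes

cell = nn.GRUCell(input_size, hidden_size)
x_t = torch.randn(batch, input_size)
h_prev = torch.randn(batch, hidden_size)

# Input and hidden contributions, each split into r, z, n blocks
gi = x_t @ cell.weight_ih.T + cell.bias_ih
gh = h_prev @ cell.weight_hh.T + cell.bias_hh
i_r, i_z, i_n = gi.chunk(3, dim=1)
h_r, h_z, h_n = gh.chunk(3, dim=1)

r_t = torch.sigmoid(i_r + h_r)       # reset gate
z_t = torch.sigmoid(i_z + h_z)       # update gate
n_t = torch.tanh(i_n + r_t * h_n)    # candidate, reset applied after the matmul

h_pytorch = (1 - z_t) * n_t + z_t * h_prev   # PyTorch's convention
h_text = (1 - z_t) * h_prev + z_t * n_t      # convention used in the text

print(torch.allclose(h_pytorch, cell(x_t, h_prev), atol=1e-5))
```

Both conventions describe the same family of models; the learned $z_t$ simply ends up meaning "keep" in one and "replace" in the other.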

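As the equations show, a GRU has one fewer parameter set than an LSTM (three instead of four). A common rule of thumb is to use an LSTM when plenty of data is available and a GRU when data is scarce. Also, unlike an LSTM, which has two outputs $c_t$ and $h_t$, a GRU has only one, $h_t$.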

PyTorch LSTM explained

$\begin{array}{ll} \\ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi}) \\ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{(t-1)} + b_{hf}) \\ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{(t-1)} + b_{hg}) \\ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{(t-1)} + b_{ho}) \\ c_t = f_t * c_{(t-1)} + i_t * g_t \\ h_t = o_t * \tanh(c_t) \\ \end{array}$

Parameters

input_size – The number of expected features in the input x
hidden_size – The number of features in the hidden state h
num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two LSTMs together to form a stacked LSTM, with the second LSTM taking in outputs of the first LSTM and computing the final results. Default: 1
bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False
dropout – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Default: 0
bidirectional – If True, becomes a bidirectional LSTM. Default: False

Inputs: input, (h_0, c_0)

input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack_padded_sequence or torch.nn.utils.rnn.pack_sequence for details.
h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. If the LSTM is bidirectional, num_directions should be 2, else it should be 1.
c_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch. If (h_0, c_0) is not provided, both h_0 and c_0 default to zero.

Outputs: output, (h_n, c_n)

output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case. Judging from the shape, this is the hidden state h_t of every time step, so for seq2seq-style models this result alone is enough.

h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len. Like output, the layers can be separated using h_n.view(num_layers, num_directions, batch, hidden_size) and similarly for c_n. The leading dimension num_layers * num_directions shows that h_n holds the final hidden state of every layer, so the entry for the top layer is the final output of the network: for a unidirectional LSTM, h_n[-1], of shape (batch, hidden_size), is the last output. Note that even with batch_first=True, h_n still has shape (num_layers * num_directions, batch, hidden_size); use h_n.transpose(0, 1) to convert it to (batch, num_layers * num_directions, hidden_size).

c_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t = seq_len. Like h_n, it holds the final cell state of every layer.
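The shapes described above are easy to confirm on a small example. The sketch below assumes a 2-layer bidirectional `nn.LSTM` with `batch_first=True` on unpadded random input; all sizes are arbitrary illustrative values. It also checks how h_n relates to output for the top layer: the forward direction finishes at the last time step, the backward direction at the first.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, seq_len, input_size, hidden_size, num_layers = 3, 5, 2, 4, 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
               batch_first=True, bidirectional=True)
x = torch.randn(batch, seq_len, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (batch, seq_len, 2 * hidden_size) because batch_first=True
print(h_n.shape)     # (num_layers * 2, batch, hidden_size), NOT batch-first

# Separate layers and directions, then pick the top layer
h_n_split = h_n.view(num_layers, 2, batch, hidden_size)
top_forward = h_n_split[-1, 0]   # final state of the forward direction
top_backward = h_n_split[-1, 1]  # final state of the backward direction

# Forward direction ends at t = seq_len - 1, backward direction ends at t = 0
print(torch.allclose(top_forward, output[:, -1, :hidden_size], atol=1e-6))
print(torch.allclose(top_backward, output[:, 0, hidden_size:], atol=1e-6))

# Batch-first view of h_n, if downstream code expects it
h_n_batch_first = h_n.transpose(0, 1)  # (batch, num_layers * 2, hidden_size)
```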

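Handling variable-length sequences

An LSTM consumes a batch as one fixed-length tensor, so how should the variable-length sequences that are typical in NLP be handled?

PyTorch handles variable-length sequences with two functions:

pack_padded_sequence: packs a padded batch of variable-length sequences
pad_packed_sequence: unpacks a packed result back into a padded batch

pack_padded_sequence

Before looking at these two functions, let us see how the LSTM actually runs.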

No.	$w_0$	$w_1$	$w_2$	$w_3$	$w_4$
0	I	love	mom	'	cooking
1	Yes	0	0	0	0
2	No	way	0	0	0
3	I	love	you	too	!
4	This	is	the	shit	0

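There are five sequences of various lengths. The 0 entries are padding, needed because every sequence in a batch must have the same dimensions.

$shape = BatchSize \times SequenceLength \times HiddenDim = 5 \times 5 \times *$

When the LSTM actually runs, sequences that are still active at the same time step are grouped so that they can be processed as one batch. For this reason the sequences are sorted by length. The rearranged result is: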

No.	$w_0$	$w_1$	$w_2$	$w_3$	$w_4$
0	I	love	mom	'	cooking
3	I	love	you	too	!
4	This	is	the	shit
2	No	way
1	Yes

This is the packing step. After packing, entries belonging to the same time step are placed together, which amounts to the arrangement above with all padding zeros removed.
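The sorting and padding removal can be imitated in plain Python on the example sentences. This is only a sketch of the idea, not PyTorch's implementation; the names `PAD`, `order`, `batch_sizes` and `packed_data` are chosen here for illustration.

```python
PAD = "0"
batch = [
    ["I", "love", "mom", "'", "cooking"],
    ["Yes", PAD, PAD, PAD, PAD],
    ["No", "way", PAD, PAD, PAD],
    ["I", "love", "you", "too", "!"],
    ["This", "is", "the", "shit", PAD],
]

# Real length of each sequence (number of non-padding tokens)
lengths = [sum(token != PAD for token in row) for row in batch]

# Sort row indices by decreasing length (Python's sort is stable for ties)
order = sorted(range(len(batch)), key=lambda i: -lengths[i])
sorted_rows = [batch[i][:lengths[i]] for i in order]  # padding stripped

# batch_sizes[t] = number of sequences that still have a token at step t
max_length = max(lengths)
batch_sizes = [sum(length > t for length in lengths) for t in range(max_length)]

# Packed layout: all tokens of step 0, then all tokens of step 1, ...
packed_data = [row[t] for t in range(max_length) for row in sorted_rows if t < len(row)]

print(order)        # [0, 3, 4, 2, 1]
print(batch_sizes)  # [5, 4, 3, 3, 2]
print(packed_data[:5])
```

The printed `order` matches the row numbers of the rearranged table, and `batch_sizes` is the list that PyTorch stores alongside the packed data.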

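Illustration:

pack_pad.jpg

Repost link

batch_sizes: look at the different colors in the figure. Green: 5, orange: 4, and so on. So how does PyTorch actually carry out the computation?

The computation is batched: at each step, all sequences that are still active are processed in one call. The rough procedure is:

Set the number of loop steps to max_length = len(batch_sizes), which is 5 here
Start the loop at i = 0:
- Feed the batch_sizes[i] inputs of step i into the LSTM cell as one batch (the 5 green items when i = 0, the 4 orange items when i = 1, and so on)
- i = i + 1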

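As a result, the batch processed at each step never grows, and every sequence is run for exactly its real length. (PS: Early on, because of the padding, I assumed the padding would simply be fed through the LSTM as well, so that packing would be unnecessary. But that increases the amount of computation, and the final $h_n, c_n$ would then be the state after the padding rather than the state at the real end of each sequence. After reading the PyTorch source I understood that every LSTM cell step takes a fresh batch whose size changes to match the real sequence lengths. Seen this way, I believe the point of packing and sorting by length is to make it cheap to slice out the batch for each LSTM cell step; without sorting, selecting the active rows with a mask at every step of the loop would be less efficient.)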
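The saving can be made concrete by counting cell updates for the five example sentences. A small sketch in plain Python (variable names are illustrative): one "cell update" here means one sequence advancing by one time step.

```python
lengths = [5, 1, 2, 5, 4]  # real lengths of the five example sentences
max_length = max(lengths)

# Number of active sequences at each time step
batch_sizes = [sum(length > t for length in lengths) for t in range(max_length)]

# Padded: every sequence is run for max_length steps
padded_cell_updates = len(lengths) * max_length

# Packed: step i only processes batch_sizes[i] sequences
packed_cell_updates = 0
for i, batch_size in enumerate(batch_sizes):
    print(f"step {i}: LSTM cell runs on a batch of {batch_size}")
    packed_cell_updates += batch_size

print(padded_cell_updates, packed_cell_updates)  # 25 17
```

The packed count equals the total number of real tokens, so no work is spent on padding.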

PyTorch C++ source, aten/src/ATen/native/RNN.cpp:

template<typename hidden_type, typename cell_params>
struct PackedLayer : Layer<PackedSequence, hidden_type, cell_params> {
  using output_type = typename Layer<PackedSequence, hidden_type, cell_params>::output_type;

  PackedLayer(Cell<hidden_type, cell_params>& cell)
    : cell_(cell) {};

  output_type operator()(
      const PackedSequence& input,
      const hidden_type& input_hidden,
      const cell_params& params) const override {
    std::vector<at::Tensor> step_outputs;
    
    std::vector<hidden_type> hiddens;
    int64_t input_offset = 0;
    int64_t num_steps = input.batch_sizes.size(0);
    int64_t* batch_sizes = input.batch_sizes.data<int64_t>();
    int64_t last_batch_size = batch_sizes[0];

    // Batch sizes is a sequence of decreasing lengths, which are offsets
    // into a 1D list of inputs. At every step we slice out batch_size elements,
    // and possibly account for the decrease in the batch size since the last step,
    // which requires us to slice the hidden state (since some sequences
    // are completed now). The sliced parts are also saved, because we will need
    // to return a tensor of final hidden state.
    auto hidden = input_hidden;
    for (int64_t i = 0; i < num_steps; ++i) {
      int64_t batch_size = batch_sizes[i];
      auto step_input = input.data.narrow(0, input_offset, batch_size);
      input_offset += batch_size;

      int64_t dec = last_batch_size - batch_size;
      if (dec > 0) {
        hiddens.push_back(hidden_slice(hidden, last_batch_size - dec, last_batch_size));
        hidden = hidden_slice(hidden, 0, last_batch_size - dec);
      }

      last_batch_size = batch_size;
      hidden = cell_(step_input, hidden, params);
      step_outputs.push_back(hidden_as_output(hidden));
    }
    hiddens.push_back(hidden);
    std::reverse(hiddens.begin(), hiddens.end());

    return { PackedSequence{ at::cat(step_outputs, 0), input.batch_sizes }, hidden_concat(hiddens) };
  }

  Cell<hidden_type, cell_params>& cell_;
};

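Explanation:

num_steps: the number of entries in batch_sizes, which is the length of the longest sequence
input.data.narrow(0, input_offset, batch_size): slices out of the packed data the inputs $x_t$ of all sequences still active at this LSTM cell step, namely batch_size rows starting at row input_offset
dec = last_batch_size - batch_size: the number of sequences that ended since the previous step. Their final hidden states (the last dec rows of hidden) are sliced off and saved in hiddens, and only the first batch_size rows continue into the next cell call. No zeros are added here; padded positions are filled in only later, when pad_packed_sequence unpacks the output.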
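To make the loop tangible, here is a Python re-enactment of the same control flow for a single-layer, unidirectional LSTM, compared against `nn.LSTM` on packed input. It is a sketch under those assumptions, not a port of the C++ code; `packed_forward` and its local names are hypothetical, and the weights are read from the module's `weight_ih_l0`, `weight_hh_l0`, `bias_ih_l0` and `bias_hh_l0` parameters.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)
lstm = nn.LSTM(input_size=2, hidden_size=4, batch_first=True)
x = torch.randn(3, 3, 2)               # (batch, seq, feature), padded
lengths = torch.tensor([3, 2, 1])      # already sorted, longest first
packed = pack_padded_sequence(x, lengths, batch_first=True)


def packed_forward(lstm, packed):
    """Hypothetical helper mirroring the PackedLayer loop for one layer."""
    batch_sizes = packed.batch_sizes.tolist()
    h = torch.zeros(batch_sizes[0], lstm.hidden_size)
    c = torch.zeros(batch_sizes[0], lstm.hidden_size)
    step_outputs, done_h, done_c = [], [], []
    offset = 0
    for batch_size in batch_sizes:
        step_input = packed.data[offset:offset + batch_size]  # like narrow()
        offset += batch_size
        if batch_size < h.size(0):
            # Some sequences ended: save their final states, keep the rest
            done_h.append(h[batch_size:])
            done_c.append(c[batch_size:])
            h, c = h[:batch_size], c[:batch_size]
        gates = (step_input @ lstm.weight_ih_l0.T + lstm.bias_ih_l0
                 + h @ lstm.weight_hh_l0.T + lstm.bias_hh_l0)
        i, f, g, o = gates.chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        step_outputs.append(h)
    done_h.append(h)
    done_c.append(c)
    # Shortest sequences were saved first; reverse to restore batch order
    h_n = torch.cat(done_h[::-1], dim=0)
    c_n = torch.cat(done_c[::-1], dim=0)
    return torch.cat(step_outputs, dim=0), h_n, c_n


data, h_n, c_n = packed_forward(lstm, packed)
ref_out, (ref_h, ref_c) = lstm(packed)
print(torch.allclose(data, ref_out.data, atol=1e-5),
      torch.allclose(h_n, ref_h[0], atol=1e-5),
      torch.allclose(c_n, ref_c[0], atol=1e-5))
```

The sketch shows that h_n is assembled from states captured at different time steps, one per sequence, which is why it is unaffected by padding.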

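Conclusion: with a packed variable-length batch, every step of the computation is real work. With that in mind, look at the LSTM outputs again:

output: once unpacked with pad_packed_sequence, shape (seq_len, batch, num_directions * hidden_size), where seq_len is the longest sequence length in the batch; positions beyond a sequence's real length are filled with the padding value (0 by default)
h_n: shape (num_layers * num_directions, batch, hidden_size), the last hidden state computed over each sequence's real length, independent of padding. Judging from the shape, it holds the last hidden state of every layer.
c_n: shape (num_layers * num_directions, batch, hidden_size), the last cell state computed over each sequence's real length, independent of padding. Judging from the shape, it holds the last cell state of every layer.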

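The enforce_sorted parameter

This parameter deserves a special note. It defaults to True, which means the sequences in the input batch must already be sorted by length in descending order.

If it is set to False, pack_padded_sequence does the sorting itself.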
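A short sketch of the difference, using an unsorted batch with lengths [2, 1, 2]. It relies on the behavior that `pack_padded_sequence` raises a `RuntimeError` for unsorted lengths when `enforce_sorted=True`, and that with `enforce_sorted=False` the original batch order is restored on unpacking; the tensor values are illustrative.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

x = torch.tensor([
    [[1, 2], [3, 4]],
    [[9, 10], [0, 0]],   # length 1, padded
    [[5, 6], [7, 8]],
], dtype=torch.float)
lengths = torch.tensor([2, 1, 2])  # not in descending order

# Default enforce_sorted=True rejects unsorted lengths
try:
    pack_padded_sequence(x, lengths, batch_first=True)
    raised = False
except RuntimeError as err:
    raised = True
    print(f"rejected: {err}")

# enforce_sorted=False sorts internally and remembers the permutation
packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
print(packed.batch_sizes)     # sizes per time step after sorting
print(packed.sorted_indices)  # permutation that was applied

# Unpacking returns the sequences in their original batch order
unpacked, unpacked_lengths = pad_packed_sequence(packed, batch_first=True)
print(torch.equal(unpacked, x), unpacked_lengths)
```

Because the permutation is stored in the PackedSequence, an LSTM fed with this packed input also returns h_n and c_n in the original batch order.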

LSTM demo

The demo below exercises the parameter settings discussed above.

import torch
from torch.nn.modules.rnn import LSTM
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

X_SORTED = torch.tensor([
    [
        [1, 2], [3, 4], [0, 0]
    ],
    [
        [5, 6], [7, 8], [0, 0]
    ],
    [
        [9, 10], [0, 0], [0, 0]
    ]


], dtype=torch.float)

X_UNSORTED = torch.tensor([
    [
        [1, 2], [3, 4], [0, 0]
    ],

    [
        [9, 10], [0, 0], [0, 0]
    ],
    [
        [5, 6], [7, 8], [0, 0]
    ]
], dtype=torch.float)

sequence_length_sorted = torch.tensor([2, 2, 1], dtype=torch.long)
sequence_length_unsorted = torch.tensor([2, 1, 2], dtype=torch.long)


def demo_pack_padded_sequence(x, sequence_length, is_sorted):
    print(f"x: {x.numpy()}")
    print(f"Batch size: {x.shape[0]}, "
          f"Sequence length: {x.shape[1]}, "
          f"hidden dim: {x.shape[2]}")

    print(f"sequence length: {sequence_length.numpy()}")

    pack = pack_padded_sequence(input=x,
                                lengths=sequence_length,
                                batch_first=True,
                                enforce_sorted=is_sorted)

    print(f"pack: {pack}")

    pad = pad_packed_sequence(sequence=pack, batch_first=True)
    print(f"pad: {pad}")

    lstm = LSTM(input_size=x.shape[-1],
                hidden_size=4,
                num_layers=1,
                batch_first=True,
                bidirectional=False)
    output, (hn, cn) = lstm(pack)

    print("output", "-" * 80)
    print(output)

    pad_output = pad_packed_sequence(output, batch_first=True, padding_value=0.0)
    print("+" * 80)
    print(f"output: {pad_output}")
    print(f"hn: {hn}")
    print(f"cn: {cn}")
    print(f"output[:-1:]: {output[:-1:]}")

demo_pack_padded_sequence(x=X_SORTED,
                          sequence_length=sequence_length_sorted,
                          is_sorted=True)

x: [[[ 1.  2.]
  [ 3.  4.]
  [ 0.  0.]]

 [[ 5.  6.]
  [ 7.  8.]
  [ 0.  0.]]

 [[ 9. 10.]
  [ 0.  0.]
  [ 0.  0.]]]
Batch size: 3, Sequence length: 3, hidden dim: 2
sequence length: [2 2 1]
pack: PackedSequence(data=tensor([[ 1.,  2.],
        [ 5.,  6.],
        [ 9., 10.],
        [ 3.,  4.],
        [ 7.,  8.]]), batch_sizes=tensor([3, 2]), sorted_indices=None, unsorted_indices=None)
pad: (tensor([[[ 1.,  2.],
         [ 3.,  4.]],

        [[ 5.,  6.],
         [ 7.,  8.]],

        [[ 9., 10.],
         [ 0.,  0.]]]), tensor([2, 2, 1]))
output --------------------------------------------------------------------------------
PackedSequence(data=tensor([[-0.0644,  0.1670,  0.1466,  0.0274],
        [-0.0926,  0.1344,  0.4401,  0.0731],
        [-0.0940,  0.1009,  0.5912,  0.0238],
        [-0.1254,  0.2073,  0.3820,  0.0958],
        [-0.1485,  0.1642,  0.5456,  0.0616]], grad_fn=<CatBackward>), batch_sizes=tensor([3, 2]), sorted_indices=None, unsorted_indices=None)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
output: (tensor([[[-0.0644,  0.1670,  0.1466,  0.0274],
         [-0.1254,  0.2073,  0.3820,  0.0958]],

        [[-0.0926,  0.1344,  0.4401,  0.0731],
         [-0.1485,  0.1642,  0.5456,  0.0616]],

        [[-0.0940,  0.1009,  0.5912,  0.0238],
         [ 0.0000,  0.0000,  0.0000,  0.0000]]], grad_fn=<TransposeBackward0>), tensor([2, 2, 1]))
hn: tensor([[[-0.1254,  0.2073,  0.3820,  0.0958],
         [-0.1485,  0.1642,  0.5456,  0.0616],
         [-0.0940,  0.1009,  0.5912,  0.0238]]], grad_fn=<StackBackward>)
cn: tensor([[[-0.1698,  0.4718,  0.7097,  0.4341],
         [-0.1641,  0.3882,  0.8470,  1.2613],
         [-0.0996,  0.2737,  0.8373,  0.9169]]], grad_fn=<StackBackward>)
output[:-1:]: (tensor([[-0.0644,  0.1670,  0.1466,  0.0274],
        [-0.0926,  0.1344,  0.4401,  0.0731],
        [-0.0940,  0.1009,  0.5912,  0.0238],
        [-0.1254,  0.2073,  0.3820,  0.0958],
        [-0.1485,  0.1642,  0.5456,  0.0616]], grad_fn=<CatBackward>), tensor([3, 2]), None)

Important note: the values in hn are contained in output. For how to extract hn from output, see exercise 4.
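One possible way to see this, sketched with the same sorted input as the demo: index the padded output at position length - 1 for each sequence and compare with hn. This assumes a single-layer, unidirectional LSTM; the indexing approach is just one option among several.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
x = torch.tensor([
    [[1, 2], [3, 4], [0, 0]],
    [[5, 6], [7, 8], [0, 0]],
    [[9, 10], [0, 0], [0, 0]],
], dtype=torch.float)
lengths = torch.tensor([2, 2, 1])

lstm = nn.LSTM(input_size=2, hidden_size=4, batch_first=True)
packed_output, (hn, cn) = lstm(pack_padded_sequence(x, lengths, batch_first=True))
output, _ = pad_packed_sequence(packed_output, batch_first=True)  # (batch, max_len, hidden)

# For each sequence, take the output at its last real time step
batch_index = torch.arange(output.size(0))
last_hidden = output[batch_index, lengths - 1]  # (batch, hidden)

print(torch.allclose(last_hidden, hn[-1], atol=1e-6))
```

Taking `output[:, -1]` instead would return the zero padding for the shorter third sequence, which is why the real lengths are needed.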

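Using the LSTM in practice

The discussion above explains how the LSTM works and what it outputs. In practice, if you compute with the hidden state of every time step, be sure to apply a mask, because the unpacked hidden states include the padded positions. It is possible to exploit the fact that the padded values are all 0 as a shortcut, but this is not recommended; use an explicit mask.

If you only use the last hidden state of the LSTM, you can use it directly.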
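A minimal sketch of masked pooling over time steps, using a hand-made tensor in place of real LSTM output so the numbers are easy to follow. The tensor values and variable names are illustrative. It also shows a case where relying on zero padding gives a wrong answer: max pooling over features that are all negative.

```python
import torch

# Stand-in for unpacked LSTM output: (batch=2, seq=3, hidden=2), zeros are padding
hidden = torch.tensor([
    [[1.0, -1.0], [2.0, -2.0], [3.0, -3.0]],   # length 3
    [[4.0, -4.0], [0.0, 0.0], [0.0, 0.0]],     # length 1
])
lengths = torch.tensor([3, 1])

# mask[b, t] is True where step t is a real token of sequence b
mask = torch.arange(hidden.size(1)).unsqueeze(0) < lengths.unsqueeze(1)  # (batch, seq)
mask3 = mask.unsqueeze(-1)                                               # (batch, seq, 1)

# Masked mean: sum over real steps, divide by the real length
masked_mean = (hidden * mask3).sum(dim=1) / lengths.unsqueeze(1)

# Masked max: padded steps are set to -inf so they can never win
masked_max = hidden.masked_fill(~mask3, float("-inf")).max(dim=1).values

# Naive max treats the padding zeros as real values
naive_max = hidden.max(dim=1).values

print(masked_mean)  # rows: [2, -2] and [4, -4]
print(masked_max)   # rows: [3, -1] and [4, -4]
print(naive_max)    # rows: [3, -1] and [4, 0]  <- 0 comes from padding
```

The naive maximum for the second sequence picks up a 0 that belongs to padding, whereas the masked version returns the true value -4.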

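Exercises

1. The sequence length is $100$, the embedding dimension of $x_t$ is $200$, and the hidden size is $300$. How many parameters does the LSTM have?
2. When variable-length sequences are padded, does the padding take part in the LSTM cell computation? If not, how does the LSTM handle the variable lengths?
3. Does the LSTM output (the hidden state at every time step) contain h_n? If not, explain why.
4. Using h_n, how do you extract the final hidden state of the last layer (this final hidden state is often used as the encoding of the whole sentence)? Without using h_n, how do you extract the final hidden state of the last layer from the LSTM output (the hidden state at every time step)?
5. Did you notice how the mask is used?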

Last edited: 2020.04.26 19:59:10
