Sequence models (RNNs) are very useful for natural language processing and other sequence tasks, because they have "memory".
Notation:
- Superscript [l]: denotes an object associated with the l-th layer.
- Superscript (i): denotes an object associated with the i-th example.
- Superscript <t>: denotes an object at the t-th time step.
- Subscript i: denotes the i-th entry of a vector.
Import the packages we need:
import numpy as np
from rnn_utils import *
1. Forward propagation for the basic RNN
Basic RNN structure (in this example Tx = Ty):
Implementation steps
- Implement the computations needed for one time step of the RNN.
- Implement a loop over Tx time steps in order to process the inputs one at a time.
1.1 Implementing an RNN cell
A recurrent neural network can be seen as the repetition of a single cell. We first implement the computations for a single time step.
Implementation steps
- Compute the hidden state: a<t> = tanh(Wax x<t> + Waa a<t-1> + ba)
- Using a<t>, compute the prediction: yhat<t> = softmax(Wya a<t> + by)
- Store (a<t>, a<t-1>, x<t>, parameters) in cache
- Return a<t>, yhat<t> and cache
We vectorize over m examples, so x<t> has shape (n_x, m) and a<t> has shape (n_a, m).
def rnn_cell_forward(xt, a_prev, parameters):
    """
    Arguments:
    xt -- shape (n_x, m)
    a_prev -- shape (n_a, m)
    parameters -- python dictionary containing:
        Wax -- shape (n_a, n_x)
        Waa -- shape (n_a, n_a)
        Wya -- shape (n_y, n_a)
        ba -- shape (n_a, 1)
        by -- shape (n_y, 1)
    Returns:
    a_next -- shape (n_a, m)
    yt_pred -- shape (n_y, m)
    cache -- tuple of values needed for the backward pass (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]
    # Compute the next hidden state and this time step's prediction
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    # Store values needed for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)
    return a_next, yt_pred, cache
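A quick shape check of rnn_cell_forward, using made-up dimensions (n_x = 3, n_a = 5, n_y = 2, m = 10) and random values; it assumes softmax from rnn_utils is in scope:
np.random.seed(1)
xt = np.random.randn(3, 10)        # (n_x, m)
a_prev = np.random.randn(5, 10)    # (n_a, m)
parameters = {"Wax": np.random.randn(5, 3),
              "Waa": np.random.randn(5, 5),
              "Wya": np.random.randn(2, 5),
              "ba": np.random.randn(5, 1),
              "by": np.random.randn(2, 1)}
a_next, yt_pred, cache = rnn_cell_forward(xt, a_prev, parameters)
print(a_next.shape)    # (5, 10)
print(yt_pred.shape)   # (2, 10)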
1.2 RNN forward propagation
An RNN is the repetition of the cell you just built. If the input sequence is 10 time steps long, the RNN cell is copied 10 times. Each cell takes the previous cell's hidden state a<t-1> and the current time step's input x<t> as inputs, and outputs this time step's hidden state a<t> and prediction y<t>.
Implementation steps
- Create a tensor of zeros (a) that will store all the hidden states computed by the RNN.
- Initialize the hidden state: a_next = a0.
- Loop over the time steps, with incrementing index t:
  - Update the next hidden state and the cache by running rnn_cell_forward
  - Store the hidden state in a
  - Store the prediction in y
  - Append cache to caches
- Return a, y_pred and caches
def rnn_forward(x, a0, parameters):
    """
    Arguments:
    x -- input data, shape (n_x, m, T_x)
    a0 -- initial hidden state, shape (n_a, m)
    parameters -- python dictionary containing:
        Waa -- shape (n_a, n_a)
        Wax -- shape (n_a, n_x)
        Wya -- shape (n_y, n_a)
        ba -- shape (n_a, 1)
        by -- shape (n_y, 1)
    Returns:
    a -- shape (n_a, m, T_x)
    y_pred -- shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass (list of caches, x)
    """
    # Initialize "caches", which will contain the list of all per-step caches
    caches = []
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape
    # Initialize "a" and "y_pred" with zeros
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))
    # Initialize a_next to a0
    a_next = a0
    # Loop over the time steps
    for t in range(T_x):
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        a[:, :, t] = a_next
        y_pred[:, :, t] = yt_pred
        caches.append(cache)
    # Store values needed for backward propagation in caches
    caches = (caches, x)
    return a, y_pred, caches
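The same kind of sanity check for rnn_forward, reusing the made-up dimensions above with a sequence of T_x = 4 time steps:
np.random.seed(1)
x = np.random.randn(3, 10, 4)      # (n_x, m, T_x)
a0 = np.random.randn(5, 10)        # (n_a, m)
parameters = {"Waa": np.random.randn(5, 5),
              "Wax": np.random.randn(5, 3),
              "Wya": np.random.randn(2, 5),
              "ba": np.random.randn(5, 1),
              "by": np.random.randn(2, 1)}
a, y_pred, caches = rnn_forward(x, a0, parameters)
print(a.shape)        # (5, 10, 4)
print(y_pred.shape)   # (2, 10, 4)
print(len(caches))    # 2: (list of per-step caches, x)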
2. Long Short-Term Memory (LSTM)
Basic LSTM structure:
About the gates
- Forget gate:
  ft = sigmoid(Wf [a<t-1>, x<t>] + bf)
- Update gate:
  it = sigmoid(Wi [a<t-1>, x<t>] + bi)
- Updating the cell: the candidate value is
  cct = tanh(Wc [a<t-1>, x<t>] + bc)
  and the new cell state is:
  c<t> = ft * c<t-1> + it * cct
- Output gate:
  ot = sigmoid(Wo [a<t-1>, x<t>] + bo)
  a<t> = ot * tanh(c<t>)
Here [a<t-1>, x<t>] denotes a<t-1> stacked vertically on top of x<t>.
2.1 The LSTM cell
Implementation steps
- Concatenate a<t-1> and x<t> vertically into a single matrix concat (a small illustration follows this list).
- Compute the gate and state formulas above.
- Compute the prediction y<t>.
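As a small illustration of the first step (not part of the assignment code), the vertical stacking can also be written with np.concatenate; the shapes below are made up:
a_prev = np.zeros((5, 10))                     # (n_a, m)
xt = np.zeros((3, 10))                         # (n_x, m)
concat = np.concatenate((a_prev, xt), axis=0)  # a_prev on top of xt
print(concat.shape)                            # (8, 10) = (n_a + n_x, m)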
def lstm_cell_forward(xt, a_prev, c_prev, parameters):
    """
    Arguments:
    xt -- shape (n_x, m)
    a_prev -- shape (n_a, m)
    c_prev -- shape (n_a, m)
    parameters -- python dictionary containing:
        Wf -- shape (n_a, n_a + n_x)
        bf -- shape (n_a, 1)
        Wi -- shape (n_a, n_a + n_x)
        bi -- shape (n_a, 1)
        Wc -- shape (n_a, n_a + n_x)
        bc -- shape (n_a, 1)
        Wo -- shape (n_a, n_a + n_x)
        bo -- shape (n_a, 1)
        Wy -- shape (n_y, n_a)
        by -- shape (n_y, 1)
    Returns:
    a_next -- shape (n_a, m)
    c_next -- shape (n_a, m)
    yt_pred -- shape (n_y, m)
    cache -- tuple of values needed for the backward pass
             (a_next, c_next, a_prev, c_prev, ft, it, cct, ot, xt, parameters)
    Note: ft / it / ot are the forget, update and output gates,
          cct is the candidate value (c tilde),
          c is the cell (memory) state.
    """
    # Retrieve parameters from "parameters"
    Wf = parameters["Wf"]
    bf = parameters["bf"]
    Wi = parameters["Wi"]
    bi = parameters["bi"]
    Wc = parameters["Wc"]
    bc = parameters["bc"]
    Wo = parameters["Wo"]
    bo = parameters["bo"]
    Wy = parameters["Wy"]
    by = parameters["by"]
    n_x, m = xt.shape
    n_y, n_a = Wy.shape
    # Concatenate a_prev and xt into a single matrix
    concat = np.zeros((n_a + n_x, m))
    concat[:n_a, :] = a_prev
    concat[n_a:, :] = xt
    # Compute ft, it, cct, c_next, ot, a_next using the formulas above
    ft = sigmoid(np.dot(Wf, concat) + bf)
    it = sigmoid(np.dot(Wi, concat) + bi)
    cct = np.tanh(np.dot(Wc, concat) + bc)
    c_next = ft * c_prev + it * cct
    ot = sigmoid(np.dot(Wo, concat) + bo)
    a_next = ot * np.tanh(c_next)
    # Compute the prediction of the LSTM cell
    yt_pred = softmax(np.dot(Wy, a_next) + by)
    # Store values needed for backward propagation in cache
    cache = (a_next, c_next, a_prev, c_prev, ft, it, cct, ot, xt, parameters)
    return a_next, c_next, yt_pred, cache
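A shape check for lstm_cell_forward with made-up dimensions (n_x = 3, n_a = 5, n_y = 2, m = 10); sigmoid and softmax are assumed to come from rnn_utils:
np.random.seed(1)
xt = np.random.randn(3, 10)
a_prev = np.random.randn(5, 10)
c_prev = np.random.randn(5, 10)
parameters = {"Wf": np.random.randn(5, 8), "bf": np.random.randn(5, 1),
              "Wi": np.random.randn(5, 8), "bi": np.random.randn(5, 1),
              "Wc": np.random.randn(5, 8), "bc": np.random.randn(5, 1),
              "Wo": np.random.randn(5, 8), "bo": np.random.randn(5, 1),
              "Wy": np.random.randn(2, 5), "by": np.random.randn(2, 1)}
a_next, c_next, yt_pred, cache = lstm_cell_forward(xt, a_prev, c_prev, parameters)
print(a_next.shape, c_next.shape, yt_pred.shape)   # (5, 10) (5, 10) (2, 10)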
2.2 LSTM forward propagation
Now that you have implemented one step of the LSTM, you can iterate it over a sequence of Tx inputs with a for loop.
def lstm_forward(x, a0, parameters):
    """
    Arguments:
    x -- shape (n_x, m, T_x)
    a0 -- shape (n_a, m)
    parameters -- python dictionary containing:
        Wf -- shape (n_a, n_a + n_x)
        bf -- shape (n_a, 1)
        Wi -- shape (n_a, n_a + n_x)
        bi -- shape (n_a, 1)
        Wc -- shape (n_a, n_a + n_x)
        bc -- shape (n_a, 1)
        Wo -- shape (n_a, n_a + n_x)
        bo -- shape (n_a, 1)
        Wy -- shape (n_y, n_a)
        by -- shape (n_y, 1)
    Returns:
    a -- shape (n_a, m, T_x)
    y -- shape (n_y, m, T_x)
    c -- shape (n_a, m, T_x)
    caches -- tuple of values needed for the backward pass (list of all the caches, x)
    """
    # Initialize "caches", which will contain the list of all per-step caches
    caches = []
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wy"].shape
    # Initialize "a", "c" and "y" with zeros
    a = np.zeros((n_a, m, T_x))
    c = np.zeros((n_a, m, T_x))
    y = np.zeros((n_y, m, T_x))
    # Initialize a_next and c_next
    a_next = a0
    c_next = np.zeros((n_a, m))
    # Loop over the time steps
    for t in range(T_x):
        a_next, c_next, yt, cache = lstm_cell_forward(x[:, :, t], a_next, c_next, parameters)
        a[:, :, t] = a_next
        y[:, :, t] = yt
        c[:, :, t] = c_next
        caches.append(cache)
    # Store values needed for backward propagation in caches
    caches = (caches, x)
    return a, y, c, caches
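And for lstm_forward, with the same made-up parameter shapes and T_x = 7 time steps:
np.random.seed(1)
x = np.random.randn(3, 10, 7)      # (n_x, m, T_x)
a0 = np.random.randn(5, 10)        # (n_a, m)
parameters = {"Wf": np.random.randn(5, 8), "bf": np.random.randn(5, 1),
              "Wi": np.random.randn(5, 8), "bi": np.random.randn(5, 1),
              "Wc": np.random.randn(5, 8), "bc": np.random.randn(5, 1),
              "Wo": np.random.randn(5, 8), "bo": np.random.randn(5, 1),
              "Wy": np.random.randn(2, 5), "by": np.random.randn(2, 1)}
a, y, c, caches = lstm_forward(x, a0, parameters)
print(a.shape, y.shape, c.shape)   # (5, 10, 7) (2, 10, 7) (5, 10, 7)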
3. Backward propagation
In practice, the backward pass is implemented with a deep learning framework (automatic differentiation), so it is not written out by hand here...
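For intuition only, here is a minimal hand-written sketch of the backward pass through a single basic RNN cell, using the cache produced by rnn_cell_forward above. This is not part of the original assignment; the function and gradient names (rnn_cell_backward_sketch, da_next, dWax, ...) are illustrative:
def rnn_cell_backward_sketch(da_next, cache):
    # Unpack the cache saved by rnn_cell_forward
    a_next, a_prev, xt, parameters = cache
    Wax, Waa = parameters["Wax"], parameters["Waa"]
    # Backprop through tanh: since a_next = tanh(z), dz = (1 - a_next**2) * da_next
    dtanh = (1 - a_next ** 2) * da_next
    # Gradients of z = Wax.xt + Waa.a_prev + ba with respect to each input
    dxt = np.dot(Wax.T, dtanh)
    dWax = np.dot(dtanh, xt.T)
    da_prev = np.dot(Waa.T, dtanh)
    dWaa = np.dot(dtanh, a_prev.T)
    dba = np.sum(dtanh, axis=1, keepdims=True)
    return {"dxt": dxt, "da_prev": da_prev, "dWax": dWax, "dWaa": dWaa, "dba": dba}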
Notes
The code of the rnn_utils file is as follows:
import numpy as np

def softmax(x):
    # Column-wise softmax, shifted by the max for numerical stability
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
def initialize_adam(parameters):
    """
    Initializes v and s as two python dictionaries with:
        - keys: "dW1", "db1", ..., "dWL", "dbL"
        - values: numpy arrays of zeros of the same shape as the corresponding gradients/parameters.
    Arguments:
    parameters -- python dictionary containing your parameters.
        parameters["W" + str(l)] = Wl
        parameters["b" + str(l)] = bl
    Returns:
    v -- python dictionary that will contain the exponentially weighted average of the gradient.
        v["dW" + str(l)] = ...
        v["db" + str(l)] = ...
    s -- python dictionary that will contain the exponentially weighted average of the squared gradient.
        s["dW" + str(l)] = ...
        s["db" + str(l)] = ...
    """
    L = len(parameters) // 2  # number of layers in the neural network
    v = {}
    s = {}
    # Initialize v, s. Input: "parameters". Outputs: "v, s".
    for l in range(L):
        v["dW" + str(l + 1)] = np.zeros(parameters["W" + str(l + 1)].shape)
        v["db" + str(l + 1)] = np.zeros(parameters["b" + str(l + 1)].shape)
        s["dW" + str(l + 1)] = np.zeros(parameters["W" + str(l + 1)].shape)
        s["db" + str(l + 1)] = np.zeros(parameters["b" + str(l + 1)].shape)
    return v, s
def update_parameters_with_adam(parameters, grads, v, s, t, learning_rate=0.01,
                                beta1=0.9, beta2=0.999, epsilon=1e-8):
    """
    Update parameters using Adam
    Arguments:
    parameters -- python dictionary containing your parameters:
        parameters['W' + str(l)] = Wl
        parameters['b' + str(l)] = bl
    grads -- python dictionary containing your gradients for each parameter:
        grads['dW' + str(l)] = dWl
        grads['db' + str(l)] = dbl
    v -- Adam variable, moving average of the first gradient, python dictionary
    s -- Adam variable, moving average of the squared gradient, python dictionary
    t -- number of Adam update steps taken so far (used for bias correction)
    learning_rate -- the learning rate, scalar
    beta1 -- exponential decay hyperparameter for the first moment estimates
    beta2 -- exponential decay hyperparameter for the second moment estimates
    epsilon -- hyperparameter preventing division by zero in Adam updates
    Returns:
    parameters -- python dictionary containing your updated parameters
    v -- Adam variable, moving average of the first gradient, python dictionary
    s -- Adam variable, moving average of the squared gradient, python dictionary
    """
    L = len(parameters) // 2   # number of layers in the neural network
    v_corrected = {}           # bias-corrected first moment estimate
    s_corrected = {}           # bias-corrected second moment estimate
    # Perform the Adam update on all parameters
    for l in range(L):
        # Moving average of the gradients. Inputs: "v, grads, beta1". Output: "v".
        v["dW" + str(l + 1)] = beta1 * v["dW" + str(l + 1)] + (1 - beta1) * grads["dW" + str(l + 1)]
        v["db" + str(l + 1)] = beta1 * v["db" + str(l + 1)] + (1 - beta1) * grads["db" + str(l + 1)]
        # Compute bias-corrected first moment estimate. Inputs: "v, beta1, t". Output: "v_corrected".
        v_corrected["dW" + str(l + 1)] = v["dW" + str(l + 1)] / (1 - beta1 ** t)
        v_corrected["db" + str(l + 1)] = v["db" + str(l + 1)] / (1 - beta1 ** t)
        # Moving average of the squared gradients. Inputs: "s, grads, beta2". Output: "s".
        s["dW" + str(l + 1)] = beta2 * s["dW" + str(l + 1)] + (1 - beta2) * (grads["dW" + str(l + 1)] ** 2)
        s["db" + str(l + 1)] = beta2 * s["db" + str(l + 1)] + (1 - beta2) * (grads["db" + str(l + 1)] ** 2)
        # Compute bias-corrected second raw moment estimate. Inputs: "s, beta2, t". Output: "s_corrected".
        s_corrected["dW" + str(l + 1)] = s["dW" + str(l + 1)] / (1 - beta2 ** t)
        s_corrected["db" + str(l + 1)] = s["db" + str(l + 1)] / (1 - beta2 ** t)
        # Update parameters. Inputs: "parameters, learning_rate, v_corrected, s_corrected, epsilon". Output: "parameters".
        parameters["W" + str(l + 1)] = parameters["W" + str(l + 1)] - learning_rate * v_corrected["dW" + str(l + 1)] / np.sqrt(s_corrected["dW" + str(l + 1)] + epsilon)
        parameters["b" + str(l + 1)] = parameters["b" + str(l + 1)] - learning_rate * v_corrected["db" + str(l + 1)] / np.sqrt(s_corrected["db" + str(l + 1)] + epsilon)
    return parameters, v, s
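A toy usage of the two Adam helpers with a single made-up layer (W1, b1), just to show how they fit together:
np.random.seed(1)
parameters = {"W1": np.random.randn(2, 3), "b1": np.random.randn(2, 1)}
grads = {"dW1": np.random.randn(2, 3), "db1": np.random.randn(2, 1)}
v, s = initialize_adam(parameters)
parameters, v, s = update_parameters_with_adam(parameters, grads, v, s, t=1)
print(parameters["W1"].shape)   # (2, 3), after one Adam step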