來源:https://hyunhp.tistory.com/448
1. Intuitive diagrams of the RNN cell and RNN
RNN → Recurrent Neural Network
You can think of the recurrent neural network as the repeated use of a single cell that performs the computations for a single time step.
2. Dimensions of the input x
2.1 Input with $n_x$ number of units
● For a single time step of a single input example, $x^{\langle t \rangle}$ is a one-dimensional input vector.
● Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has 5000 units, so $x^{\langle t \rangle}$ could have the shape (5000,).
● The notation $n_x$ is used here to denote the number of units in a single time step of a single training example.
2.2 Time steps of size $T_x$
● A recurrent neural network has multiple time steps, which you'll index with t.
● In the lessons, you saw a single training example $x$ consisting of multiple time steps $T_x$. In this notebook, $T_x$ will denote the number of time steps in the longest sequence.
2.3 Batches of size m
● Let's say we have mini-batches, each with 20 training examples.
● To benefit from vectorization, you'll stack 20 columns of $x^{\langle t \rangle}$ examples.
● For example, this tensor has the shape (5000, 20, 10).
● You'll use m to denote the number of training examples.
● So, the shape of a mini-batch is $(n_x, m)$.
2.4 3D tensor of shape $(n_x, m, T_x)$
● The 3-dimensional tensor x of shape $(n_x, m, T_x)$ represents the input x that is fed into the RNN.
2.5 Taking a 2D slice for each time step: $x^{\langle t \rangle}$
● At each time step, you'll use a mini-batch of training examples (not just a single example).
● So, for each time step t, you'll use a 2D slice of shape $(n_x, m)$.
● This 2D slice is referred to as $x^{\langle t \rangle}$. The variable name in the code is xt (see the short numpy sketch below).
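To make these shapes concrete, here is a minimal numpy sketch using the example dimensions from above ($n_x$ = 5000, m = 20, $T_x$ = 10); the variable names are illustrative only:

import numpy as np

n_x, m, T_x = 5000, 20, 10        # units per input vector, examples per mini-batch, time steps
x = np.zeros((n_x, m, T_x))       # 3D input tensor fed into the RNN

t = 3                             # pick any time step index
xt = x[:, :, t]                   # 2D slice x<t> for time step t
print(xt.shape)                   # (5000, 20), i.e. (n_x, m)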
3. Dimensions of the hidden state a
The activation $a^{\langle t \rangle}$ that is passed to the RNN from one time step to another is called a "hidden state".
3.1 Dimensions of hidden state a
● Similar to the input tensor x, the hidden state for a single training example is a vector of length $n_a$.
● If you include a mini-batch of m training examples, the shape of a mini-batch is $(n_a, m)$.
● When you include the time step dimension, the shape of the hidden state is $(n_a, m, T_x)$.
● You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor.
● This 2D slice is referred to as $a^{\langle t \rangle}$.
● In the code, the variable names used are either a_prev or a_next, depending on the function being implemented.
● The shape of this 2D slice is $(n_a, m)$.
4. Dimensions of the prediction $\hat{y}$
● Similar to the inputs and hidden states, $\hat{y}$ is a 3D tensor of shape $(n_y, m, T_y)$.
      ■ $n_y$: number of units in the vector representing the prediction
      ■ m: number of examples in a mini-batch
      ■ $T_y$: number of time steps in the prediction
● For a single time step t, a 2D slice $\hat{y}^{\langle t \rangle}$ has shape $(n_y, m)$.
● In the code, the variable names are (a short shape sketch follows below):
      ■ y_pred: $\hat{y}$
      ■ yt_pred: $\hat{y}^{\langle t \rangle}$
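Similarly, here is a small sketch of the hidden state and prediction tensors from Sections 3 and 4, with hypothetical sizes ($n_a$ = 5, $n_y$ = 2, m = 20, $T_x$ = $T_y$ = 10) chosen only for illustration:

import numpy as np

n_a, n_y, m, T_x = 5, 2, 20, 10    # hidden units, prediction units, batch size, time steps (hypothetical)
a = np.zeros((n_a, m, T_x))        # hidden states for every time step
y_pred = np.zeros((n_y, m, T_x))   # predictions for every time step (here T_y equals T_x)

t = 3
a_t = a[:, :, t]                   # a<t>, shape (n_a, m)
yt_pred = y_pred[:, :, t]          # y_hat<t>, shape (n_y, m)
print(a_t.shape, yt_pred.shape)    # (5, 20) (2, 20)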
5. Building the RNN
Here is how you can implement an RNN:
Steps:
      ● Implement the calculations needed for one time step of the RNN.
      ● Implement a loop over time steps in order to process all the inputs, one at a time.
About the RNN cell
You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step.
RNN cell versus rnn_cell_forward:
● Note that an RNN cell outputs the hidden state $a^{\langle t \rangle}$.
      ■ The RNN cell is shown in the figure as the inner box with solid lines.
● The function that you'll implement, rnn_cell_forward, also calculates the prediction $\hat{y}^{\langle t \rangle}$.
      ■ rnn_cell_forward is shown in the figure as the outer box with dashed lines.
The following figure describes the operations for a single time step of an RNN cell:
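For reference, the two computations carried out inside the cell (they are exactly what the two graded lines in the code below implement) are:

$$a^{\langle t \rangle} = \tanh(W_{ax}\, x^{\langle t \rangle} + W_{aa}\, a^{\langle t-1 \rangle} + b_a)$$
$$\hat{y}^{\langle t \rangle} = \mathrm{softmax}(W_{ya}\, a^{\langle t \rangle} + b_y)$$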
The code is as follows:
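Note that the graded cell below uses np and a softmax helper that are not defined anywhere in this post; in the original notebook they come from its import cell. A minimal sketch so the snippet runs standalone (this softmax definition is an assumption: a standard, numerically stable column-wise softmax):

import numpy as np

def softmax(z):
    # column-wise softmax over the n_y dimension (axis 0);
    # assumed helper, not taken verbatim from the original notebook
    e_z = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e_z / np.sum(e_z, axis=0, keepdims=True)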
# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward
def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]
    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###
    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)
    return a_next, yt_pred, cache
Run the code above:
def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)
# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)
6. RNN Forward Pass
A recurrent neural network (RNN) is the repetition of the RNN cell that you've just built.
      ● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times.
Each cell takes two inputs at each time step:
      ● $a^{\langle t-1 \rangle}$: the hidden state from the previous cell
      ● $x^{\langle t \rangle}$: the current time step's input data
It has two outputs at each time step:
      ● a hidden state ($a^{\langle t \rangle}$)
      ● a prediction ($\hat{y}^{\langle t \rangle}$)
The weights and biases ($W_{ax}$, $W_{aa}$, $W_{ya}$, $b_a$, $b_y$) are re-used each time step.
      ● They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary.
(This point is not mentioned explicitly in the code above.)
# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward
def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))
    # Initialize a_next (≈1 line)
    a_next = a0
    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches
Run the code above:
def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))

# UNIT TEST
rnn_forward_test(rnn_forward)
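A note on the caches value printed above: as built in rnn_forward, caches is the tuple (list of per-step caches, x), so caches_tmp[1] is simply the stored input x and len(caches_tmp) is 2. A minimal sketch of unpacking it, assuming these lines were added inside rnn_forward_test right after the call:

step_caches, x_stored = caches_tmp          # caches = (list of per-step caches, x)
print(len(step_caches))                     # T_x = 4, one cache tuple per time step
print(x_stored.shape)                       # (3, 10, 4), the same x_tmp that was passed in
a_next_0, a_prev_0, xt_0, params_0 = step_caches[0]   # contents of the first step's cache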
7. Summary
You've successfully built the forward propagation of a recurrent neural network from scratch.
Situations when this RNN will perform better:
● This will work well enough for some applications, but it suffers from vanishing gradients.
● The RNN works best when each output $\hat{y}^{\langle t \rangle}$ can be estimated using "local" context.
● "Local" context refers to information that is close to the prediction's time step t.
● More formally, local context refers to inputs $x^{\langle t' \rangle}$ and predictions $\hat{y}^{\langle t \rangle}$ where $t'$ is close to $t$.
What you should remember:
● The recurrent neural network, or RNN, is essentially the repeated use of a single cell.
● A basic RNN reads inputs one at a time, and remembers information through the hidden layer activations (hidden states) that are passed from one time step to the next.
      ■ The time step dimension determines how many times to re-use the RNN cell.
● Each cell takes two inputs at each time step:
      ■ the hidden state from the previous cell
      ■ the current time step's input data
● Each cell has two outputs at each time step:
      ■ a hidden state
      ■ a prediction