2023-09-24 01 Forward Propagation of an RNN

Source: https://hyunhp.tistory.com/448

1. The RNN Cell and the RNN: Intuition

RNN stands for Recurrent Neural Network.

You can think of the recurrent neural network as the repeated use of a single cell that performs the computations for a single time step.

2. Dimensions of the Input x

2.1 Input with n_{x} number of units

● For a single time step of a single input example, x^{(i)<t>} is a one-dimensional input vector.

● Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has n_{x}=5000 units, so x^{(i)<t>} could have the shape (5000,) (a small numpy sketch follows below).

● The notation n_{x} is used here to denote the number of units in a single time step of a single training example.
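As a quick illustration of what a single one-hot encoded time step looks like, here is a minimal numpy sketch; the vocabulary size of 5000 follows the example above, and the word index 1234 is a made-up value for illustration only:

import numpy as np

n_x = 5000                # vocabulary size from the example above
word_index = 1234         # hypothetical index of a word in the vocabulary

x_it = np.zeros(n_x)      # one time step of one example: shape (n_x,)
x_it[word_index] = 1.0    # set the entry for this word to 1

print(x_it.shape)         # (5000,)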

2.2 Time Steps of size T_{x}

● A recurrent neural network has multiple time steps, which you'll index with t.

● In the lessons, you saw a single training example x^{(i)} consisting of multiple time steps T_{x}. In this notebook, T_{x} will denote the number of time steps in the longest sequence.

2.3 Batches of size m

● Let's say we have mini-batches, each with 20 training examples.

● To benefit from vectorization, you'll stack 20 columns of x^{(i)} examples.

● For example, with n_{x}=5000 units, m=20 examples, and 10 time steps, the tensor has the shape (5000, 20, 10).

● You'll use m to denote the number of training examples.

● So, the shape of a mini-batch is (n_{x}, m).

2.4 3D Tensor of shape (n_{x}, m, T_{x})

● The 3-dimensional tensor x of shape (n_{x}, m, T_{x}) represents the input x that is fed into the RNN.

2.5 Take a 2D slice for each time step: x^{<t>}

● At each time step, you'll use a mini-batch of training examples (not just a single example).

● So, for each time step t, you'll use a 2D slice of shape (n_{x}, m).

● This 2D slice is referred to as x^{<t>}. The variable name in the code is xt (see the short numpy sketch below).
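A minimal numpy sketch of these shapes (the sizes 5000, 20, and 10 follow the running example; the random contents are purely illustrative):

import numpy as np

n_x, m, T_x = 5000, 20, 10
x = np.random.randn(n_x, m, T_x)   # 3D input tensor of shape (n_x, m, T_x)

t = 0
xt = x[:, :, t]                    # 2D slice x^{<t>} for one time step
print(xt.shape)                    # (5000, 20)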

3. Dimensions of the Hidden State a

The activation a^{<t>} that is passed to the RNN from one time step to another is called a "hidden state".

3.1 Dimensions of hidden state a

● Similar to the input tensor x, the hidden state for a single training example is a vector of length n_{a}.

● If you include a mini-batch of m training examples, the shape of a mini-batch is (n_{a}, m).

● When you include the time step dimension, the shape of the hidden state is (n_{a}, m, T_{x}).

● You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor.

● This 2D slice is referred to as a^{<t>}.

● In the code, the variable names used are either a_prev or a_next, depending on the function being implemented.

● The shape of this 2D slice is (n_{a}, m) (see the sketch below).
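The same slicing pattern applies to the hidden-state tensor; a short self-contained sketch, where n_a = 5 is an arbitrary illustrative size:

import numpy as np

n_a, m, T_x = 5, 20, 10
a = np.zeros((n_a, m, T_x))        # hidden states for every time step
a_next = np.random.randn(n_a, m)   # stand-in for a computed hidden state a^{<t>}
a[:, :, 0] = a_next                # store the 2D slice at time step t = 0
print(a[:, :, 0].shape)            # (5, 20)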

4. Dimensions of the Prediction \hat{y}

● Similar to the inputs and hidden states, \hat{y} is a 3D tensor of shape (n_{y}, m, T_{y}).

            ■ n_{y}: number of units in the vector representing the prediction

            ■ m: number of examples in a mini-batch

            ■ T_{y}: number of time steps in the prediction

● For a single time step t, a 2D slice \hat{y}^{<t>} has shape (n_{y}, m).

● In the code, the variable names are:

            ● y_pred: \hat{y}

            ● yt_pred: \hat{y}^{<t>}

5. Building the RNN

Here is how you can implement an RNN:

Steps:

● Implement the calculations needed for one time step of the RNN.

● Implement a loop over T_{x} time steps in order to process all the inputs, one at a time.

About the RNN Cell

You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step.

The RNN cell versus rnn_cell_forward:

● Note that an RNN cell outputs the hidden state a^{<t>}.

      ■ The RNN cell is shown in the figure as the inner box with solid lines.

● The function that you'll implement, rnn_cell_forward, also calculates the prediction \hat{y}^{<t>}.

      ■ rnn_cell_forward is shown in the figure as the outer box with dashed lines.

The following figure describes the operations for a single time step of an RNN cell.
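Since the figure is not reproduced here, the two computations it depicts (and that the code below implements) are written out for reference:

a^{<t>} = tanh(W_{ax} x^{<t>} + W_{aa} a^{<t-1>} + b_{a})

\hat{y}^{<t>} = softmax(W_{ya} a^{<t>} + b_{y})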

The code is as follows:

import numpy as np  # in the original notebook, numpy and the softmax helper are imported in an earlier cell

# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward

def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache
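Note that softmax is not defined in this cell; in the original course notebook it comes from a helper module (rnn_utils). If you are running the code on its own, a minimal column-wise softmax like the following sketch is enough (an assumption about the helper, not the notebook's exact implementation):

import numpy as np

def softmax(x):
    # column-wise softmax with max-subtraction for numerical stability
    e_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e_x / e_x.sum(axis=0, keepdims=True)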

Run the code above:

def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)

# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)

6. The RNN Forward Pass

● A recurrent neural network (RNN) is a repetition of the RNN cell that you've just built.

      ● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times.

● Each cell takes two inputs at each time step:

      ■ a^{<t-1>}: the hidden state from the previous cell

      ■ x^{<t>}: the current time step's input data

● It has two outputs at each time step:

      ● A hidden state (a^{<t>})

      ● A prediction (\hat{y}^{<t>})

● The weights and biases (W_{aa}, b_{a}, W_{ax}, b_{x}) are re-used at each time step.

      ● They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary.

Note: b_{x} is not mentioned in the code above; the parameters dictionary only contains the biases b_{a} and b_{y}.

# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward

def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))

    # Initialize a_next (≈1 line)
    a_next = a0

    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches

Run the code above:

def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))

# UNIT TEST
rnn_forward_test(rnn_forward)

7. Summary

You've successfully built the forward propagation of a recurrent network from scratch.

Situations when this RNN will perform better:

● This will work well enough for some applications, but it suffers from vanishing gradients.

● The RNN works best when each output \hat{y}^{<t>} can be estimated using "local" context.

● "Local" context refers to information that is close to the prediction's time step t.

● More formally, local context refers to inputs x^{<t_j>} and predictions \hat{y}^{<t>} where t_j is close to t.

What you should remember:

● The recurrent neural network, or RNN, is essentially the repeated use of a single cell.

● A basic RNN reads inputs one at a time, and remembers information through the hidden-layer activations (hidden states) that are passed from one time step to the next.

      ■ The time-step dimension determines how many times to re-use the RNN cell.

● Each cell takes two inputs at each time step:

      ■ The hidden state from the previous cell

      ■ The current time step's input data

● Each cell has two outputs at each time step:

      ■ A hidden state

      ■ A prediction
