Backpropagation
Computational Graphs
- Solving problems with a computational graph
- Construct the computational graph
- Compute on the graph from left to right (forward propagation)
- Local computation
- The final result is obtained by propagating "local computations".
- Local computation means that, no matter what happens elsewhere in the graph, a node can produce its next output using only the information directly connected to it
- Backward propagation
- Backward propagation passes "local derivatives" from right to left
The Chain Rule
- Composite function: a function built up from multiple functions
- If a function is expressed as a composite function, its derivative can be written as the product of the derivatives of the constituent functions
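As a small illustration, take $z = t^2$ with $t = x + y$; the chain rule gives $\frac{\partial z}{\partial x} = \frac{\partial z}{\partial t}\frac{\partial t}{\partial x} = 2t \cdot 1 = 2(x+y)$. A minimal numerical check (a sketch; the helper name `numerical_diff` is illustrative):

```python
def numerical_diff(f, x, h=1e-4):
    # Central-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

x, y = 2.0, 3.0
analytic = 2 * (x + y)                            # chain-rule result: 10.0
numeric = numerical_diff(lambda v: (v + y) ** 2, x)
print(analytic, round(numeric, 6))
```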
Backward Propagation
Backward propagation at an add node
- An add node's backward pass just multiplies by 1: the upstream value flows to the downstream node unchanged
Backward propagation at a multiply node
- A multiply node's backward pass multiplies the upstream value by the forward-pass input signals with their roles "swapped" before passing it downstream: the gradient flowing toward x is scaled by y, and the gradient flowing toward y is scaled by x
- To implement the multiply node's backward pass, the forward-pass input signals must therefore be stored
Implementing Simple Layers
# Multiply layer
class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None
    def forward(self, x, y):
        self.x = x            # keep the forward inputs for the backward pass
        self.y = y
        out = x * y
        return out
    def backward(self, dout):
        dx = dout * self.y    # swap x and y
        dy = dout * self.x
        return dx, dy
# Add layer
class AddLayer:
    def __init__(self):
        pass                  # nothing needs to be stored
    def forward(self, x, y):
        out = x + y
        return out
    def backward(self, dout):
        dx = dout * 1         # pass the upstream value through unchanged
        dy = dout * 1
        return dx, dy
# Example: buying 2 apples and 3 oranges
apple = 100
apple_num = 2
orange = 150
orange_num = 3
tax = 1.1
# layers
mul_apple_layer = MulLayer()
mul_orange_layer = MulLayer()
add_apple_orange_layer = AddLayer()
mul_tax_layer = MulLayer()
# forward
apple_price = mul_apple_layer.forward(apple, apple_num)
orange_price = mul_orange_layer.forward(orange, orange_num)
all_price = add_apple_orange_layer.forward(apple_price, orange_price)
price = mul_tax_layer.forward(all_price, tax)
# backward (call the layers in reverse order of the forward pass)
dprice = 1
dall_price, dtax = mul_tax_layer.backward(dprice)
dapple_price, dorange_price = add_apple_orange_layer.backward(dall_price)
dorange, dorange_num = mul_orange_layer.backward(dorange_price)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)
print(price)
print(dapple_num, dapple, dorange, dorange_num, dtax)
715.0000000000001
110.00000000000001 2.2 3.3000000000000003 165.0 650
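The analytic gradients printed above can be cross-checked numerically: treat price = (apple·apple_num + orange·orange_num)·tax as a function of one input at a time and differentiate by central differences (a sketch; the helper name is made up here):

```python
def numerical_diff(f, x, h=1e-4):
    # Central-difference approximation of the derivative
    return (f(x + h) - f(x - h)) / (2 * h)

# price = (apple * apple_num + orange * orange_num) * tax
d_apple = numerical_diff(lambda a: (a * 2 + 150 * 3) * 1.1, 100)
d_tax = numerical_diff(lambda t: (100 * 2 + 150 * 3) * t, 1.1)
print(round(d_apple, 6), round(d_tax, 6))  # ≈ 2.2 and ≈ 650, matching dapple and dtax
```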
Implementing Activation-Function Layers
The ReLU layer
- If the input x during forward propagation was greater than 0, the backward pass sends the upstream value downstream unchanged.
- If the input x during forward propagation was less than or equal to 0, the signal passed downstream in the backward pass stops here (it becomes 0)
# ReLU layer
# forward and backward take NumPy arrays
import numpy as np

class Relu:
    def __init__(self):
        self.mask = None
    def forward(self, x):
        self.mask = (x <= 0)  # True where the forward input is non-positive
        out = x.copy()
        out[self.mask] = 0
        return out
    def backward(self, dout):
        dout[self.mask] = 0   # block the gradient where the forward input was <= 0
        dx = dout
        return dx
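A quick standalone run of the Relu layer (the class is repeated here so the snippet executes on its own; the sample values are arbitrary):

```python
import numpy as np

class Relu:
    def __init__(self):
        self.mask = None
    def forward(self, x):
        self.mask = (x <= 0)  # remember where the input was non-positive
        out = x.copy()
        out[self.mask] = 0
        return out
    def backward(self, dout):
        dout[self.mask] = 0   # gradient is blocked where forward input was <= 0
        return dout

relu = Relu()
x = np.array([[1.0, -0.5], [-2.0, 3.0]])
out = relu.forward(x)                 # negatives clipped: [[1. 0.] [0. 3.]]
dx = relu.backward(np.ones_like(x))   # gradient masked:   [[1. 0.] [0. 1.]]
print(out)
print(dx)
```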
The Sigmoid layer
- Forward propagation, as a computational graph for $y = \frac{1}{1+\exp(-x)}$:
- "×" node: multiplies the input x by -1
- "exp" node: computes $\exp(-x)$
- "+" node: adds 1
- "/" node: takes the reciprocal, producing the output y
- Backward propagation
- "/" node: for $y = \frac{1}{x}$, the derivative is $\frac{\partial y}{\partial x} = -\frac{1}{x^2} = -y^2$
    * so the backward pass multiplies the upstream value by $-y^2$ before passing it downstream
- "+" node: passes the upstream value downstream unchanged
- "exp" node: multiplies the upstream value by its forward-pass output, $\exp(-x)$
- "×" node: multiplies by the forward inputs swapped, i.e. by -1
- Simplified backward pass: the whole layer's gradient reduces to $\frac{\partial L}{\partial y} y(1-y)$
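The simplified form can be checked algebraically, using $y = \frac{1}{1+\exp(-x)}$ and hence $1-y = \frac{\exp(-x)}{1+\exp(-x)}$:

$$
\frac{\partial L}{\partial y}\, y^2 \exp(-x)
= \frac{\partial L}{\partial y}\, \frac{1}{1+\exp(-x)} \cdot \frac{\exp(-x)}{1+\exp(-x)}
= \frac{\partial L}{\partial y}\, y\,(1-y)
$$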
# Sigmoid layer
import numpy as np

class Sigmoid:
    def __init__(self):
        self.out = None
    def forward(self, x):
        out = 1 / (1 + np.exp(-x))
        self.out = out        # keep the forward output for the backward pass
        return out
    def backward(self, dout):
        dx = dout * (1.0 - self.out) * self.out
        return dx
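As a sanity check, the analytic gradient y(1-y) can be compared against a central-difference derivative (the class is repeated here so the snippet runs standalone; the sample input is arbitrary):

```python
import numpy as np

class Sigmoid:
    def __init__(self):
        self.out = None
    def forward(self, x):
        self.out = 1 / (1 + np.exp(-x))
        return self.out
    def backward(self, dout):
        return dout * (1.0 - self.out) * self.out

sig = Sigmoid()
x = np.array([0.5])
sig.forward(x)
dx = sig.backward(np.array([1.0]))   # analytic: y * (1 - y)

h = 1e-4                              # central-difference check
num = (1 / (1 + np.exp(-(x + h))) - 1 / (1 + np.exp(-(x - h)))) / (2 * h)
print(float(dx), float(num))          # both ≈ 0.235
```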
Implementing the Affine/Softmax Layers
The Affine layer
- The matrix product computed in a neural network's forward propagation is called an "affine transformation" in geometry.
- The processing that performs this affine transformation is implemented as the "Affine layer"
- What flows between the nodes here are matrices, not scalars
# Affine layer
import numpy as np

class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None
        self.dW = None
        self.db = None
    def forward(self, x):
        self.x = x
        out = np.dot(x, self.W) + self.b
        return out
    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)   # was self.x.t: ndarray has .T, not .t
        self.db = np.sum(dout, axis=0)     # sum the bias gradient over the batch
        return dx
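To confirm that each gradient has the same shape as the corresponding forward input, here is a standalone run of the Affine layer (repeated with the `.T` fix so it executes on its own; the random data is purely illustrative):

```python
import numpy as np

class Affine:
    def __init__(self, W, b):
        self.W, self.b = W, b
        self.x = None
        self.dW, self.db = None, None
    def forward(self, x):
        self.x = x
        return np.dot(x, self.W) + self.b
    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)   # gradient shapes mirror W and x
        self.db = np.sum(dout, axis=0)
        return dx

rng = np.random.default_rng(0)
affine = Affine(rng.standard_normal((3, 2)), np.zeros(2))
x = rng.standard_normal((4, 3))            # batch of 4 samples, 3 features
out = affine.forward(x)
dx = affine.backward(np.ones((4, 2)))
print(out.shape, dx.shape, affine.dW.shape, affine.db.shape)
# (4, 2) (4, 3) (3, 2) (2,)
```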
The Softmax-with-Loss Layer
P5-29.png
P5-30.png
- The backward pass of the Softmax layer yields $(y_1 - t_1,\ y_2 - t_2,\ y_3 - t_3)$. Since $(y_1, y_2, y_3)$ is the Softmax layer's output and $(t_1, t_2, t_3)$ is the supervised data, $(y_1 - t_1,\ y_2 - t_2,\ y_3 - t_3)$ is exactly the difference between the Softmax output and the labels. <font color="red">The backward pass of the neural network hands this difference, the error, to the preceding layers.</font> This is an important property of neural-network learning
from sourcecode.common.functions import cross_entropy_error, softmax

# Softmax-with-Loss layer
class SoftmaxWithLoss:
    def __init__(self):
        self.loss = None  # loss value
        self.y = None     # output of softmax
        self.t = None     # supervised data (one-hot vector)
    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        self.loss = cross_entropy_error(self.y, self.t)
        return self.loss
    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        dx = (self.y - self.t) / batch_size
        return dx  # what is propagated backward is the per-example error
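The layer above can be exercised standalone with minimal `softmax` and `cross_entropy_error` stand-ins (a sketch, not the book's common module; it assumes x is 2-D and t is one-hot):

```python
import numpy as np

def softmax(x):
    # Row-wise softmax, with max-subtraction for numerical stability
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_error(y, t):
    # Mean cross-entropy over the batch; t is one-hot
    return -np.sum(t * np.log(y + 1e-7)) / y.shape[0]

class SoftmaxWithLoss:
    def __init__(self):
        self.y = None
        self.t = None
    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        return cross_entropy_error(self.y, self.t)
    def backward(self, dout=1):
        batch_size = self.t.shape[0]
        return (self.y - self.t) / batch_size

layer = SoftmaxWithLoss()
x = np.array([[0.3, 2.9, 4.0]])
t = np.array([[0.0, 0.0, 1.0]])
loss = layer.forward(x, t)
dx = layer.backward()
print(dx)          # y - t, scaled by 1/batch_size
print(dx.sum())    # each row of (y - t) sums to ~0, since softmax rows sum to 1
```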