Introduction
A perceptron is a single-layer neural network, the simplest form of a neural network; a multi-layer perceptron is what we usually call a (feed-forward) neural network. The perceptron is a supervised binary linear classifier, which means the training dataset must be linearly separable.
The perceptron consists of four parts, combined in the sketch after this list:
- Input: an n-dimensional feature vector x
- Output: a label in {-1, +1}
- Weights w, bias b, and the net sum w·x + b
- Activation function: the sign function
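A minimal sketch of how these four parts fit together, assuming NumPy (the function name is illustrative):

import numpy as np

def perceptron_forward(x, w, b):
    '''Forward pass: net sum followed by the sign activation.'''
    net_sum = np.dot(w, x) + b          # weights, bias, net sum
    return 1 if net_sum >= 0 else -1    # sign activation: output in {-1, +1}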
How does it work?
Definitions: sign(z) = 1 if z ≥ 0, −1 otherwise.
Inputs: training examples (x_t, y_t) for t ∈ {1, …, n}, where x_t ∈ R^d is an input and y_t ∈ {−1, +1} is a label.
Initialization: w = [0, 0, …, 0], b = 0, rate = 0.0001
Algorithm:
while there exists (x, y) with y·(w·x + b) ≤ 0 do
    (x, y) ← random pick from the misclassified set M
    w ← w + rate·y·x
    b ← b + rate·y
end
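For concreteness, here is one update step under the initialization above; the misclassified point (x, y) is made up for illustration:

import numpy as np

w, b, rate = np.zeros(2), 0.0, 0.0001
x, y = np.array([1.0, 2.0]), 1    # y*(w·x + b) = 0 <= 0, so (x, y) is misclassified
w = w + rate * y * x              # w <- w + rate*y*x
b = b + rate * y                  # b <- b + rate*y
print(w, b)                       # w = [0.0001 0.0002], b = 0.0001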
Loss Function
A natural choice of loss function is the total number of misclassified points, but that function is not continuous or differentiable and is therefore hard to optimize.
Instead, the perceptron uses the total distance from the misclassified points to the separating hyperplane S.
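Dropping the constant factor 1/||w||, this gives the standard perceptron loss over the misclassified set M (the original states this only in words; the formula below is the usual formalization):

L(w, b) = −Σ_{(x_i, y_i) ∈ M} y_i (w·x_i + b)

Taking a stochastic gradient step on a single misclassified point recovers exactly the update rule above: w ← w + rate·y·x and b ← b + rate·y.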
Convergence Proof for the Perceptron Algorithm
Assumptions:
- w* is the unit normal vector of a perfectly separating hyperplane.
- R is the maximum Euclidean length of any input: R = max_t ||x_t||.
- γ > 0 is the minimum distance (margin) of any example to the hyperplane, so y_t (w*·x_t) ≥ γ for all t.
For simplicity, take the learning rate to be 1 and absorb the bias b into w by appending a constant 1 feature to every input.
Claim 1: w_{k+1}·w* ≥ w_k·w* + γ
That is, every time we make a mistake, the dot product of our weight vector with the target increases by at least γ.
Since w* is the perfect separator, every example is correctly classified with margin at least γ, i.e. y (x·w*) ≥ γ, so we can conclude:
Proof: w_{k+1}·w* = (w_k + y·x)·w* = w_k·w* + y (x·w*) ≥ w_k·w* + γ.
Claim 1 implies that after M mistakes (starting from w_0 = 0),
w_M·w* ≥ Mγ.
Claim 2: ||w_{k+1}||² ≤ ||w_k||² + R²
That is, every time we make a mistake, the squared length of our weight vector increases by at most R².
Note that we only update when we make a mistake, i.e. when y (w_k·x) ≤ 0, so we have:
Proof: ||w_{k+1}||² = ||w_k + y·x||² = ||w_k||² + 2y (w_k·x) + ||x||² ≤ ||w_k||² + R².
Mistakes can only grow ||w||² slowly: the cross term 2y (w_k·x) is non-positive on a mistake, and ||x|| ≤ R even when updating with the longest input.
On the other hand, Claim 2 implies that after M mistakes,
||w_M||² ≤ MR².
Convergence Proof
Combining the two bounds, and using the fact that w* is a unit-length vector (so w_M·w* ≤ ||w_M|| by the Cauchy–Schwarz inequality), gives
Mγ ≤ w_M·w* ≤ ||w_M|| ≤ √M · R,
and therefore
M ≤ R²/γ².
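A quick numerical sanity check of this bound, as a minimal sketch: synthetic separable data through the origin, unit learning rate, bias absorbed into w. The names and the margin threshold 0.1 are illustrative choices, not from the original post.

import numpy as np

rng = np.random.default_rng(0)
w_star = np.array([1.0, 0.0])                 # unit-length target separator
X = rng.uniform(-1, 1, size=(200, 2))
X = X[np.abs(X @ w_star) > 0.1]               # keep a margin of at least 0.1
y = np.sign(X @ w_star)

R = np.linalg.norm(X, axis=1).max()           # longest input
gamma = np.abs(X @ w_star).min()              # actual margin

w, mistakes = np.zeros(2), 0
while True:
    wrong = np.nonzero(y * (X @ w) <= 0)[0]   # misclassified set M
    if len(wrong) == 0:
        break
    i = rng.choice(wrong)                     # random pick from M
    w += y[i] * X[i]                          # rate = 1, no bias
    mistakes += 1

print(mistakes, "<=", (R / gamma) ** 2)       # M <= R^2/gamma^2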
Perceptron in Python
import numpy as np


class Perceptron(object):
    def __init__(self, input_num):
        '''
        Initialize the perceptron with the number of input features.
        '''
        # Initialize the weight vector to zeros
        self.weights = np.zeros(input_num)
        # Initialize the bias to zero
        self.bias = 0.0
        self.iteration = 0

    def __str__(self):
        '''
        Print the learned weights and bias.
        '''
        return 'iteration:%s\nweights\t:%s\nbias\t:%f\n' % (self.iteration, self.weights, self.bias)

    def predict(self, input_vecs):
        '''
        Given input vectors, return the raw score w·x + b for each one.
        '''
        return np.array([np.dot(self.weights, x) + self.bias for x in input_vecs])

    def train(self, input_vecs, labels, rate):
        '''
        Training inputs: a set of vectors, the label for each vector,
        and the learning rate.
        '''
        flag = True
        while flag:
            # One iteration: run all training data through the model once
            outputs = self.predict(input_vecs)
            # Build the misclassified set M: points with y*(w·x + b) <= 0
            M = []
            for data in zip(input_vecs, outputs, labels):
                if data[1] * data[2] <= 0:
                    M.append(data)
            if len(M) == 0:
                # Stop training once no misclassified points remain
                flag = False
            else:
                self.iteration += 1
                # Randomly pick a point from M; randint's upper bound is
                # exclusive, so this covers every index, including the last
                index = np.random.randint(0, len(M))
                self._update_weights(M[index][0], M[index][1], M[index][2], rate)
                # draw(input_vecs, labels, self.weights, self.bias)  # optional plotting helper, not defined here

    def _update_weights(self, input_vec, output, label, rate):
        '''
        Update the weights according to the perceptron rule.
        '''
        # Update weights: w <- w + rate*y*x
        self.weights += rate * label * np.asarray(input_vec, dtype=float)
        # Update bias: b <- b + rate*y
        self.bias += rate * label

    def accuracy_score(self, input_vecs, labels):
        count = 0
        outputs = self.predict(input_vecs)
        for output, label in zip(outputs, labels):
            # A point is correctly classified when y*(w·x + b) > 0
            if output * label > 0:
                count += 1
        return count / len(labels) * 100
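A minimal usage sketch on a toy linearly separable dataset (the data and learning rate are made up for illustration):

import numpy as np

# Toy data: the label is the sign of x1 + x2
input_vecs = np.array([[2.0, 2.0], [1.5, 1.0], [-1.0, -1.5], [-2.0, -0.5]])
labels = np.array([1, 1, -1, -1])

p = Perceptron(input_num=2)
p.train(input_vecs, labels, rate=0.1)
print(p)                                      # iteration count, weights, bias
print(p.accuracy_score(input_vecs, labels))   # 100.0 on the training set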