感知機(jī)(perceptron)屬于判別模型,是神經(jīng)網(wǎng)絡(luò)與SVM的基礎(chǔ)媒吗,由 Rosenblatt 于1957年提出念搬。
1 假設(shè)空間
感知機(jī)的假設(shè)空間
2 目標(biāo)函數(shù)及其推導(dǎo)
2.1 數(shù)據(jù)集的線性可分性
數(shù)據(jù)集的線性可分性
2.2 目標(biāo)函數(shù)及其推導(dǎo)
損失函數(shù)-1
如果用誤分類點(diǎn)的總數(shù)來(lái)定義經(jīng)驗(yàn)損失函數(shù)瓷产,會(huì)因?yàn)椴豢蓪?dǎo)而難以優(yōu)化
損失函數(shù)-2
用誤分類點(diǎn)到超平面的距離來(lái)定義損失函數(shù)
The perceptron learning algorithm is therefore driven by the misclassified points. L(w, b) is non-negative; it equals 0 when there are no misclassified points, and it gets smaller as the misclassified points become fewer or move closer to the hyperplane.
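Written out: the distance from a point x_0 to the hyperplane, the distance of a misclassified point, and the loss obtained by summing over the misclassified set M and dropping the factor 1/||w||; the last expression is exactly the quantity that get_loss computes in the code below:

\[
\frac{|w \cdot x_0 + b|}{\lVert w \rVert},
\qquad
-\frac{y_i\,(w \cdot x_i + b)}{\lVert w \rVert} \;\; \text{when } y_i\,(w \cdot x_i + b) \le 0,
\]
\[
L(w, b) = -\sum_{x_i \in M} y_i\,(w \cdot x_i + b).
\]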
3 優(yōu)化算法
3.1 Primal Form
感知機(jī)使用隨機(jī)梯度下降
步驟:
(1)发魄、選取初始值w0 = 0, b0 = 0
(2)盹牧、在訓(xùn)練集中隨機(jī)選取數(shù)據(jù)(xi, yi)
(3)、如果yi(wxi + b) <= 0励幼,則執(zhí)行隨機(jī)梯度下降
(4)汰寓、轉(zhuǎn)至(2),直至訓(xùn)練集中沒(méi)有誤分類點(diǎn)
注:原始形式由于采用不同的初始值或隨機(jī)樣本點(diǎn)的不同苹粟,會(huì)得到不同的解有滑。
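The update in step (3) is the gradient of L(w, b) restricted to the single sampled point; this is exactly what the fit method below implements, with learning rate η = lr:

\[
\nabla_w L(w, b) = -\sum_{x_i \in M} y_i x_i, \qquad
\nabla_b L(w, b) = -\sum_{x_i \in M} y_i,
\]
\[
w \leftarrow w + \eta\, y_i x_i, \qquad b \leftarrow b + \eta\, y_i.
\]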
3.2 收斂性證明
The proof is the standard Novikoff argument. Write W = (w, b) for the augmented weight vector and z = (x, 1) for the augmented input, so that W·z = w·x + b.
Step 1: since the training set is linearly separable, there exists a unit vector W_opt (||W_opt|| = 1) and a margin γ > 0 such that y_i(W_opt·z_i) ≥ γ for every training sample.
Step 2: each update on a misclassified point (x_i, y_i) sets W_k = W_{k-1} + η·y_i·z_i, so W_k·W_opt ≥ W_{k-1}·W_opt + ηγ, and by induction W_k·W_opt ≥ kηγ.
Step 3: because the point was misclassified, y_i(W_{k-1}·z_i) ≤ 0, so ||W_k||² ≤ ||W_{k-1}||² + η²R² with R = max_i ||z_i||; by induction ||W_k||² ≤ kη²R².
Step 4: combining the two bounds gives kηγ ≤ W_k·W_opt ≤ ||W_k|| ≤ √k·ηR, hence k ≤ (R/γ)²: the number of updates is bounded, so the algorithm converges on linearly separable data.
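Chaining the bounds from steps 2 to 4; the script below checks this numerically, estimating the margin as γ = r/C with r = min_i y_i(w·x_i + b) and C = ||(w, b)||:

\[
k \eta \gamma \;\le\; W_k \cdot W_{\mathrm{opt}} \;\le\; \lVert W_k \rVert \;\le\; \sqrt{k}\,\eta R
\quad \Longrightarrow \quad
k \;\le\; \left( \frac{R}{\gamma} \right)^{2} = \left( \frac{R C}{r} \right)^{2}.
\]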
手寫 Perceptron 算法并與 Sklearn 作比較
# -*- coding: utf-8 -*-
import numpy as np
import random as rd
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Perceptron
class PerceptronModel(object):
    def __init__(self, w_init, b_init, lr, max_iter):
        self.w_init = w_init
        self.b_init = b_init
        self.lr = lr
        self.max_iter = max_iter

    def get_loss(self, x_train, y_train, w, b):
        # L(w, b) = -sum of y_i * (w . x_i + b) over the misclassified points
        wrong_sample = (y_train.reshape(-1, 1) * (x_train.dot(w) + b) <= 0).reshape(-1)
        x_wrong = x_train[wrong_sample]
        y_wrong = y_train[wrong_sample]
        loss = -y_wrong.dot(x_wrong.dot(w) + b)
        return loss[0]

    def fit(self, x_train, y_train):
        # compute the initial loss
        loss_init = self.get_loss(x_train, y_train, self.w_init, self.b_init)
        loss_res = [loss_init]
        # keep the parameter history
        w_res = [self.w_init]
        b_res = [self.b_init]
        # iterative optimization
        for step in range(self.max_iter):
            # draw one sample at random
            idx = rd.sample(range(len(x_train)), 1)
            x_, y_ = x_train[idx], y_train[idx]
            # gradient step only when the sample is misclassified
            if y_ * (x_.dot(w_res[-1]) + b_res[-1]) <= 0:
                w = w_res[-1] + (self.lr * y_ * x_).T
                b = b_res[-1] + self.lr * y_
            else:
                continue
            # record the new loss and parameters
            loss = self.get_loss(x_train, y_train, w, b)
            loss_res.append(loss); w_res.append(w); b_res.append(b)
        # take the parameters with the smallest loss as the optimum
        w_best = w_res[np.argmin(loss_res)]
        b_best = b_res[np.argmin(loss_res)]
        return loss_res, w_best, b_best

    def predict(self, x, w, b):
        y_pred = np.sign(x.dot(w) + b).reshape(-1)
        return y_pred

    def get_score(self, y_true, y_pred):
        score = sum(y_true == y_pred) / len(y_true)
        return score
if __name__ == "__main__":
    # build a (nearly) linearly separable binary classification dataset
    N = 500
    x1 = np.random.uniform(low=1, high=5, size=[N, 2]) + np.random.randn(N, 2) * 0.01
    y1 = np.tile(-1, N)
    x2 = np.random.uniform(low=5, high=10, size=[N, 2]) + np.random.randn(N, 2) * 0.01
    y2 = np.tile(1, N)
    x = np.concatenate([x1, x2], axis=0)
    y = np.concatenate([y1, y2])
    x, y = shuffle(x, y, random_state=0)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
    # initialize parameters
    w_init = np.zeros([x.shape[1], 1]) + 0.001
    b_init = 0
    lr = 0.01
    max_iter = 5000
    # run the hand-written algorithm
    model = PerceptronModel(w_init, b_init, lr, max_iter)
    loss_res, w_best, b_best = model.fit(x_train, y_train)
    y_pred = model.predict(x_test, w_best, b_best)
    score = model.get_score(y_test, y_pred)
    print(f"PerceptronModel test accuracy: {score}")
    # loss curve
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(loss_res)
    plt.xlabel("steps")
    plt.ylabel("loss")
    plt.title("Loss")
    plt.show()
    # learned separating line over the test points
    fig, ax = plt.subplots(figsize=(8, 6))
    x_axis = np.linspace(1, 10, 10)
    y_axis = -(w_best[0] * x_axis + b_best) / w_best[1]
    ax.plot(x_axis, y_axis, color="red")
    ax.scatter(x=x_test[:, 0], y=x_test[:, 1], c=y_test)
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.title("PerceptronModel")
    plt.show()
    # empirical check of the Novikoff bound; assumes w_best separates the
    # training data (r > 0). R uses the raw sample norms here; the augmented
    # norm sqrt(||x||^2 + 1) would be slightly larger.
    R = np.max(np.sqrt(np.sum(x_train ** 2, axis=1)))
    w_aug = np.append(w_best, b_best)  # augmented weights (w, b)
    C = np.sqrt(np.sum(w_aug ** 2))
    r = np.min(y_train.reshape(-1, 1) * (x_train.dot(w_best) + b_best))
    k = np.floor((R * C / r) ** 2)
    print(f"In theory, on this dataset the number of updates is bounded by {k} rounds")
    # run the sklearn Perceptron for comparison
    clf = Perceptron(eta0=lr, fit_intercept=True, max_iter=max_iter, tol=0.001)
    clf.fit(x_train, y_train)
    w = clf.coef_[0]
    b = clf.intercept_
    y_pred = clf.predict(x_test)
    score = sum(y_test == y_pred) / len(y_test)
    print(f"PerceptronSklearn test accuracy: {score}")
    fig, ax = plt.subplots(figsize=(8, 6))
    x_axis = np.linspace(1, 10, 10)
    y_axis = -(w[0] * x_axis + b) / w[1]
    ax.plot(x_axis, y_axis, color="red")
    ax.scatter(x=x_test[:, 0], y=x_test[:, 1], c=y_test)
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.title("PerceptronSklearn")
    plt.show()
PerceptronModel 預(yù)測(cè)準(zhǔn)確率:1.0
理論上望艺,基于當(dāng)前數(shù)據(jù)集,算法的收斂上界為7262972.0輪
PerceptronSklearn 預(yù)測(cè)準(zhǔn)確率:1.0