Implementing the Logistic Regression (LR) Algorithm from Scratch
Logistic regression (LR) is a common binary classification algorithm. It typically predicts only the probability of the positive class: given a sample x, if the predicted value is 0.4, we write p(y=1|x) = 0.4, meaning that for this sample x the probability of the positive class predicted by LR is 0.4; conversely, the probability of the negative class is 0.6, i.e. p(y=0|x) = 0.6.
Mathematically, logistic regression is written as Y_hat = sigmoid(X*W + b). The form looks very similar to a linear model, and LR is in fact essentially a linear model; it can be derived from the generalized linear model together with the Bernoulli distribution, a derivation this article will not go through. A question many beginners run into is why the formula takes this form and not some other. The goal of machine learning is to find parameters such that, given a sample as input, we obtain the desired output, and the simplest way to express that process is with a formula; in principle there is always some formula that fits the data. Newton's law F = ma, for example, can also be read as a model: the parameter is a, the mass m is the sample, and the force F is the target we want. The LR formula can be understood in the same way.
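To make the formula concrete, here is a minimal NumPy sketch of the forward pass Y_hat = sigmoid(X*W + b); the sample values, shapes, and variable names are made up purely for illustration:

import numpy as np

def sigmoid(X):
    return 1 / (1 + np.exp(-X))

# three samples with two features each; W holds one weight per feature
X = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 1.5]])
W = np.array([[0.2], [0.4]])
b = 0.1
Y_hat = sigmoid(np.dot(X, W) + b)  # shape (3, 1); each entry is p(y=1|x)
print(Y_hat)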
Implementing LR is not hard; the main thing is to understand the cost function and how to compute its gradient. With a framework such as TensorFlow you do not even need to derive the gradient yourself; supplying the cost function is enough. Below is my implementation of LR. The code works as-is, and the main function tests it on the iris dataset.
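For reference, with m training samples and L2 coefficient lamda (the bias b is left unregularized), the standard regularized cross-entropy cost and its gradients, which the cost_gradient_descent function below is based on, are:

J = -(1/m) * sum(Y*log(Y_hat) + (1-Y)*log(1-Y_hat)) + (lamda/(2*m)) * sum(W^2)
dJ/dW = (1/m) * X.T*(Y_hat - Y) + (lamda/m) * W
dJ/db = (1/m) * sum(Y_hat - Y)

Each iteration then moves W and b a small step (scaled by learning_rate) against these gradients.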
from sklearn import datasets
from sklearn import metrics
import matplotlib.pyplot as plt
import numpy as np


def softmax(X):
    # column-wise softmax (not used by the LR training below, kept from the original)
    return np.exp(X) / np.exp(X).sum(axis=0)


def sigmoid(X):
    return 1 / (1 + np.exp(-X))


def score(W, b, X_test, Y_test):
    # accuracy of the learned parameters on a held-out set
    m = X_test.shape[0]
    Y_ = predict(W, b, X_test)
    Y2 = np.array([1 if i > 0.5 else 0 for i in Y_]).reshape(m, 1)
    accuracy = metrics.accuracy_score(Y_test, Y2)
    return accuracy


def cost_gradient_descent(X, Y, W, b, learning_rate, lamda):
    # one step of batch gradient descent on the L2-regularized cross-entropy cost;
    # also reports the training accuracy at the current parameters
    Z = np.dot(X, W) + b
    Y_ = sigmoid(Z)
    m = X.shape[0]
    Y2 = np.array([1 if i > 0.5 else 0 for i in Y_]).reshape(m, 1)
    accuracy = metrics.accuracy_score(Y, Y2)
    # unregularized versions, kept for reference:
    # J = -(Y.T.dot(np.log(Y_)) + (1 - Y).T.dot(np.log(1 - Y_))).sum() / m
    # W = W - (learning_rate *
    #          (1 / m) * (X.T.dot(Y_ - Y)) + 0)
    J = -(Y.T.dot(np.log(Y_)) + (1 - Y).T.dot(np.log(1 - Y_))).sum() / \
        m + lamda * (np.square(W).sum(axis=0)) * (1 / (2 * m))
    W = W - (learning_rate *
             (1 / m) * (X.T.dot(Y_ - Y)) + (1 / m) * W * lamda)
    b = b - learning_rate * (1 / m) * ((Y_ - Y).sum(axis=0))
    # b = b - (learning_rate * (1 / m)
    #          * ((Y_ - Y).sum(axis=0)) + (1 / m) * b * lamda)
    # b is usually not regularized
    return J, W, b, accuracy


def predict(W, b, X):
    # hard 0/1 predictions: probability greater than 0.5 -> class 1
    Z = np.dot(X, W) + b
    Y_ = sigmoid(Z)
    m = X.shape[0]
    Y2 = np.array([1 if i > 0.5 else 0 for i in Y_]).reshape(m, 1)
    return Y2


def train(X, Y, iter_num=1000):
    # define parameters
    m = X.shape[0]
    n = X.shape[1]
    W = np.ones((n, 1))
    b = 0
    learning_rate = 0.01
    lamda = 0.01
    i = 0
    J = []
    Accuracy = []
    while i < iter_num:
        i = i + 1
        j, W, b, accuracy = cost_gradient_descent(
            X, Y, W, b, learning_rate, lamda)
        J.append(j)
        Accuracy.append(accuracy)
        print("step:", i, "cost:", j, "accuracy:", accuracy)
    print(W)
    print(b)
    plt.plot(J)
    plt.plot(Accuracy)
    plt.show()
    return W, b


def main():
    # construct data: keep only classes 0 and 1 of the iris dataset
    iris = datasets.load_iris()
    X, Y = iris.data, iris.target.reshape(150, 1)
    X = X[Y[:, 0] < 2]
    Y = Y[Y[:, 0] < 2]
    train(X, Y, 100)


def test():
    # a tiny hand-crafted dataset for sanity checking
    X = np.array([[1, 0.5], [1, 1.5], [2, 1], [3, 1]])
    m = X.shape[0]
    n = X.shape[1]
    Y = np.array([0, 0, 1, 0]).reshape(m, 1)
    print(Y.shape)
    print(train(X, Y, 1000))


if __name__ == '__main__':
    main()
    # test()
Running the code produces output like the following; the model converges by about the 64th iteration, where the training accuracy reaches 1.0. The code also implements L2 regularization of the parameters.
step: 62 cost: [ 0.33512973] accuracy: 0.97
step: 63 cost: [ 0.32701202] accuracy: 0.98
step: 64 cost: [ 0.31998367] accuracy: 1.0
step: 65 cost: [ 0.31388857] accuracy: 1.0
The code above is working code and runs as-is.
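As a supplementary sketch (not part of the original code), assuming the train and score functions above are in scope, one way to check how the model generalizes is to evaluate it on a held-out split; train_test_split comes from scikit-learn:

from sklearn import datasets
from sklearn.model_selection import train_test_split

# split the two-class iris subset into training and test sets
iris = datasets.load_iris()
X, Y = iris.data, iris.target.reshape(150, 1)
X, Y = X[Y[:, 0] < 2], Y[Y[:, 0] < 2]  # keep only classes 0 and 1
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0)

# train() shows the cost/accuracy plot before returning W and b
W, b = train(X_train, Y_train, 100)
print("held-out accuracy:", score(W, b, X_test, Y_test))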