4.8 Scaler in scikit-learn

The previous section covered data normalization. When normalization is actually used in a machine learning algorithm, there is one thing to watch out for. We have already split the original data set into a training set and a test set. If we want to train the model on normalized data, we obviously have to normalize the training set first, and the corresponding test set has to be normalized as well. So how should the test set be normalized?
Take mean-variance normalization as an example. We first normalize the training set, which gives us mean_train and std_train. For the test set, should we compute mean_test and std_test in the same way? No, that would be wrong. The test set has to be normalized as follows:
(X_test - mean_train) / std_train
The reasons for computing it this way are:

  • The test data set simulates the real environment.
  • In a real environment, it is quite likely that the mean and variance of all the test data cannot be obtained.
  • Normalizing the data is itself part of the algorithm.

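A trained model is meant to be used in a real scenario, but in a real scenario we cannot measure the mean and variance of the incoming data. For example, when samples arrive one at a time, there is no way to compute a mean and variance from a single sample, so new data has to be normalized as (X_test - mean_train) / std_train. In practice, this means we must save the mean and variance obtained from the training set.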
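A minimal NumPy sketch of this rule follows. The arrays are made up purely for illustration; the names mean_train and std_train mirror the text, and the sketch assumes no feature has zero standard deviation.

```python
import numpy as np

# Illustrative data only: 4 training samples and 2 test samples, 2 features each.
X_train = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0],
                    [4.0, 40.0]])
X_test = np.array([[2.5, 25.0],
                   [5.0, 50.0]])

# The statistics come from the training set only and are kept for later use.
mean_train = X_train.mean(axis=0)
std_train = X_train.std(axis=0)

# Both sets are normalized with the training statistics.
X_train_std = (X_train - mean_train) / std_train
X_test_std = (X_test - mean_train) / std_train
```

After this step the training columns have mean 0 and standard deviation 1, while the test columns generally do not, which is expected because they are measured against the training distribution.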
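scikit-learn wraps data normalization in dedicated scaler classes, such as the StandardScaler used here. The figure below shows how a Scaler is used: fit computes the mean and variance (taking mean-variance normalization as the example), and, compared with the usual estimator workflow, predict is replaced by transform. Here is an example: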

# Use the iris data set
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

X = iris.data
y = iris.target

# Split the data set into a training set and a test set
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=666)

# First, normalize the training data set
from sklearn.preprocessing import StandardScaler

standardScaler = StandardScaler()

standardScaler.fit(X_train)
StandardScaler(copy=True, with_mean=True, with_std=True)
# Mean of each of the four iris features
standardScaler.mean_
array([5.83416667, 3.08666667, 3.70833333, 1.17      ])
# Spread of each of the four features (per-feature standard deviation)
standardScaler.scale_
array([0.81019502, 0.44327067, 1.76401924, 0.75317107])
# Apply the normalization; X_train itself is not modified unless the result is assigned back to it
standardScaler.transform(X_train)
array([[-0.90616043,  0.93246262, -1.30856471, -1.28788802],
       [-1.15301457, -0.19551636, -1.30856471, -1.28788802],
       [-0.16559799, -0.64670795,  0.22203084,  0.17260355],
       [ 0.45153738,  0.70686683,  0.95898425,  1.50032315],
       [-0.90616043, -1.32349533, -0.40154513, -0.09294037],
       [ 1.43895396,  0.25567524,  0.56216318,  0.30537551],
       [ 0.3281103 , -1.09789954,  1.0723617 ,  0.30537551],
       [ 2.1795164 , -0.19551636,  1.63924894,  1.23477923],
       [-0.78273335,  2.2860374 , -1.25187599, -1.42065998],
       [ 0.45153738, -2.00028272,  0.44878573,  0.43814747],
       [ 1.80923518, -0.42111215,  1.46918276,  0.83646335],
       [ 0.69839152,  0.25567524,  0.90229552,  1.50032315],
       [ 0.20468323,  0.70686683,  0.44878573,  0.57091943],
       [-0.78273335, -0.87230374,  0.10865339,  0.30537551],
       [-0.53587921,  1.38365421, -1.25187599, -1.28788802],
       [-0.65930628,  1.38365421, -1.25187599, -1.28788802],
       [-1.0295875 ,  0.93246262, -1.19518726, -0.75680017],
       [-1.77014994, -0.42111215, -1.30856471, -1.28788802],
       [-0.04217092, -0.87230374,  0.10865339,  0.03983159],
       [-0.78273335,  0.70686683, -1.30856471, -1.28788802],
       [-1.52329579,  0.70686683, -1.30856471, -1.15511606],
       [ 0.82181859,  0.25567524,  0.78891808,  1.10200727],
       [-0.16559799, -0.42111215,  0.27871956,  0.17260355],
       [ 0.94524567, -0.19551636,  0.39209701,  0.30537551],
       [ 0.20468323, -0.42111215,  0.44878573,  0.43814747],
       [-1.39986872,  0.25567524, -1.19518726, -1.28788802],
       [-1.15301457,  1.15805842, -1.30856471, -1.42065998],
       [ 1.06867274,  0.03007944,  1.0723617 ,  1.63309511],
       [ 0.57496445, -0.87230374,  0.67554063,  0.83646335],
       [ 0.3281103 , -0.64670795,  0.56216318,  0.03983159],
       [ 0.45153738, -0.64670795,  0.6188519 ,  0.83646335],
       [-0.16559799,  2.96282478, -1.25187599, -1.0223441 ],
       [ 0.57496445, -1.32349533,  0.67554063,  0.43814747],
       [ 0.69839152, -0.42111215,  0.33540828,  0.17260355],
       [-0.90616043,  1.60925001, -1.02512109, -1.0223441 ],
       [ 1.19209981, -0.64670795,  0.6188519 ,  0.30537551],
       [-0.90616043,  0.93246262, -1.30856471, -1.15511606],
       [-1.89357701, -0.19551636, -1.47863088, -1.42065998],
       [ 0.08125616, -0.19551636,  0.78891808,  0.83646335],
       [ 0.69839152, -0.64670795,  1.0723617 ,  1.23477923],
       [-0.28902506, -0.64670795,  0.67554063,  1.10200727],
       [-0.41245214, -1.54909113, -0.00472406, -0.22571233],
       [ 1.31552689,  0.03007944,  0.67554063,  0.43814747],
       [ 0.57496445,  0.70686683,  1.0723617 ,  1.63309511],
       [ 0.82181859, -0.19551636,  1.18573914,  1.36755119],
       [-0.16559799,  1.60925001, -1.13849854, -1.15511606],
       [ 0.94524567, -0.42111215,  0.50547446,  0.17260355],
       [ 1.06867274,  0.48127103,  1.12905042,  1.76586707],
       [-1.27644165, -0.19551636, -1.30856471, -1.42065998],
       [-1.0295875 ,  1.15805842, -1.30856471, -1.28788802],
       [ 0.20468323, -0.19551636,  0.6188519 ,  0.83646335],
       [-1.0295875 , -0.19551636, -1.19518726, -1.28788802],
       [ 0.3281103 , -0.19551636,  0.67554063,  0.83646335],
       [ 0.69839152,  0.03007944,  1.01567297,  0.83646335],
       [-0.90616043,  1.38365421, -1.25187599, -1.0223441 ],
       [-0.16559799, -0.19551636,  0.27871956,  0.03983159],
       [-1.0295875 ,  0.93246262, -1.36525344, -1.15511606],
       [-0.90616043,  1.60925001, -1.25187599, -1.15511606],
       [-1.52329579,  0.25567524, -1.30856471, -1.28788802],
       [-0.53587921, -0.19551636,  0.44878573,  0.43814747],
       [ 0.82181859, -0.64670795,  0.50547446,  0.43814747],
       [ 0.3281103 , -0.64670795,  0.16534211,  0.17260355],
       [-1.27644165,  0.70686683, -1.19518726, -1.28788802],
       [-0.90616043,  0.48127103, -1.13849854, -0.88957213],
       [-0.04217092, -0.87230374,  0.78891808,  0.96923531],
       [-0.28902506, -0.19551636,  0.22203084,  0.17260355],
       [ 0.57496445, -0.64670795,  0.78891808,  0.43814747],
       [ 1.06867274,  0.48127103,  1.12905042,  1.23477923],
       [ 1.68580811, -0.19551636,  1.18573914,  0.57091943],
       [ 1.06867274, -0.19551636,  0.8456068 ,  1.50032315],
       [-1.15301457,  0.03007944, -1.25187599, -1.42065998],
       [-1.15301457, -1.32349533,  0.44878573,  0.70369139],
       [-0.16559799, -1.32349533,  0.73222935,  1.10200727],
       [-1.15301457, -1.54909113, -0.23147896, -0.22571233],
       [-0.41245214, -1.54909113,  0.05196466, -0.09294037],
       [ 1.06867274, -1.32349533,  1.18573914,  0.83646335],
       [ 0.82181859, -0.19551636,  1.01567297,  0.83646335],
       [-0.16559799, -1.09789954, -0.11810151, -0.22571233],
       [ 0.20468323, -2.00028272,  0.73222935,  0.43814747],
       [ 1.06867274,  0.03007944,  0.56216318,  0.43814747],
       [-1.15301457,  0.03007944, -1.25187599, -1.28788802],
       [ 0.57496445, -1.32349533,  0.73222935,  0.96923531],
       [-1.39986872,  0.25567524, -1.36525344, -1.28788802],
       [ 0.20468323, -0.87230374,  0.78891808,  0.57091943],
       [-0.04217092, -1.09789954,  0.16534211,  0.03983159],
       [ 1.31552689,  0.25567524,  1.12905042,  1.50032315],
       [-1.77014994, -0.19551636, -1.36525344, -1.28788802],
       [ 1.56238103, -0.19551636,  1.24242787,  1.23477923],
       [ 1.19209981,  0.25567524,  1.24242787,  1.50032315],
       [-0.78273335,  0.93246262, -1.25187599, -1.28788802],
       [ 2.54979762,  1.60925001,  1.52587149,  1.10200727],
       [ 0.69839152, -0.64670795,  1.0723617 ,  1.36755119],
       [-0.28902506, -0.42111215, -0.06141278,  0.17260355],
       [-0.41245214,  2.51163319, -1.30856471, -1.28788802],
       [-1.27644165, -0.19551636, -1.30856471, -1.15511606],
       [ 0.57496445, -0.42111215,  1.0723617 ,  0.83646335],
       [-1.77014994,  0.25567524, -1.36525344, -1.28788802],
       [-0.53587921,  1.8348458 , -1.13849854, -1.0223441 ],
       [-1.0295875 ,  0.70686683, -1.19518726, -1.0223441 ],
       [ 1.06867274, -0.19551636,  0.73222935,  0.70369139],
       [-0.53587921,  1.8348458 , -1.36525344, -1.0223441 ],
       [ 2.30294347, -0.64670795,  1.69593766,  1.10200727],
       [-0.28902506, -0.87230374,  0.27871956,  0.17260355],
       [ 1.19209981, -0.19551636,  1.01567297,  1.23477923],
       [-0.41245214,  0.93246262, -1.36525344, -1.28788802],
       [-1.27644165,  0.70686683, -1.02512109, -1.28788802],
       [-0.53587921,  0.70686683, -1.13849854, -1.28788802],
       [ 2.30294347,  1.60925001,  1.69593766,  1.36755119],
       [ 1.31552689,  0.03007944,  0.95898425,  1.23477923],
       [-0.28902506, -1.32349533,  0.10865339, -0.09294037],
       [-0.90616043,  0.70686683, -1.25187599, -1.28788802],
       [-0.90616043,  1.60925001, -1.19518726, -1.28788802],
       [ 0.3281103 , -0.42111215,  0.56216318,  0.30537551],
       [-0.04217092,  2.0604416 , -1.42194216, -1.28788802],
       [-1.0295875 , -2.45147431, -0.11810151, -0.22571233],
       [ 0.69839152,  0.25567524,  0.44878573,  0.43814747],
       [ 0.3281103 , -0.19551636,  0.50547446,  0.30537551],
       [ 0.08125616,  0.25567524,  0.6188519 ,  0.83646335],
       [ 0.20468323, -2.00028272,  0.16534211, -0.22571233],
       [ 1.93266225, -0.64670795,  1.35580532,  0.96923531]])
X_train = standardScaler.transform(X_train)

# Normalize the X_test data set with the scaler fitted on the training set
X_test_standard = standardScaler.transform(X_test)

X_test_standard
array([[-0.28902506, -0.19551636,  0.44878573,  0.43814747],
       [-0.04217092, -0.64670795,  0.78891808,  1.63309511],
       [-1.0295875 , -1.77468693, -0.23147896, -0.22571233],
       [-0.04217092, -0.87230374,  0.78891808,  0.96923531],
       [-1.52329579,  0.03007944, -1.25187599, -1.28788802],
       [-0.41245214, -1.32349533,  0.16534211,  0.17260355],
       [-0.16559799, -0.64670795,  0.44878573,  0.17260355],
       [ 0.82181859, -0.19551636,  0.8456068 ,  1.10200727],
       [ 0.57496445, -1.77468693,  0.39209701,  0.17260355],
       [-0.41245214, -1.09789954,  0.39209701,  0.03983159],
       [ 1.06867274,  0.03007944,  0.39209701,  0.30537551],
       [-1.64672287, -1.77468693, -1.36525344, -1.15511606],
       [-1.27644165,  0.03007944, -1.19518726, -1.28788802],
       [-0.53587921,  0.70686683, -1.25187599, -1.0223441 ],
       [ 1.68580811,  1.15805842,  1.35580532,  1.76586707],
       [-0.04217092, -0.87230374,  0.22203084, -0.22571233],
       [-1.52329579,  1.15805842, -1.53531961, -1.28788802],
       [ 1.68580811,  0.25567524,  1.29911659,  0.83646335],
       [ 1.31552689,  0.03007944,  0.78891808,  1.50032315],
       [ 0.69839152, -0.87230374,  0.90229552,  0.96923531],
       [ 0.57496445,  0.48127103,  0.56216318,  0.57091943],
       [-1.0295875 ,  0.70686683, -1.25187599, -1.28788802],
       [ 2.30294347, -1.09789954,  1.80931511,  1.50032315],
       [-1.0295875 ,  0.48127103, -1.30856471, -1.28788802],
       [ 0.45153738, -0.42111215,  0.33540828,  0.17260355],
       [ 0.08125616, -0.19551636,  0.27871956,  0.43814747],
       [-1.0295875 ,  0.25567524, -1.42194216, -1.28788802],
       [-0.41245214, -1.77468693,  0.16534211,  0.17260355],
       [ 0.57496445,  0.48127103,  1.29911659,  1.76586707],
       [ 2.30294347, -0.19551636,  1.35580532,  1.50032315]])
# Run kNN classification on the normalized data
from sklearn.neighbors import KNeighborsClassifier

knn_clf = KNeighborsClassifier(n_neighbors=3)
knn_clf.fit(X_train, y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',metric_params=None, n_jobs=None, n_neighbors=3, p=2,
           weights='uniform')
# Check the accuracy
knn_clf.score(X_test_standard, y_test)
1.0
# Suppose the test data set had not been normalized; let's look at the accuracy
knn_clf.score(X_test, y_test)
0.3333333333333333

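We can see that this accuracy is clearly far too low: the model was trained on normalized features, so it has to be scored on test data normalized in the same way.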
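To tie the scikit-learn calls back to the formula (X_test - mean_train) / std_train, the following sketch checks that transform on the test set matches the manual computation from the fitted mean_ and scale_ attributes. It reuses the same iris split as above; the variable names manual, matches, test_only and same_stats are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=666)

scaler = StandardScaler()
scaler.fit(X_train)  # learns mean_ and scale_ from the training set only

# transform() on the test set should agree with the manual formula.
manual = (X_test - scaler.mean_) / scaler.scale_
matches = np.allclose(scaler.transform(X_test), manual)

# A scaler fitted on the test set alone ends up with different statistics,
# which is why it must not be used to normalize the test data.
test_only = StandardScaler().fit(X_test)
same_stats = np.allclose(test_only.mean_, scaler.mean_)

print(matches, same_stats)
```

Here matches is expected to be true and same_stats false, since the 30 test samples do not share the exact mean of the 120 training samples.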
Below, we implement a StandardScaler ourselves:

import numpy as np


class StandardScaler:
    def __init__(self):
        self.mean_ = None
        self.scale_ = None

    def fit(self, X):
        """Compute the mean and standard deviation of each feature of X."""
        assert X.ndim == 2, "The dimension of X must be 2"

        # Build the arrays from lists; passing a generator to np.array
        # would not produce a numeric array
        self.mean_ = np.array([np.mean(X[:, i]) for i in range(X.shape[1])])
        self.scale_ = np.array([np.std(X[:, i]) for i in range(X.shape[1])])

        return self

    def transform(self, X):
        """Normalize X with the mean and standard deviation obtained by fit."""
        assert X.ndim == 2, "The dimension of X must be 2"
        assert self.mean_ is not None and self.scale_ is not None, \
            "must fit before transform!"
        assert X.shape[1] == len(self.mean_), \
            "the feature number of X must be equal to mean_ and scale_"

        resX = np.empty(shape=X.shape, dtype=float)
        for col in range(X.shape[1]):
            resX[:, col] = (X[:, col] - self.mean_[col]) / self.scale_[col]
        return resX
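
The per-column loop can also be written with NumPy broadcasting. The following is only a sketch of that variant: the class name SimpleStandardScaler and the sample arrays are hypothetical, and it assumes that no feature has zero standard deviation (otherwise the division would fail).

```python
import numpy as np


class SimpleStandardScaler:
    """Hypothetical vectorized variant of the hand-written scaler."""

    def __init__(self):
        self.mean_ = None
        self.scale_ = None

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        assert X.ndim == 2, "The dimension of X must be 2"
        # Column-wise statistics, one value per feature
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        assert self.mean_ is not None and self.scale_ is not None, \
            "must fit before transform!"
        assert X.ndim == 2 and X.shape[1] == len(self.mean_), \
            "the feature number of X must be equal to mean_ and scale_"
        # Broadcasting applies each column's mean and scale to every row
        return (X - self.mean_) / self.scale_


# Illustrative data only
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_new = np.array([[2.0, 400.0]])

scaler = SimpleStandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_new_scaled = scaler.transform(X_new)
```

Because fit returns self, fitting and keeping the scaler can be done in one expression, and the same fitted object is then reused for any new data.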