這是我Deep_in_mnist系列的第三篇博客
- 注意:這里的代碼都是在Jupyter Notebook中運行箱玷,原始的
.ipynb
文件可以在我的GitHub項目主頁上查看帜讲,其中的CNN_by_TensorFlow_with_LeNet-5_Architecture.ipynb
就是這篇博客的文件朱盐,里面包括代碼台囱、注釋以及交互式運行結(jié)果,界面十分友好譬重,讀者可以下載后直接在Jupyter Notebook中打開即可拒逮,在這里作者也強烈推薦使用Jupyter Notebook進行學(xué)習(xí)。 - 項目主頁 :GitHub:acphart/Deep_in_mnist 喜歡可以順便給個star哦 ~~~
介紹
項目介紹
- 這里使用TensorFlow搭建CNN識別mnist手寫數(shù)字特征
- 這份代碼參照LeNet-5架構(gòu)臀规,論文閱讀及下載地址:http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf)
- 架構(gòu)的詳細描述請看下面搭建CNN的注釋
步驟
1. 導(dǎo)入工具庫和準備數(shù)據(jù)
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import warnings
# os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
# warnings.filterwarnings('ignore')
from IPython.core.interactiveshell import InteractiveShell
# InteractiveShell.ast_node_interactivity = 'all'
- 這里的all_mnist_data.csv是重新包裝后的所有原始mnist數(shù)據(jù)滩援,共70000個手寫數(shù)字,數(shù)據(jù)詳情及下載請閱讀我GitHub主頁上的介紹GitHub:acphart/Deep_in_mnist
data = pd.read_csv('../../dataset/all_mnist_data.csv').values
'''
切分數(shù)據(jù)塔嬉,訓(xùn)練集為59000玩徊, 交叉驗證集為1000, 測試集為10000邑遏;
交叉驗證集過大會導(dǎo)致內(nèi)存溢出(gpu內(nèi)存不足),同時也不必要設(shè)置太大恰矩,
1000足夠了记盒,太大了還會拖慢訓(xùn)練速度;
'''
tr_r = 59000
cv_r = 60000
train = data[:tr_r]
cv = data[tr_r:cv_r]
test = data[cv_r:]
2. 定義搭建CNN的相關(guān)函數(shù)
'''
向量化函數(shù)外傅,將相應(yīng)數(shù)字轉(zhuǎn)換成one-hot向量纪吮,如下:
0 => [1 0 0 0 0 0 0 0 0 0]
1 => [0 1 0 0 0 0 0 0 0 0]
...
9 => [0 0 0 0 0 0 0 0 0 1]
'''
def vectorize(y_flat):
n = y_flat.shape[0]
vectors = np.zeros((n, 10))
for i in range(n):
vectors[i][int(y_flat[i])] = 1.0
return vectors.astype(np.uint8)
'''權(quán)重初始化函數(shù)'''
def init_weights(shape, name=None):
weights = tf.truncated_normal(shape, stddev=0.1)
return tf.Variable(weights, name=name)
'''偏置初始化函數(shù)'''
def init_biases(shape, name=None):
biases = tf.constant(0.1, shape=shape)
return tf.Variable(biases, name=name)
'''卷積函數(shù)俩檬,步長為1,返回與輸入圖像shape相同的特征映射(padding='SAME')'''
def conv2d(putin, conv_k, name=None):
return tf.nn.conv2d(putin, conv_k,
strides=[1, 1, 1, 1], padding='SAME', name=name)
'''池化函數(shù)碾盟,2*2最大池化棚辽,步長為2,池化后圖像長寬各減半'''
def max_pool22(putin, name=None):
return tf.nn.max_pool(putin, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME', name=name)
3. 搭建CNN
3.1 CNN結(jié)構(gòu)
我們這里參考LeNet-5網(wǎng)絡(luò)結(jié)構(gòu)冰肴,相關(guān)論文地址:http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf)
輸入層通過reshape原始的特征向量轉(zhuǎn)換為一批28x28單通道(灰度)的手寫數(shù)字圖片putin
putin隨后通過第一卷積層得到32個特征映射屈藐,然后經(jīng)第一層池化圖片大小縮減為14x14
經(jīng)過第二卷積層得到64個特征映射,然后經(jīng)第二層池化圖片大小縮減為7x7
然后進入全連接層熙尉,進入之前需要reshape第二池化層出來的Tensor
在全連接層設(shè)置棄權(quán)联逻,可以加速訓(xùn)練和防止過擬合
到達輸出層
這里我用中文注釋說明CNN的結(jié)構(gòu),英文注釋代表了Tensor維度的變化检痰,這里很容易體會到TensorFlow名字的意思=> 張量流
3.2 輸入層
'''x為原始輸入數(shù)據(jù)包归,即特征向量,None代表可以批量喂入數(shù)據(jù)'''
'''y為對應(yīng)輸入數(shù)據(jù)的期望輸出結(jié)果铅歼,即真實值'''
x = tf.placeholder(np.float32, [None, 784], name='x')
y = tf.placeholder(np.float32, [None, 10], name='y')
'''x is the original Tensor => [m, 784]'''
'''輸入層公壤,-1代表讓函數(shù)自動計算第一維的大小'''
'''這里將原始輸入轉(zhuǎn)換成一批單通道圖片'''
putin = tf.reshape(x, [-1, 28, 28, 1], name='putin')
'''after reshape, Tensor => [m, 28, 28, 1]'''
3.3 第一卷積層和池化層
'''第一卷積層的卷積核:5x5局部感受野,單通道椎椰,32個特征映射'''
'''使用修正線性單元ReLU作為激活函數(shù)'''
w_conv1 = init_weights([5, 5, 1, 32], name='w_conv1')
b_conv1 = init_biases([32], name='b_conv1')
h_conv1 = tf.nn.relu(conv2d(putin, w_conv1) + b_conv1, name='h_conv1')
'''after conv2d by w_conv1, padding_type is "SAME", Tensor => [m, 28, 28, 32]'''
'''第一池化層厦幅,2*2最大值池化'''
pool_1 = max_pool22(h_conv1, name='pool_1')
'''after pooling by [1, 2, 2, 1], padding_type is "SAME", Tensor => [m, 14, 14, 32]'''
3.4 第二卷積層和池化層
'''第二卷積層的卷積核:5x5局部感受野,32通道俭识,64個特征映射'''
'''依舊使用ReLU作為激活函數(shù)'''
w_conv2 = init_weights([5, 5, 32, 64], name='w_conv2')
b_conv2 = init_biases([64], name='b_conv2')
h_conv2 = tf.nn.relu(conv2d(pool_1, w_conv2) + b_conv2, name='h_conv2')
'''after conv2d by w_conv2, padding_type is "SAME", Tensor => [m, 14, 14, 64]'''
'''第二池化層慨削,2*2最大值池化'''
pool_2 = max_pool22(h_conv2, name='pool_2')
'''after pooling by [1, 2, 2, 1], padding_type is "SAME", Tensor => [m, 7, 7, 64]'''
3.5 全連接層(第一全連接層)
'''重構(gòu)第二池化層,接下來要進入全連接層full_connecting'''
pool_2_flat = tf.reshape(pool_2, [-1, 7*7*64], name='pool_2_flat')
'''after reshape, Tensor => [m, 7*7*64]'''
'''第一全連接層:1024個神經(jīng)元'''
'''使用ReLU作為激活函數(shù)'''
w_fc1 = init_weights([7*7*64, 1024], name='w_fc1')
b_fc1 = init_biases([1024], name='b_fc_1')
h_fc1 = tf.nn.relu(tf.matmul(pool_2_flat, w_fc1) + b_fc1, name='h_fc1')
'''after matmul by w_fc1, Tensor => [m, 1024]'''
3.6 輸出層(第二全連接層)
- 在這里設(shè)置棄權(quán)套媚,用以加快訓(xùn)練速度以及減低全連接層的過擬合缚态;
- 順便說明一下:卷積層一般不需要處理過擬合問題,因為卷積天然就具有很強的抵抗過擬合的特性堤瘤,過擬合其實理解起來就是模型在學(xué)習(xí)噪聲玫芦,而噪聲一般是隨機出現(xiàn)在訓(xùn)練數(shù)據(jù)的不同局部,而卷積核的共享權(quán)重意味著卷積核被強制從整個圖像中學(xué)習(xí)本辐,這使他們不太可能去選擇在訓(xùn)練數(shù)據(jù)中的局部特質(zhì)桥帆。
'''設(shè)置棄權(quán)'''
keep_prob = tf.placeholder('float', name='keep_prob')
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob, name='h_fc1_drop')
'''第二全連接層的權(quán)重和偏置'''
w_fc2 = init_weights([1024, 10], name='w_fc2')
b_fc2 = init_biases([10], name='b_fc2')
'''第二全連接層,即輸出層慎皱,使用柔性最大值函數(shù)softmax作為激活函數(shù)'''
y_ = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2, name='y_')
'''after matmul by w_fc2, Tensor => [m, 10]'''
4. CNN其他設(shè)置
4.1 設(shè)置超參數(shù)老虫、代價函數(shù),選擇優(yōu)化器茫多,計算正確率
- 關(guān)于學(xué)習(xí)率和批數(shù)據(jù)大小是經(jīng)過多次嘗試之后發(fā)現(xiàn)這個組合還不錯
'''設(shè)置迭代次數(shù)祈匙,學(xué)習(xí)率,批數(shù)據(jù)大小'''
epoches = 10000
alpha = 0.0002
batch_size = 200
'''使用交叉熵代價函數(shù)'''
cost_func = tf.reduce_sum(-y*tf.log(y_), name='cost_func')
'''使用梯度下降優(yōu)化器'''
train_step = tf.train.GradientDescentOptimizer(alpha).minimize(cost_func)
'''計算正確率'''
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1), name='correct_prediction')
accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float32'), name='accuracy')
5. 訓(xùn)練CNN
'''初始化全局變量'''
init = tf.global_variables_initializer()
'''使用交互式會話,便于求值'''
sess = tf.InteractiveSession()
sess.run(init)
'''記錄訓(xùn)練過程夺欲,作學(xué)習(xí)曲線圖'''
epoch_list = []
acc_list = []
cost_list = []
'''迭代訓(xùn)練'''
index = 20
i = 1
while i < epoches:
'''
這里我使用的迭代方式是跪帝,每次打亂整個訓(xùn)練集,然后按順序分批送入數(shù)據(jù)些阅,即:
1. 打亂數(shù)據(jù)np.random.shuffle(train)
2. 將前batch_size個數(shù)據(jù)喂入CNN
3. 檢查是否到達數(shù)據(jù)集尾部train.shape[0]伞剑,如果沒有,則喂入接下來的batch_size個數(shù)據(jù)
如果到達尾部市埋,回到第一步黎泣。
'''
begin_point = 0
np.random.shuffle(train)
while begin_point + batch_size < train.shape[0]:
'''獲取新的一批數(shù)據(jù),并喂入CNN訓(xùn)練'''
batch = train[begin_point: begin_point+batch_size]
x_batch = batch[:, 1:]
y_batch = vectorize(batch[:, 0])
sess.run(train_step, feed_dict={x: x_batch, y: y_batch, keep_prob:0.5})
begin_point = begin_point + batch_size
i = i + 1
if i > epoches: break
if i%index == 0:
'''計算驗證集的正確率和訓(xùn)練的代價函數(shù)值(損失值)'''
acc = accuracy.eval(feed_dict={x: cv[:, 1:],
y: vectorize(cv[:, 0]),
keep_prob:1.0})
cost = cost_func.eval(feed_dict={x: x_batch,
y: y_batch,
keep_prob:1.0})
print('epoches: {0:<4d}\t cost: {1:>9.4f}\t accuracy: {2:<.4f}'.format( i, cost, acc))
epoch_list.append(i)
acc_list.append(acc)
cost_list.append(cost)
if(i >= index*5): index = index*10
if i >=100:
index = 200
if i >=2000:
'''當?shù)_到2000次后腰素,減小學(xué)習(xí)率'''
alpha = 1e-5
- 訓(xùn)練過程輸出如下聘裁,每一行為:迭代次數(shù)、代價函數(shù)值弓千、交叉驗證集的正確率
epoches: 20 cost: 232.5098 accuracy: 0.7230
epoches: 40 cost: 99.7135 accuracy: 0.8920
epoches: 60 cost: 62.0796 accuracy: 0.9320
epoches: 80 cost: 56.0326 accuracy: 0.9470
epoches: 100 cost: 63.7895 accuracy: 0.9520
epoches: 200 cost: 34.7024 accuracy: 0.9600
epoches: 400 cost: 23.9385 accuracy: 0.9730
epoches: 600 cost: 19.7655 accuracy: 0.9820
epoches: 800 cost: 13.9432 accuracy: 0.9810
epoches: 1000 cost: 7.4313 accuracy: 0.9860
epoches: 1200 cost: 16.6722 accuracy: 0.9830
epoches: 1400 cost: 9.4147 accuracy: 0.9890
epoches: 1600 cost: 11.7775 accuracy: 0.9900
epoches: 1800 cost: 2.9766 accuracy: 0.9910
epoches: 2000 cost: 5.7111 accuracy: 0.9880
epoches: 2200 cost: 3.3531 accuracy: 0.9890
epoches: 2400 cost: 5.3885 accuracy: 0.9900
epoches: 2600 cost: 3.3719 accuracy: 0.9890
epoches: 2800 cost: 4.1312 accuracy: 0.9910
epoches: 3000 cost: 4.2502 accuracy: 0.9910
epoches: 3200 cost: 3.8794 accuracy: 0.9910
epoches: 3400 cost: 7.4820 accuracy: 0.9930
epoches: 3600 cost: 10.4082 accuracy: 0.9910
epoches: 3800 cost: 7.3295 accuracy: 0.9900
epoches: 4000 cost: 1.7250 accuracy: 0.9930
epoches: 4200 cost: 6.4778 accuracy: 0.9930
epoches: 4400 cost: 1.3318 accuracy: 0.9930
epoches: 4600 cost: 1.4021 accuracy: 0.9920
epoches: 4800 cost: 2.5861 accuracy: 0.9910
epoches: 5000 cost: 3.1131 accuracy: 0.9920
epoches: 5200 cost: 0.8810 accuracy: 0.9930
epoches: 5400 cost: 4.0778 accuracy: 0.9930
epoches: 5600 cost: 4.6981 accuracy: 0.9920
epoches: 5800 cost: 2.4814 accuracy: 0.9930
epoches: 6000 cost: 0.5687 accuracy: 0.9920
epoches: 6200 cost: 4.7754 accuracy: 0.9930
epoches: 6400 cost: 0.5672 accuracy: 0.9920
epoches: 6600 cost: 1.0349 accuracy: 0.9930
epoches: 6800 cost: 0.2849 accuracy: 0.9930
epoches: 7000 cost: 7.2503 accuracy: 0.9920
epoches: 7200 cost: 3.1297 accuracy: 0.9930
epoches: 7400 cost: 3.0174 accuracy: 0.9920
epoches: 7600 cost: 0.4067 accuracy: 0.9930
epoches: 7800 cost: 1.0140 accuracy: 0.9930
epoches: 8000 cost: 0.8347 accuracy: 0.9920
epoches: 8200 cost: 4.6941 accuracy: 0.9920
epoches: 8400 cost: 3.1090 accuracy: 0.9930
epoches: 8600 cost: 0.7467 accuracy: 0.9930
epoches: 8800 cost: 1.1472 accuracy: 0.9930
epoches: 9000 cost: 1.3889 accuracy: 0.9920
epoches: 9200 cost: 1.1031 accuracy: 0.9930
epoches: 9400 cost: 0.4943 accuracy: 0.9930
epoches: 9600 cost: 0.5199 accuracy: 0.9930
epoches: 9800 cost: 0.5397 accuracy: 0.9930
epoches: 10000 cost: 0.5592 accuracy: 0.9920
- 訓(xùn)練結(jié)果還行衡便,雖然訓(xùn)練過多(反正是睡覺的時候訓(xùn)練的 \(^o^)/ ),只要沒有過擬合就可以了~~~
6. 作學(xué)習(xí)曲線圖
- 有前面的輸出其實不用畫圖也行洋访,但圖比較直觀一點
- 圖像表明訓(xùn)練得還是不錯的镣陕,大概在5000左右開始飽和了
'''作出學(xué)習(xí)曲線圖'''
cost_list = np.array(cost_list)/cost_list[0]
fig, ax = plt.subplots(1, 1, sharex=True, sharey=True)
_ = ax.plot(epoch_list, acc_list, color='g', label='accuracy')
_ = ax.plot(epoch_list, cost_list, color='r', label='cost')
_ = ax.set_xscale('log')
_ = ax.set_ylim((0.0, 1.0))
_ = ax.set_xlabel('Epoches', fontsize=16)
_ = ax.set_xticklabels(labels=[1, 10, 100, 1000, 10000], fontsize=12)
_ = ax.set_yticklabels(labels=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0], fontsize=12)
_ = ax.legend(fontsize=14)
7. 測試準確率
'''使用測試集測試準確率'''
'''直接把10000數(shù)據(jù)送進去會內(nèi)存溢出,所以分成10次算平均值'''
j = 0
b_size = 1000
acc_list = []
while j < test.shape[0]:
acc = accuracy.eval(feed_dict={x: test[j:j+b_size, 1:],
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
print(acc, '\t', end='')
j = j+b_size
acc_list.append(acc)
print('')
test_accuracy = np.array(acc_list).mean()
print('test accuracy : {0}'.format(test_accuracy))
- 準確率為0.9929姻政,也就是說10000張里面就剩71張認不出來了
0.994 0.986 0.984 0.989 0.996 0.999 0.994 0.998 0.997 0.992
test accuracy : 0.992900013923645
8. 查看識別錯誤的數(shù)字
8.1 獲取預(yù)測值
'''和測試準確率時一樣呆抑,也是要分成十次計算'''
j = 0
b_size = 1000
pred_flat = np.empty((10, 1000))
while j < test.shape[0]:
prediction_j = y_.eval(feed_dict={x: test[j:j+b_size, 1:],
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
pred_flat[j//1000] = [np.argmax(pred) for pred in prediction_j]
j = j+b_size
'''將對應(yīng)的預(yù)測值和真實值展開成一維數(shù)組'''
'''順便檢查一下正確率是否與前面相符'''
pred_y = pred_flat.reshape(10000)
real_y = test[:, 0].reshape(10000)
pred_acc = np.equal(pred_y, real_y).mean()
print('predicton accuracy : ', pred_acc)
predicton accuracy : 0.9929
正確率與前面相符
8.2 定義作圖函數(shù)
def show_pic(ax, image, y_=None, label=None, wh=28, cmap='Greys'):
'''
作圖函數(shù):
ax為Matplotlib.Axes對象;
image為單個的mnist手寫數(shù)字特征向量汁展,image.shape is (784,)鹊碍;
y_為預(yù)測值;
label為image對應(yīng)的真實數(shù)字標簽食绿。
wh為圖片的寬和高侈咕,默認是28
cmap是顏色映射,默認是'Greys'
'''
img = image.reshape(wh, wh)
ax.imshow(img, cmap=cmap)
ax.axis('off')
if y_ != None:
ax.text(28, 22, str(int(y_)), fontsize=16)
if label != None:
ax.text(28, 8, str(int(label)), color='r', fontsize=16)
8.3 作圖
fig, ax = plt.subplots(8, 9, sharex=True, sharey=True)
fig.set_size_inches(14, 8)
ax = ax.flatten()
ax_id = 0
i = 0
while ax_id < 72 :
image_i = test[i, 1:]
yi = real_y[i]
pred_i = pred_y[i]
if pred_i != yi:
'''若預(yù)測值與真實值不符器紧,則畫圖'''
show_pic(ax[ax_id], image_i, pred_i, yi)
ax_id = ax_id + 1
i = i + 1
if i>=10000: break
這剩下沒認出來的數(shù)字有些是挺難認的耀销,而且有幾個感覺預(yù)測的比真實的要好,比如第1行第4铲汪、6個數(shù)字熊尉,但還是有相當一部分是我們?nèi)丝梢砸谎壅J出來的,所以還有改進的空間掌腰。
思考:神經(jīng)網(wǎng)絡(luò)到底做了什么
這里僅做一些啟發(fā)式的描述狰住,以一個測試樣本為例,就比如說測試集的第一個數(shù)據(jù)
instance = test[0, 1:]
fig, ax = plt.subplots(1, 1)
show_pic(ax, instance)
- 樣例數(shù)據(jù)一眼能認出是7齿梁,看看模型的識別出來的值
pred_i = y_.eval(feed_dict={x: instance.reshape(1, 784),
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
pred_num = np.argmax(pred_i)
print(pred_num)
7
- 我們的CNN模型的預(yù)測值也是7催植,再看一下它的真實值
print(pred_num == test[0, 0])
True
- 預(yù)測值和真實值相等,Go on
第一層卷積的32個特征映射
conv_1_feature = h_conv1.eval(feed_dict={x: instance.reshape(1, 784),
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
'''conv_1_feature.shape is (1, 28, 28, 32)'''
fig, ax = plt.subplots(4, 8)
fig.set_size_inches(12, 6)
ax = ax.flatten()
for i in range(32):
conv_1_img = conv_1_feature[:, :, :, i].reshape(784)
show_pic(ax[i], conv_1_img, cmap='gist_heat')
這32個圖片的明暗不同,說明每個特征檢測的方式不一樣
接著看第一層池化后是什么樣子
pool_1_feature = pool_1.eval(feed_dict={x: instance.reshape(1, 784),
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
'''pool_1_feature.shape is (1, 14, 14, 32)'''
fig, ax = plt.subplots(4, 8)
fig.set_size_inches(12, 6)
ax = ax.flatten()
for i in range(32):
pool_1_img = pool_1_feature[:, :, :, i].reshape(14*14)
show_pic(ax[i], pool_1_img, wh=14, cmap='gist_heat')
- 和我們的池化作用相符查邢,看起來就是第一卷積層的縮小版
第二卷積層是64個特征
conv_2_feature = h_conv2.eval(feed_dict={x: instance.reshape(1, 784),
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
'''conv_2_feature.shape is (1, 14, 14, 64)'''
fig, ax = plt.subplots(8, 8)
fig.set_size_inches(12, 12)
ax = ax.flatten()
for i in range(64):
conv_2_img = conv_2_feature[:, :, :, i].reshape(14*14)
show_pic(ax[i], conv_2_img, wh=14, cmap='gist_heat')
這個就有點看不懂了,但還是能隱約看出來映射的是圖片不同部位的特征
再下來就是第二層池化
pool_2_feature = pool_2.eval(feed_dict={x: instance.reshape(1, 784),
y: vectorize(test[j:j+b_size, 0]),
keep_prob:1.0})
'''pool_2_feature.shape is (1, 7, 7, 64)'''
fig, ax = plt.subplots(8, 8)
fig.set_size_inches(12, 12)
ax = ax.flatten()
for i in range(64):
pool_2_img = pool_2_feature[:, :, :, i].reshape(7*7)
show_pic(ax[i], pool_2_img, wh=7, cmap='gist_heat')
這個只能隱約再隱約地看出一點點特征了,具體的估計只有我們的CNN模型能看懂了 ~~~
好啦酵幕,演示就到這里了 ~~
最后扰藕,關(guān)閉會話,結(jié)束
sess.close()