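Dataset: MNIST
Framework: Keras
GPU: NVIDIA GeForce 750M
Reference: Keras Chinese documentation
This is a deep learning project from Udacity. The dataset and the requirements are both simple; the main goal is to get familiar with the framework and with the usual routine of setting up a project. A very simple convolutional neural network is enough, and accuracy easily exceeds 90%.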
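Requirements
Randomly pick five or fewer digits from the MNIST dataset and concatenate them into a single image, as shown in the figure below. Build a model that recognizes the digits in the image. The requirement assigns blank characters the class 0; the code below instead encodes a blank as class 10, since 0 is itself a digit.
Project walkthrough
Loading the dataset
Keras ships with the MNIST dataset, so it can be loaded directly: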
from keras.datasets import mnist
(X_raw, y_raw), (X_raw_test, y_raw_test) = mnist.load_data()
n_train, n_test = X_raw.shape[0], X_raw_test.shape[0]
Viewing the dataset
import matplotlib.pyplot as plt
import random
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
for i in range(15):
    plt.subplot(3, 5, i+1)
    index = random.randint(0, n_train-1)
    plt.title(str(y_raw[index]))
    plt.imshow(X_raw[index], cmap='gray')
    plt.axis('off')
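Synthesizing the data
When the dataset was loaded it was already split into a training set X_raw and a test set X_raw_test. Here we randomly pick digits from X_raw and concatenate them into new images, and set 20% aside as a validation set to guard against overfitting.
Note: the number of digits is not necessarily 5. Shorter sequences are padded with blanks, so the final image is 28 high and 28x5=140 wide.
- Why split the data into training, validation, and test sets?
The training set is used to train the model. The validation set is used to tune the trained model further; if the test set were used for that, the tuning would adapt to the test set and the model would tend to overfit it. The test set is used only to measure the model's final performance.
Difficulty:
The original images are 28x28 and the concatenated image is 28x140. A row used to hold 28 pixels and now holds 140, so the append has to happen row by row. Doing that with list.append is slow, while using a matrix transpose is fast.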
import numpy as np
from sklearn.model_selection import train_test_split
n_class, n_len, width, height = 11, 5, 28, 28
def generate_dataset(X, y):
    X_len = X.shape[0]  # the new dataset has as many samples as the original one
    # Shape of the new dataset is (X_len, 28, 28*5, 1): X_len samples; each original image is 28x28,
    # and 5 positions (blanks included) are concatenated into 28x140; the trailing 1 is the single grayscale channel
    X_gen = np.zeros((X_len, height, width*n_len, 1), dtype=np.uint8)
    # Labels of the new dataset: a list of 5 arrays of shape (X_len, 11), i.e. (5, X_len, 11) overall
    y_gen = [np.zeros((X_len, n_class), dtype=np.uint8) for i in range(n_len)]
    for i in range(X_len):
        # Randomly choose how many digits this sample contains
        rand_len = random.randint(1, 5)
        lis = list()
        # Fill each digit position
        for j in range(0, rand_len):
            # Pick a random source image
            index = random.randint(0, X_len - 1)
            # Set the matching entry of y to 1. y is one-hot encoded, so its last dimension is 11:
            # 0~9 are the ten digits and 10 is blank; the index that holds the 1 is the class
            y_gen[j][i][y[index]] = 1
            lis.append(X[index].T)
        # The remaining positions are blank
        for m in range(rand_len, 5):
            # Set the blank class to 1
            y_gen[m][i][10] = 1
            lis.append(np.zeros((28, 28),dtype=np.uint8))
        lis = np.array(lis).reshape(140,28).T
        X_gen[i] = lis.reshape(28,140,1)
    return X_gen, y_gen
X_raw_train, X_raw_valid, y_raw_train, y_raw_valid = train_test_split(X_raw, y_raw, test_size=0.2, random_state=50)
X_train, y_train = generate_dataset(X_raw_train, y_raw_train)
X_valid, y_valid = generate_dataset(X_raw_valid, y_raw_valid)
X_test, y_test = generate_dataset(X_raw_test, y_raw_test)
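The transpose trick inside generate_dataset can be cross-checked against NumPy's built-in horizontal concatenation. The sketch below is a small self-contained check, not part of the original project: the name tiles and the random arrays standing in for MNIST digits are assumptions made for illustration. It suggests that np.hstack yields the same 28x140 image without the two transposes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Five random 28x28 arrays standing in for MNIST digits (illustrative data only).
tiles = [rng.integers(0, 256, size=(28, 28), dtype=np.uint8) for _ in range(5)]

# Approach used in generate_dataset: transpose each tile, stack, reshape, transpose back.
via_transpose = np.array([t.T for t in tiles]).reshape(140, 28).T

# Direct horizontal concatenation along the width axis.
via_hstack = np.hstack(tiles)

print(via_transpose.shape, via_hstack.shape)
print(np.array_equal(via_transpose, via_hstack))
```

If the two results agree, as they should here, either form could be used; np.hstack is arguably easier to read.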
Displaying the synthesized images
# Show the generated images
for i in range(15):
    plt.subplot(5, 3, i+1)
    index = random.randint(0, n_test-1)
    title = ''
    for j in range(n_len):
        title += str(np.argmax(y_test[j][index])) + ','
    plt.title(title)
    plt.imshow(X_test[index][:,:,0], cmap='gray')
    plt.axis('off')
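The label layout is easy to get wrong, so here is a minimal sketch of how one digit sequence maps to one-hot rows padded with the blank class. The helper name encode_sequence and the constants are hypothetical; they mirror n_class, n_len and the blank index 10 used above, and are not part of the original code.

```python
import numpy as np

N_CLASS, N_LEN, BLANK = 11, 5, 10  # assumed to mirror n_class, n_len and the blank index

def encode_sequence(digits):
    """One-hot encode up to N_LEN digits, padding the remaining positions with BLANK."""
    if len(digits) > N_LEN:
        raise ValueError("at most N_LEN digits are supported")
    padded = list(digits) + [BLANK] * (N_LEN - len(digits))
    one_hot = np.zeros((N_LEN, N_CLASS), dtype=np.uint8)
    # Row k gets a 1 in the column of the class at position k.
    one_hot[np.arange(N_LEN), padded] = 1
    return one_hot

encoded = encode_sequence([3, 7])
decoded = encoded.argmax(axis=1).tolist()
print(decoded)  # [3, 7, 10, 10, 10]
```

Taking argmax along the class axis recovers the padded sequence, which is what the title strings in the plots above show.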
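Building the CNN
I used Keras's functional API, which is very convenient; see the official documentation.
Because the dataset is fairly simple, almost any network architecture performs reasonably well. I used a model with two convolutional layers: convolution, max pooling, convolution, max pooling, followed by two fully connected layers for each of the five outputs.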
from keras.models import Model
from keras.layers import *
import tensorflow as tf
# This returns a tensor
inputs = Input(shape=(28, 140, 1))
conv_11 = Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu')(inputs)
max_pool_11 = MaxPool2D(pool_size=(2,2))(conv_11)
conv_12 = Conv2D(filters=10, kernel_size=(3,3), padding='same', activation='relu')(max_pool_11)
max_pool_12 = MaxPool2D(pool_size=(2,2), strides=(2,2))(conv_12)
flatten11 = Flatten()(max_pool_12)
hidden11 = Dense(15, activation='relu')(flatten11)
prediction1 = Dense(11, activation='softmax')(hidden11)
hidden21 = Dense(15, activation='relu')(flatten11)
prediction2 = Dense(11, activation='softmax')(hidden21)
hidden31 = Dense(15, activation='relu')(flatten11)
prediction3 = Dense(11, activation='softmax')(hidden31)
hidden41 = Dense(15, activation='relu')(flatten11)
prediction4 = Dense(11, activation='softmax')(hidden41)
hidden51 = Dense(15, activation='relu')(flatten11)
prediction5 = Dense(11, activation='softmax')(hidden51)
model = Model(inputs=inputs, outputs=[prediction1,prediction2,prediction3,prediction4,prediction5])
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
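To see how the tensor shapes shrink through this network, the arithmetic can be worked out by hand. The sketch below uses plain Python and hypothetical helper names (conv_same, pool2, dense); it assumes stride-1 convolutions with 'same' padding and 2x2 pooling with stride 2, as in the model above, and the totals should match what model.summary() reports.

```python
def conv_same(h, w, c_in, filters, k):
    """Output shape and parameter count of a stride-1 conv layer with 'same' padding."""
    params = k * k * c_in * filters + filters  # kernel weights + biases
    return (h, w, filters), params

def pool2(h, w, c):
    """Output shape of 2x2 max pooling with stride 2 (no trainable parameters)."""
    return (h // 2, w // 2, c)

def dense(n_in, n_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return n_in * n_out + n_out

shape, p_conv1 = conv_same(28, 140, 1, 32, 5)   # (28, 140, 32)
shape = pool2(*shape)                            # (14, 70, 32)
shape, p_conv2 = conv_same(*shape, 10, 3)        # (14, 70, 10)
shape = pool2(*shape)                            # (7, 35, 10)
flat = shape[0] * shape[1] * shape[2]            # length of the flattened vector
p_head = dense(flat, 15) + dense(15, 11)         # one of the five output heads
total = p_conv1 + p_conv2 + 5 * p_head
print(shape, flat, total)
```

Most of the parameters sit in the five Dense(15) layers that read the flattened feature map, which is why a small hidden size keeps the model light.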
Visualizing the network
This depends on pydot-ng and graphviz. If an error occurs, run pip install pydot-ng && brew install graphviz on the command line (with Homebrew).
On Windows, install graphviz separately and configure the environment.
from keras.utils.vis_utils import plot_model, model_to_dot
from IPython.display import Image, SVG
SVG(model_to_dot(model).create(prog='dot', format='svg'))
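Training the model
Train for 20 epochs; if accuracy on the validation set fails to improve twice in a row, reduce the learning rate. The GPU is not a great one, but training is still fast and finishes in roughly 20 minutes.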
from keras.callbacks import ReduceLROnPlateau
learnrate_reduce_1 = ReduceLROnPlateau(monitor='val_dense_2_acc', patience=2, verbose=1,factor=0.8, min_lr=0.00001)
learnrate_reduce_2 = ReduceLROnPlateau(monitor='val_dense_4_acc', patience=2, verbose=1,factor=0.8, min_lr=0.00001)
learnrate_reduce_3 = ReduceLROnPlateau(monitor='val_dense_6_acc', patience=2, verbose=1,factor=0.8, min_lr=0.00001)
learnrate_reduce_4 = ReduceLROnPlateau(monitor='val_dense_8_acc', patience=2, verbose=1,factor=0.8, min_lr=0.00001)
learnrate_reduce_5 = ReduceLROnPlateau(monitor='val_dense_10_acc', patience=2, verbose=1,factor=0.8, min_lr=0.00001)
model.fit(X_train, y_train, epochs=20, batch_size=128,
          validation_data=(X_valid, y_valid),
          callbacks=[learnrate_reduce_1,learnrate_reduce_2,learnrate_reduce_3,learnrate_reduce_4,learnrate_reduce_5])
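The plateau rule used by these callbacks can be illustrated without Keras. The function below is a simplified re-implementation written for this post, not Keras's own code: it ignores the cooldown and min_delta options, and the starting learning rate of 0.001 and the accuracy history are assumed values chosen for illustration.

```python
def simulate_reduce_lr(val_acc_history, lr=0.001, factor=0.8, patience=2, min_lr=1e-5):
    """Simplified plateau rule: shrink lr after `patience` epochs without a new best."""
    best, wait, lrs = float("-inf"), 0, []
    for acc in val_acc_history:
        if acc > best:
            # New best validation accuracy: remember it and reset the counter.
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: multiply lr by factor, but never go below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
        lrs.append(lr)
    return lrs

history = [0.90, 0.92, 0.92, 0.91, 0.93, 0.93, 0.93]  # made-up validation accuracies
lrs = simulate_reduce_lr(history)
print(lrs)
```

Because the training code registers five such callbacks, one per output, the learning rate may be reduced by more than one of them in the same epoch.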
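Computing model accuracy
A sample counts as correct only when all five positions are recognized correctly; a single wrong position makes it wrong. This could be checked sample by sample in a loop. Here I used a bit of probability instead: treating the five positions as independent events, the product of the five per-position accuracies estimates the model accuracy. Because errors at different positions of the same image are not guaranteed to be independent, this is an approximation rather than the exact all-correct rate.
def evaluate(model):
    # TODO: compute accuracy under the rule that one wrong position makes the whole sample wrong.
    # model.evaluate returns [total loss, 5 per-output losses, 5 per-output accuracies],
    # so indices 6..10 are the per-position accuracies.
    result = model.evaluate(np.array(X_test).reshape(len(X_test),28,140,1), [y_test[0], y_test[1], y_test[2], y_test[3], y_test[4]], batch_size=32)
    return result[6] * result[7] * result[8] * result[9] * result[10]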
evaluate(model)
This finally gives an accuracy of 0.9476.
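The all-or-nothing rule can also be computed directly by comparing predicted and true classes position by position. The sketch below is one possible way to do it, assuming predictions and labels share the layout used in this post (a list of n_len arrays of shape (N, n_class)); the function name sequence_accuracy and the tiny hand-made arrays are hypothetical.

```python
import numpy as np

def sequence_accuracy(y_true, y_pred):
    """Fraction of samples whose positions are all predicted correctly.

    y_true, y_pred: lists of arrays of shape (N, n_class), one array per position.
    """
    true_idx = np.stack([np.argmax(y, axis=1) for y in y_true], axis=1)  # (N, n_len)
    pred_idx = np.stack([np.argmax(y, axis=1) for y in y_pred], axis=1)  # (N, n_len)
    return float(np.mean(np.all(true_idx == pred_idx, axis=1)))

# Tiny hand-made example: 4 samples, 2 positions, 3 classes (illustrative only).
eye = np.eye(3)
y_true = [eye[[0, 1, 2, 0]], eye[[1, 1, 0, 2]]]
y_pred = [eye[[0, 1, 2, 1]], eye[[1, 1, 0, 1]]]  # the last sample is wrong at both positions

exact = sequence_accuracy(y_true, y_pred)
per_position = [float(np.mean(t.argmax(1) == p.argmax(1))) for t, p in zip(y_true, y_pred)]
product = per_position[0] * per_position[1]
print(exact, product)
```

In this toy case the errors are concentrated in one sample, so the product of per-position accuracies underestimates the exact rate. With the trained model, a call such as sequence_accuracy(y_test, model.predict(X_test)) would presumably give the exact figure.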
Visualizing predictions
def get_result(result):
    # Decode the one-hot encoding
    resultstr = ''
    for i in range(n_len):
        resultstr += str(np.argmax(result[i])) + ','
    return resultstr
index = random.randint(0, n_test-1)
y_pred = model.predict(X_test[index].reshape(1,28,140,1))
plt.title('real: %s\npred:%s'%(get_result([y_test[x][index] for x in range(n_len)]), get_result(y_pred)))
plt.imshow(X_test[index,:,:,0], cmap='gray')
plt.axis('off')
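The title produced by get_result still contains the blank class 10 for empty positions. If a plain digit string is preferred, the blanks can be dropped after decoding. This is a small sketch with a hypothetical helper, to_number_string, assuming blanks are encoded as class 10 as in the code above.

```python
BLANK = 10  # class index this post uses for an empty position

def to_number_string(class_indices, blank=BLANK):
    """Join predicted class indices into a digit string, skipping blank positions."""
    return "".join(str(int(c)) for c in class_indices if int(c) != blank)

print(to_number_string([3, 7, 10, 10, 10]))  # 37
print(to_number_string([0, 0, 4, 10, 10]))   # 004
```

Leading zeros are kept on purpose, since each position is a separate digit rather than part of an integer.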
Saving the model
model.save('model.h5')
The content above comes from the neural network knowledge-sharing sessions of the 822 Lab.