Preparation
- keras
- tensorflow
- numpy
- PIL
Download the MNIST dataset
from keras.datasets import mnist
mnist.load_data(path)
path is the path where the downloaded dataset is saved.
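A minimal sketch of what the call returns, assuming the default cache file name 'mnist.npz' (any relative path is stored under ~/.keras/datasets):
from keras.datasets import mnist

# First call downloads the data and caches it as ~/.keras/datasets/mnist.npz;
# later calls just read the cached file.
(x_train, y_train), (x_test, y_test) = mnist.load_data(path='mnist.npz')
print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)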
Model architecture
model1.png
This model uses two Convolution2D layers, two MaxPooling2D layers, one Flatten layer, and two fully connected Dense layers; the activation function is relu and the optimizer is adam.
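As a side note, a diagram like model1.png can be produced with Keras' plot_model utility; a minimal sketch, assuming the pydot and graphviz packages are installed and the model below has already been trained and saved:
from keras.models import load_model
from keras.utils import plot_model

# Regenerate an architecture diagram like model1.png from the saved model
# (requires pydot and graphviz).
model = load_model('/Users/zhang/Desktop/my_model2.h5')
plot_model(model, to_file='model1.png', show_shapes=True)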
Training code
from keras.models import Sequential
from keras.layers import Convolution2D, Dense, Flatten, Activation, MaxPooling2D
from keras.utils import to_categorical
from keras.optimizers import Adam
from keras.datasets import mnist
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 28, 28, 1)
x_test = x_test.reshape(10000, 28, 28, 1)
y_test = to_categorical(y_test, 10)
y_train = to_categorical(y_train, 10)
# design model
model = Sequential()
model.add(Convolution2D(25, (5, 5), input_shape=(28, 28, 1)))
model.add(MaxPooling2D(2, 2))
model.add(Activation('relu'))
model.add(Convolution2D(50, (5, 5)))
model.add(MaxPooling2D(2, 2))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(50))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))
adam = Adam(lr=0.001)
# compile model
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
# training model
model.fit(x_train, y_train, batch_size=100, epochs=5)
# test model
print(model.evaluate(x_test, y_test, batch_size=100))
# save model
model.save('/Users/zhang/Desktop/my_model2.h5')
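Since the model is compiled with metrics=['accuracy'], model.evaluate returns the loss followed by the accuracy, which is why the raw list printed below contains two numbers. A small sketch that unpacks them instead (the variable names are mine):
# evaluate() returns [loss, accuracy] in the order given at compile time.
loss, acc = model.evaluate(x_test, y_test, batch_size=100)
print('test loss: %.4f, test accuracy: %.4f' % (loss, acc))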
Training results
Using TensorFlow backend.
Epoch 1/5
2017-09-12 14:49:32.779373: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-12 14:49:32.779389: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-12 14:49:32.779393: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-12 14:49:32.779398: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-12 14:49:32.779401: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
60000/60000 [==============================] - 33s - loss: 2.5862 - acc: 0.8057
Epoch 2/5
60000/60000 [==============================] - 32s - loss: 0.0603 - acc: 0.9820
Epoch 3/5
60000/60000 [==============================] - 32s - loss: 0.0409 - acc: 0.9873
Epoch 4/5
60000/60000 [==============================] - 32s - loss: 0.0338 - acc: 0.9895
Epoch 5/5
60000/60000 [==============================] - 33s - loss: 0.0259 - acc: 0.9922
9900/10000 [============================>.] - ETA: 0s[0.054905540546023986, 0.98440000832080843]
- As you can see, the recognition accuracy on the test set reaches 98.44%.
- There is actually an element of luck in training: with batch_size=100, if the optimizer does not head toward a good optimum from the start, it can get stuck in a dead end and the training accuracy stays at only about 9%.
- Reaching 80% accuracy after a single epoch like this may well lead to overfitting, even though the accuracy on the test set is 98% (a way to keep an eye on this during training is sketched below).
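One simple way to watch for such a gap during training is to pass validation data to fit; a minimal sketch that reuses the test split from the script above as validation data (a separate validation set would be cleaner, but this keeps the sketch short):
# val_loss / val_acc are reported after every epoch, so a widening gap
# between acc and val_acc would point to overfitting.
model.fit(x_train, y_train,
          batch_size=100,
          epochs=5,
          validation_data=(x_test, y_test))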
Testing
I drew a few handwritten-digit images in Photoshop to test with.
WX20170912-145816@2x.png
They are all 28×28 images.
The test code is as follows:
from keras.models import load_model
import numpy as np
from PIL import Image
def ImageToMatrix(filename):
    im = Image.open(filename)
    # convert to greyscale
    im = im.convert("L")
    data = im.getdata()
    data = np.matrix(data, dtype='int')
    return data
model = load_model('/Users/zhang/Desktop/my_model.h5')
while 1:
    i = input('number:')
    j = input('type:')
    data = ImageToMatrix('/Users/zhang/Desktop/picture/'+str(i)+'_'+str(j)+'.png')
    data = np.array(data)
    data = data.reshape(1, 28, 28, 1)
    print('test['+str(i)+'_'+str(j)+'], num='+str(i)+':')
    print(model.predict_classes(
        data, batch_size=1, verbose=0
    ))
A few selected results:
number:7
type:1
test[7_1], num=7:
[7]
number:7
type:2
test[7_2], num=7:
[7]
number:7
type:3
test[7_3], num=7:
[7]
number:2
type:1
test[2_1], num=2:
[2]
number:1
type:1
test[1_1], num=1:
[4]
number:1
type:2
test[1_2], num=1:
[1]
number:1
type:3
test[1_3], num=1:
[1]
number:6
type:1
test[6_1], num=6:
[5]
Summary:
- The model cannot correctly recognize digits drawn too small (this is related to the digit size in the training set).
- Digits such as '1' placed near the edge of the image, e.g. 1_1, are not recognized correctly.
- Naturally, digits that are upside down or lying on their side are not recognized correctly either.
- On the MNIST test set itself, roughly 200 out of the 10,000 images are misclassified (a way to count this is sketched right after this list).
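A small sketch of how that error count could be checked, assuming model, x_test and the one-hot encoded y_test from the training script are still in memory:
import numpy as np

# Predict a digit for every test image and count the disagreements
# with the true labels.
predictions = model.predict_classes(x_test, batch_size=100, verbose=0)
true_labels = np.argmax(y_test, axis=1)
errors = np.sum(predictions != true_labels)
print('misclassified: %d / %d' % (errors, len(true_labels)))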
Possible improvements
- Preprocess the incoming images (flipping, centering, scaling and so on) to make recognition easier.
- Augment the training set with digits in different orientations (at the cost of a larger training workload); a sketch using Keras' ImageDataGenerator follows this list.
- Improve the network layers themselves (I have not studied this in depth; this network was largely improvised, simply combining convolutional and fully connected layers).
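For the second point, Keras' ImageDataGenerator can rotate and shift the training images on the fly; a minimal sketch, assuming x_train and y_train are prepared exactly as in the training code above (the rotation and shift ranges are arbitrary values of mine, not tuned):
from keras.preprocessing.image import ImageDataGenerator

# Randomly rotate and shift the digits so the network sees more variation.
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1)
# Train on batches produced by the generator instead of the raw arrays.
model.fit_generator(datagen.flow(x_train, y_train, batch_size=100),
                    steps_per_epoch=len(x_train) // 100,
                    epochs=5)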