Recognizing the MNIST Dataset with TensorFlow

Having worked with TensorFlow for only a short while, I still feel I know too little about it, so I have been reluctant to write notes on its applications. A good way to learn code is to implement the same task in several different ways, which makes it much easier to compare the strengths and weaknesses of each approach. Below are several pieces of code that build different types of neural networks in TensorFlow to recognize MNIST handwritten digits, so that the way each network is constructed can be compared side by side. For readability I have added plenty of comments; I keep them here for my own reference, and the copyright of the code belongs to the respective authors.

For image data, a CNN is the natural first choice, so let's start with an implementation that builds the CNN from scratch:

# this cell's code is adapted from Udacity
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("./mnist", one_hot=True, reshape=False)

# parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# number of samples to calculate validation and accuracy
test_valid_size = 256

# network parameters
n_classes = 10
dropout = 0.75

# weights and biases
# the shape of the filter weight is (height, width, input_depth, output_depth)
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

# the shape of the filter bias is (output_depth,)
biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

# stride for each dimension (batch_size, input_height, input_width, depth)
# generally always set the stride for batch and input_channels
# i.e. the first and fourth element in the strides array to be 1
# This ensures that the model uses all batches and input channels
# It's good practice to remove the batches or channels you want to skip
# from the data set rather than use a stride to skip them
#
# tf.nn.conv2d requires the input be 4D (batch_size, height, width, depth)
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    # use tf.nn.bias_add to broadcast the 1-D bias across the channel dimension
    return tf.nn.relu(x)


def maxpool2d(x, k=2):
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')


def conv_net(x, weights, biases, dropout):
    # layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # fully connected layer - 7*7*64 to 1024
    # tensor.get_shape().as_list() will return the shape of the tensor as a list
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # output layer
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out


# session
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# model
logits = conv_net(x, weights, biases, keep_prob)

# define loss and optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# accuracy
correct_pred = tf.equal(tf.argmax(logits, axis=1), tf.argmax(y, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# initializing the variables
init = tf.global_variables_initializer()

# launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: dropout
            })

            # calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                  epoch + 1,
                  batch + 1,
                  loss,
                  valid_acc))

        test_acc = sess.run(accuracy, feed_dict={
            x: mnist.test.images[:test_valid_size],
            y: mnist.test.labels[:test_valid_size],
            keep_prob: 1.})
        print('Testing Accuracy: {}'.format(test_acc))
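
To sanity-check the layer shapes in the network above, the arithmetic can be reproduced with a few lines of plain Python. This is just a sketch of the 'SAME'-padding size rule (output = ceil(input / stride)), not part of the training code, and it confirms why 'wd1' expects a flattened input of 7*7*64 values:

import math

def same_conv_size(size, stride=1):
    # 'SAME' padding: spatial output size = ceil(input size / stride)
    return math.ceil(size / stride)

def same_pool_size(size, k=2):
    # 'SAME' max pooling with ksize = stride = k
    return math.ceil(size / k)

side = 28
side = same_pool_size(same_conv_size(side))   # after conv1 + pool1: 14
side = same_pool_size(same_conv_size(side))   # after conv2 + pool2: 7
print(side, side * side * 64)                 # 7 3136, i.e. 7*7*64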

TensorFlow also provides the more convenient tf.layers API; the code below uses it to build a CNN with the same architecture:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("./mnist", one_hot=True, reshape=False)

# parameters
learning_rate = 0.001
epochs = 10
batch_size = 128

# number of samples to calculate validation and accuracy
test_valid_size = 256

# network parameters
n_classes = 10
dropout = tf.placeholder(tf.float32)  # keep probability, fed at run time

# Input and target placeholders
inputs_ = tf.placeholder(tf.float32, (None, 28, 28, 1))
targets_ = tf.placeholder(tf.float32)

# build the conv2d graph with tf.layers.conv2d and tf.layers.max_pooling2d
# layer 1 - 28*28*1 to 14*14*32
conv1 = tf.layers.conv2d(inputs_, 32, (5, 5), padding='same', activation=tf.nn.relu)
maxpool1 = tf.layers.max_pooling2d(conv1, (2, 2), (2, 2))

# layer 2 - 14*14*32 to 7*7*64
conv2 = tf.layers.conv2d(maxpool1, 64, (5, 5), padding='same', activation=tf.nn.relu)
maxpool2 = tf.layers.max_pooling2d(conv2, (2, 2), (2, 2))

# Fully connected layer
flattened = tf.reshape(maxpool2, [-1, 7*7*64])
fc1 = tf.layers.dense(flattened, units=1024, activation=tf.nn.relu)
# tf.layers.dropout expects a drop rate rather than a keep probability,
# and only applies dropout when training=True
fc1 = tf.layers.dropout(fc1, rate=1 - dropout, training=True)

# output logits
logits = tf.layers.dense(fc1, units=n_classes)

# define loss and optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=targets_))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# accuracy
correct_pred = tf.equal(tf.argmax(logits, axis=1), tf.argmax(targets_, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# initializing the variables
init = tf.global_variables_initializer()

# launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                inputs_: batch_x,
                targets_: batch_y,
                dropout: 0.75})

            # calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                inputs_: batch_x,
                targets_: batch_y,
                dropout: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                inputs_: mnist.validation.images[:test_valid_size],
                targets_: mnist.validation.labels[:test_valid_size],
                dropout: 1.})

            print('Epoch {:>2}, Batch {:>3} - loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                  epoch + 1,
                  batch + 1,
                  loss,
                  valid_acc))

        test_acc = sess.run(accuracy, feed_dict={
            inputs_: mnist.test.images[:test_valid_size],
            targets_: mnist.test.labels[:test_valid_size],
            dropout: 1.})
        print('Testing Accuracy: {}'.format(test_acc))
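
One detail worth calling out when moving between the two APIs: tf.nn.dropout takes a keep probability, while tf.layers.dropout takes a drop rate, which is why the layer above uses rate=1 - dropout. A minimal sketch (TensorFlow 1.x graph mode) of the equivalence:

import tensorflow as tf

x = tf.ones([1, 4])
kept = tf.nn.dropout(x, keep_prob=0.75)                    # keeps roughly 75% of units
dropped = tf.layers.dropout(x, rate=0.25, training=True)   # drops roughly 25% of units

with tf.Session() as sess:
    print(sess.run([kept, dropped]))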

For comparison, here is the code that builds a standard multi-layer (fully connected) neural network in TensorFlow for the same recognition task; this network reaches a final accuracy of 82%. If you can read these pieces of code without difficulty, you have a solid grasp of basic TensorFlow usage. The code is again courtesy of Udacity.

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128 
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

n_hidden_layer = 256 # layer number of features

# Store layers weight & bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])

# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']), biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.matmul(layer_1, weights['out']) + biases['out']

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        # Display logs per epoch step
        if epoch % display_step == 0:
            c = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", \
                "{:.9f}".format(c))
    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(logits, axis=1), tf.argmax(y, axis=1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # Decrease test_size if you don't have enough memory
    test_size = 256
    print("Accuracy:", accuracy.eval({x: mnist.test.images[:test_size], y: mnist.test.labels[:test_size]}))

Besides CNNs, an RNN can also be used to recognize the MNIST dataset; the corresponding code is as follows:

# TensorFlow for RNN
# this cell's code is adapted from
# https://jasdeep06.github.io/posts/Understanding-LSTM-in-Tensorflow-MNIST/

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist", one_hot=True)

# define constant
# each 28 x 28 image is unrolled into 28 time steps
time_steps = 28
# hidden LSTM units
num_units = 128
# each input is a row of 28 pixels
input_size = 28
# learning rate for Adam
learning_rate = 0.001
# there are 10 classes in the labels
n_classes = 10
# batch_size
batch_size = 128

# weights and biases for output layer
out_weights = tf.Variable(tf.random_normal([num_units, n_classes]))
out_bias = tf.Variable(tf.random_normal([n_classes]))

# defining inputs and labels placeholders
x = tf.placeholder(tf.float32, [None, time_steps, input_size])
y = tf.placeholder(tf.float32, [None, n_classes])

# processing the input tensor from [batch_size, time_steps, n_input] to 
# a 'time_steps' length list of [batch_size, n_input] tensors
inputs = tf.unstack(x, time_steps, 1)

# defining the network
lstm_layer = tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1)
outputs, _ = tf.contrib.rnn.static_rnn(lstm_layer, inputs, dtype=tf.float32)

# converting last output of dimension [batch_size, num_units] to 
# [batch_size, n_classes] with matrix multiplication
prediction = tf.matmul(outputs[-1], out_weights) + out_bias

# defining loss and optimization
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# model evaluation
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# train the model
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    iter = 1
    while iter < 800:
        batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
        batch_x = batch_x.reshape((batch_size, time_steps, input_size))
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        
        if iter % 10 == 0:
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            losses = sess.run(loss, feed_dict={x: batch_x, y: batch_y})
            print("For iter ", iter)
            print("Accuracy ", accuracy)
            print("Loss ", losses)
            print("___________________")
        iter += 1
        
    # evaluate on a held-out test batch once training is finished
    test_data = mnist.test.images[:128].reshape((-1, time_steps, input_size))
    test_label = mnist.test.labels[:128]
    print("Test Accuracy ", sess.run(accuracy, feed_dict={x: test_data, y: test_label}))

With the parameter settings given in the code, the final accuracy surprisingly reaches 96%; RNNs really do seem to be able to handle anything.

Further Reading

  1. Understanding LSTM in Tensorflow (MNIST dataset), https://jasdeep06.github.io/posts/Understanding-LSTM-in-Tensorflow-MNIST/