[Neural Networks, Really Understood This Time!] (5) Using Neural Networks to Recognize Handwritten Digits - Writing a Neural Network by Hand

English original: http://neuralnetworksanddeeplearning.com/
Parts of the original wording have been changed.

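Let's write a program that teaches a neural network to recognize handwritten digits, using stochastic gradient descent and the MNIST training data, both of which we have already introduced. We'll do this with a short Python program. The first thing we need is the MNIST data, which comes with the code repository. The repository can be cloned as follows: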

git clone https://github.com/mnielsen/neural-networks-and-deep-learning.git

Incidentally, when we described the MNIST data earlier, we said it was split into 60,000 training images and 10,000 test images. That's the official MNIST description. In practice, we're going to split the data a little differently. We'll keep the test images as they are, but split the 60,000-image MNIST training set into two parts: a set of 50,000 images, which we'll use to train the network (the training set), and a separate set of 10,000 images (the validation set). We won't use the validation data in this section, but later in the series we'll find it useful for figuring out how to set certain hyper-parameters of the neural network, such as the learning rate, which are not directly selected by our learning algorithm. Although the validation data isn't part of the original MNIST specification (MNIST does not define a 10,000-image validation set), many people use MNIST in this fashion, and the use of validation data is common in neural networks. From now on, when I refer to the "MNIST training data" I mean our 50,000-image training set, not the original 60,000-image data set.
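The split described above can be sketched with NumPy slicing. This is only an illustration: the arrays are zero-filled stand-ins with MNIST-like shapes rather than real images, and the names `images` and `labels`, as well as the choice of taking the last 10,000 entries for validation, are assumptions made here.

```python
import numpy as np

# Stand-in arrays shaped like the official 60,000-image training set.
# (Zeros only; a real run would load the actual MNIST pixels and labels.)
images = np.zeros((60000, 784), dtype=np.uint8)
labels = np.zeros(60000, dtype=np.uint8)

# Assumed split: first 50,000 entries for training, last 10,000 for validation.
train_images, val_images = images[:50000], images[50000:]
train_labels, val_labels = labels[:50000], labels[50000:]

print(train_images.shape, val_images.shape)  # (50000, 784) (10000, 784)
```

The official 10,000 test images are left untouched by this split.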

Apart from the MNIST data, we also need a Python library called Numpy, for doing fast linear algebra.

Before giving the complete code, let me explain the core class of the neural network code, the Network class, which we use to represent a neural network. Here's the code we use to initialize a Network object:

class Network(object):

    def __init__(self, sizes):
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x) 
                        for x, y in zip(sizes[:-1], sizes[1:])]

In this code, the list sizes contains the number of neurons in the respective layers. So, for example, if we want to create a Network object with 2 neurons in the first layer, 3 neurons in the second layer, and 1 neuron in the final layer, we'd do this with the code:

net = Network([2, 3, 1])

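The weights and biases in the Network object are all initialized randomly, using the np.random.randn function to draw from a Gaussian distribution with mean 0 and standard deviation 1. This random initialization gives our stochastic gradient descent algorithm a place to start from. In later chapters we'll find better ways of initializing the weights and biases, but this will do for now. Note that the Network initialization code assumes that the first layer of neurons is an input layer, and omits any biases for those neurons, since biases are only ever used in computing the outputs of later layers.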
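To make the initialization concrete, the sketch below repeats the two list comprehensions from `__init__` outside the class and prints the resulting array shapes for the `[2, 3, 1]` example. The standalone variable names are just for illustration.

```python
import numpy as np

sizes = [2, 3, 1]  # same layer sizes as Network([2, 3, 1])

# Same list comprehensions as in Network.__init__.
biases = [np.random.randn(y, 1) for y in sizes[1:]]
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

bias_shapes = [b.shape for b in biases]
weight_shapes = [w.shape for w in weights]
print(bias_shapes)    # [(3, 1), (1, 1)]
print(weight_shapes)  # [(3, 2), (1, 3)]
```

There is one bias vector and one weight matrix per layer after the input layer, and each weight matrix has one row per neuron in the later layer and one column per neuron in the earlier layer.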

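Note also that the biases and weights are stored as lists of Numpy matrices. So, for example, net.weights[1] is a Numpy matrix storing the weights connecting the second and third layers of neurons. (It's not the first and second layers, since Python's list indexing starts at 0.) Since net.weights[1] is rather verbose, let's just denote that matrix $w$. The entry $w_{jk}$ is the weight for the connection between the $k$-th neuron in the second layer and the $j$-th neuron in the third layer. The vector of activations of the third layer of neurons is then:
$a' = \sigma(wa + b)$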

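Here, $a$ is the vector of activations of the second layer of neurons. To obtain $a'$ we multiply $a$ by the weight matrix $w$ (a matrix product) and add the vector $b$ of biases. We then apply the function $\sigma$ elementwise to every entry of the vector $wa + b$. (This is called vectorizing the function $\sigma$.) The equation $a' = \sigma(wa + b)$ gives the same result as our earlier rule, $\frac{1}{1 + e^{-(\sum_{j} w_j x_j + b)}}$, for computing the output of a sigmoid neuron.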
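As a quick check of that claim, the following sketch compares the vectorized form with an explicit per-neuron loop on a small random layer. The layer sizes (2 neurons feeding 3) and the variable names are arbitrary choices made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.random.randn(3, 2)   # 3 neurons in the later layer, 2 in the earlier one
b = np.random.randn(3, 1)   # one bias per neuron in the later layer
a = np.random.randn(2, 1)   # activations of the earlier layer

# Vectorized form: a' = sigmoid(w a + b)
a_vec = sigmoid(np.dot(w, a) + b)

# Per-neuron form: a'_j = 1 / (1 + exp(-(sum_k w_jk a_k + b_j)))
a_loop = np.zeros((3, 1))
for j in range(3):
    z = sum(w[j, k] * a[k, 0] for k in range(2)) + b[j, 0]
    a_loop[j, 0] = 1.0 / (1.0 + np.exp(-z))

print(np.allclose(a_vec, a_loop))  # True
```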

With vectorization in mind, it's easy to write code that computes the output of a Network instance. We begin by defining the sigmoid function:

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

Note that when the input z is a vector or Numpy array, Numpy automatically applies the function sigmoid elementwise.
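A small illustration of this elementwise behaviour, using an arbitrary three-entry column vector chosen for this example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([[-2.0], [0.0], [2.0]])
out = sigmoid(z)

print(out.shape)   # (3, 1): same shape as the input
print(out[1, 0])   # 0.5, since sigmoid(0) = 1 / (1 + 1)
```

The output keeps the shape of the input, and each entry is the sigmoid of the corresponding input entry.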

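We then add a feedforward method to the Network class which, given an input $a$ for the network, returns the corresponding output by applying the equation $a' = \sigma(wa + b)$ at each layer. (It is assumed that the input $a$ is an ndarray of shape $(n, 1)$, not an $(n,)$ vector. Here, $n$ is the number of inputs to the network. If you try to use an $(n,)$ vector as input you'll get strange results. Although an $(n,)$ vector may look like the more natural choice, using an $(n, 1)$ ndarray makes it particularly easy to modify the code to feed forward multiple inputs at once, which is convenient.):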

    def feedforward(self, a):
        """Return the output of the network if "a" is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a
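The "strange results" from an $(n,)$ input come from NumPy broadcasting. The sketch below uses a standalone copy of the feedforward loop (a free function written for this illustration, not the method itself) to compare a column input, a flat input, and a batch of inputs stacked as columns.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(biases, weights, a):
    """Standalone copy of the feedforward loop, for illustration only."""
    for b, w in zip(biases, weights):
        a = sigmoid(np.dot(w, a) + b)
    return a

sizes = [2, 3, 1]
biases = [np.random.randn(y, 1) for y in sizes[1:]]
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

column = np.array([[0.5], [-1.0]])   # shape (2, 1): what the code expects
flat = np.array([0.5, -1.0])         # shape (2,): looks natural, but misbehaves
batch = np.random.randn(2, 5)        # 5 inputs stacked as columns

out_column = feedforward(biases, weights, column)
out_flat = feedforward(biases, weights, flat)
out_batch = feedforward(biases, weights, batch)

print(out_column.shape)  # (1, 1)
print(out_flat.shape)    # (1, 3): the (3,) product was broadcast against (3, 1)
print(out_batch.shape)   # (1, 5): one output column per input column
```

With the flat input, `np.dot(w, a)` has shape `(3,)`, and adding the `(3, 1)` bias silently broadcasts it to a `(3, 3)` array, so no error is raised but the result is not the network's output.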

Of course, the main thing we want our Network objects to do is to learn. To that end we'll give them an SGD method which implements stochastic gradient descent. It's a little mysterious in a few places, but I'll break it down later.

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The "training_data" is a list of tuples
        "(x, y)" representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If "test_data" is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in range(epochs):
            # Reshuffle every epoch so the mini-batches differ between epochs.
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                # One gradient descent step per mini-batch.
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test))
            else:
                print("Epoch {0} complete".format(j))
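The shuffling and slicing step inside SGD can be tried in isolation. In the sketch below, `make_mini_batches` is a hypothetical helper written for this example (the SGD method does the same work inline), and the 25 toy pairs stand in for real `(x, y)` training tuples.

```python
import random

def make_mini_batches(training_data, mini_batch_size):
    """Shuffle a copy of the data and cut it into consecutive slices."""
    data = list(training_data)
    random.shuffle(data)
    n = len(data)
    return [data[k:k + mini_batch_size] for k in range(0, n, mini_batch_size)]

toy_data = [(x, x % 2) for x in range(25)]   # 25 fake (input, label) pairs
batches = make_mini_batches(toy_data, 10)

print([len(b) for b in batches])  # [10, 10, 5]
```

When the data size is not a multiple of the mini-batch size, the last mini-batch is simply smaller. Every training example appears in exactly one mini-batch per epoch.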

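How well does the program recognize handwritten digits? Well, let's start by loading the MNIST data. I'll do this using a little helper program, mnist_loader.py, described below. We execute the following commands in a Python shell (they could equally go in a program file):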

import mnist_loader
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

After loading the MNIST data, we'll set up a network with 30 hidden neurons. We do this after importing the network module, which contains the Network class listed above:

import network
net = network.Network([784, 30, 10])

Finally, we'll use stochastic gradient descent to learn from the MNIST training_data over 30 epochs, with a mini-batch size of 10 and a learning rate of $\eta = 3.0$:

net.SGD(training_data, 30, 10, 3.0, test_data=test_data)

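If you're in a hurry you can speed things up by decreasing the number of epochs, by decreasing the number of hidden neurons, or by using only part of the training data. Note that these Python scripts are intended to help you understand how neural networks work; they are not high-performance code! Of course, once we've trained a network it can be run very quickly indeed, on almost any computing platform. For example, once we've learned a good set of weights and biases for a network, it can easily be ported to run in Javascript in a web browser, or as a native app on a mobile device. The transcript below shows, after each epoch of training, how many of the test images the network classifies correctly. As you can see, after just a single epoch this has reached 9,129 out of 10,000, and the number continues to grow: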

Epoch 0: 9129 / 10000
Epoch 1: 9295 / 10000
Epoch 2: 9348 / 10000
...
Epoch 27: 9528 / 10000
Epoch 28: 9542 / 10000
Epoch 29: 9534 / 10000

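That is, the trained network gives us a classification rate of about 95.42 percent at its peak ("Epoch 28")! That's quite encouraging as a first attempt. I should warn you, however, that if you run the code then your results won't necessarily be exactly the same as mine, since we initialize our network using (different) random weights and biases. To generate the results in this chapter I took the best of three runs.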
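If you would rather get identical results from run to run, one option is to seed the random number generator before initializing the network. The sketch below is not part of the original code: `init_params` is a hypothetical helper that seeds NumPy and then repeats the initialization from `Network.__init__`. Note that SGD also shuffles the data with Python's `random` module, which would need its own `random.seed(...)` call for a fully repeatable run.

```python
import numpy as np

def init_params(sizes, seed):
    """Hypothetical helper: seed NumPy, then initialize as Network.__init__ does."""
    np.random.seed(seed)
    biases = [np.random.randn(y, 1) for y in sizes[1:]]
    weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
    return biases, weights

b1, w1 = init_params([784, 30, 10], seed=0)
b2, w2 = init_params([784, 30, 10], seed=0)
b3, w3 = init_params([784, 30, 10], seed=1)

same = all(np.array_equal(x, y) for x, y in zip(w1, w2))
different = any(not np.array_equal(x, y) for x, y in zip(w1, w3))
print(same, different)  # True True
```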

Let's rerun the above experiment, changing the number of hidden neurons to 100. This may take longer to run.

net = network.Network([784, 100, 10])
net.SGD(training_data, 30, 10, 3.0, test_data=test_data)

Sure enough, this improves the results to 96.59 percent. At least in this case, using more hidden neurons helps us get better results.

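Of course, to obtain these accuracies I had to make specific choices for the number of training epochs, the mini-batch size, and the learning rate $\eta$. As I mentioned above, these are known as the hyper-parameters of our neural network, to distinguish them from the parameters (weights and biases) learned by the network. If we choose our hyper-parameters poorly, we can get bad results. Suppose, for example, that we'd chosen the learning rate to be $\eta = 0.001$: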

net = network.Network([784, 100, 10])
net.SGD(training_data, 30, 10, 0.001, test_data=test_data)

The results are much less encouraging:

Epoch 0: 1139 / 10000
Epoch 1: 1136 / 10000
Epoch 2: 1135 / 10000
...
Epoch 27: 2101 / 10000
Epoch 28: 2123 / 10000
Epoch 29: 2142 / 10000

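However, you can see that the performance of the network is slowly getting better over time. That suggests increasing the learning rate, say to $\eta = 0.01$. If we do that, we get better results, which suggests increasing the learning rate again. (If making a change improves things, try doing more!) If we do that several times over, we'll end up with a learning rate of something like $\eta = 1.0$ (and perhaps fine-tune it to 3.0), which is close to our earlier experiments. So even though we initially made a poor choice of hyper-parameters, we at least got enough information to help us improve our choice.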

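In general, debugging a neural network can be challenging. This is especially true when the initial choice of hyper-parameters produces results no better than random noise. Suppose we try the successful 30-hidden-neuron network architecture from earlier, but with the learning rate changed to $\eta = 100.0$: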

net = network.Network([784, 30, 10])
net.SGD(training_data, 30, 10, 100.0, test_data=test_data)

The learning rate is now too high, and the results are even worse:

Epoch 0: 1009 / 10000
Epoch 1: 1009 / 10000
Epoch 2: 1009 / 10000
Epoch 3: 1009 / 10000
...
Epoch 27: 982 / 10000
Epoch 28: 982 / 10000
Epoch 29: 982 / 10000

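Of course, we know from our earlier experiments that the right thing to do is to decrease the learning rate. But if we were coming to this problem for the first time, we might worry not only about the learning rate, but about every other aspect of our neural network. We might wonder whether we've initialized the weights and biases in a way that makes it hard for the network to learn. Or maybe we don't have enough training data to get meaningful learning? Perhaps we haven't run for enough epochs? Or maybe a neural network with this architecture cannot learn to recognize handwritten digits? Maybe the learning rate is too low? Or maybe it is too high? When you're coming to a problem for the first time, the cause is not immediately obvious.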
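The qualitative effect of the learning rate can be reproduced on a much simpler problem. The sketch below is a toy, not the network above: it runs plain gradient descent on the one-dimensional function $f(x) = x^2$, and the three values of `eta` are chosen to suit this toy problem only, so they do not correspond to the values of $\eta$ used in the MNIST experiments.

```python
def gradient_descent(eta, steps=30, x0=1.0):
    """Minimize f(x) = x**2 (gradient 2*x) starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - eta * 2.0 * x
    return x

# Distance from the minimum at x = 0 after 30 steps, for three learning rates.
for eta in (0.001, 0.3, 1.5):
    print(eta, abs(gradient_descent(eta)))
```

With the smallest rate the iterate has barely moved after 30 steps, with the middle rate it has essentially reached the minimum, and with the largest rate each step overshoots and the iterate moves further away. In the network, an overly large learning rate tends to show up as accuracy stuck near chance rather than as a visible blow-up, but the underlying trade-off is the same.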

Debugging neural networks is not trivial and, just as with ordinary programming, there is an art to it. You need to learn that art of debugging in order to get good results from neural networks.

Earlier, I skipped over the details of how the MNIST data is loaded. It's pretty straightforward. For completeness, here's the code:

"""
mnist_loader
~~~~~~~~~~~~

A library to load the MNIST image data.  For details of the data
structures that are returned, see the doc strings for ``load_data``
and ``load_data_wrapper``.  In practice, ``load_data_wrapper`` is the
function usually called by our neural network code.
"""

#### Libraries
# Standard library
import gzip
import pickle

# Third-party libraries
import numpy as np

def load_data():
    """Return the MNIST data as a tuple containing the training data,
    the validation data, and the test data.

    The ``training_data`` is returned as a tuple with two entries.
    The first entry contains the actual training images.  This is a
    numpy ndarray with 50,000 entries.  Each entry is, in turn, a
    numpy ndarray with 784 values, representing the 28 * 28 = 784
    pixels in a single MNIST image.

    The second entry in the ``training_data`` tuple is a numpy ndarray
    containing 50,000 entries.  Those entries are just the digit
    values (0...9) for the corresponding images contained in the first
    entry of the tuple.

    The ``validation_data`` and ``test_data`` are similar, except
    each contains only 10,000 images.

    This is a nice data format, but for use in neural networks it's
    helpful to modify the format of the ``training_data`` a little.
    That's done in the wrapper function ``load_data_wrapper()``, see
    below.
    """
    # The original code used Python 2's cPickle; on Python 3,
    # encoding='latin1' lets pickle read the NumPy arrays in the archive.
    with gzip.open('../data/mnist.pkl.gz', 'rb') as f:
        training_data, validation_data, test_data = pickle.load(
            f, encoding='latin1')
    return (training_data, validation_data, test_data)

def load_data_wrapper():
    """Return a tuple containing ``(training_data, validation_data,
    test_data)``. Based on ``load_data``, but the format is more
    convenient for use in our implementation of neural networks.

    In particular, ``training_data`` is a list containing 50,000
    2-tuples ``(x, y)``.  ``x`` is a 784-dimensional numpy.ndarray
    containing the input image.  ``y`` is a 10-dimensional
    numpy.ndarray representing the unit vector corresponding to the
    correct digit for ``x``.

    ``validation_data`` and ``test_data`` are lists containing 10,000
    2-tuples ``(x, y)``.  In each case, ``x`` is a 784-dimensional
    numpy.ndarray containing the input image, and ``y`` is the
    corresponding classification, i.e., the digit values (integers)
    corresponding to ``x``.

    Obviously, this means we're using slightly different formats for
    the training data and the validation / test data.  These formats
    turn out to be the most convenient for use in our neural network
    code."""
    tr_d, va_d, te_d = load_data()
    training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]]
    training_results = [vectorized_result(y) for y in tr_d[1]]
    # list(...) keeps len(), slicing and shuffling working on Python 3,
    # where zip returns a one-shot iterator rather than a list.
    training_data = list(zip(training_inputs, training_results))
    validation_inputs = [np.reshape(x, (784, 1)) for x in va_d[0]]
    validation_data = list(zip(validation_inputs, va_d[1]))
    test_inputs = [np.reshape(x, (784, 1)) for x in te_d[0]]
    test_data = list(zip(test_inputs, te_d[1]))
    return (training_data, validation_data, test_data)

def vectorized_result(j):
    """Return a 10-dimensional unit vector with a 1.0 in the jth
    position and zeroes elsewhere.  This is used to convert a digit
    (0...9) into a corresponding desired output from the neural
    network."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e
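As a small usage illustration, the sketch below calls a copy of `vectorized_result` on the digit 3 and then recovers the digit with `np.argmax`. Using `np.argmax` to go back from a vector to a digit is an assumption made for this example; it is not shown in the loader code.

```python
import numpy as np

def vectorized_result(j):
    """Copy of the loader's helper: 10x1 unit vector with 1.0 at index j."""
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e

y = vectorized_result(3)
digit = int(np.argmax(y))   # index of the largest entry

print(y.shape)   # (10, 1)
print(digit)     # 3
```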

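Our program gets good results. But good compared to what? It helps to have some simple (non-neural-network) baseline tests to compare against, to understand what it means to perform well. The simplest baseline of all, of course, is to guess the digit at random. That will be right about ten percent of the time. We're doing much better than that!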
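The ten percent figure is easy to confirm with a quick simulation. The sketch below uses made-up, uniformly distributed labels rather than the real MNIST test labels, so it only illustrates the arithmetic of guessing among ten classes.

```python
import random

random.seed(0)
labels = [random.randrange(10) for _ in range(10000)]    # made-up "true" digits
guesses = [random.randrange(10) for _ in range(10000)]   # uniform random guesses

accuracy = sum(g == t for g, t in zip(guesses, labels)) / len(labels)
print(accuracy)   # close to 0.1
```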

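What about a less trivial baseline? Let's try an extremely simple idea: we'll look at how dark an image is. For instance, an image of a 2 will typically be quite a bit darker than an image of a 1, simply because more pixels are blackened out.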

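This suggests using the training data to compute the average darkness for each digit, $0, 1, 2, \ldots, 9$. When presented with a new image, we compute how dark the image is, and then guess that it's whichever digit has the closest average darkness. This is a simple procedure and is easy to code up. Still, it's a big improvement over random guessing, getting 2,225 of the 10,000 test images correct, i.e., 22.25 percent accuracy.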

"""
mnist_average_darkness
~~~~~~~~~~~~~~~~~~~~~~

A naive classifier for recognizing handwritten digits from the MNIST
data set.  The program classifies digits based on how dark they are
--- the idea is that digits like "1" tend to be less dark than digits
like "8", simply because the latter has a more complex shape.  When
shown an image the classifier returns whichever digit in the training
data had the closest average darkness.

The program works in two steps: first it trains the classifier, and
then it applies the classifier to the MNIST test data to see how many
digits are correctly classified.

Needless to say, this isn't a very good way of recognizing handwritten
digits!  Still, it's useful to show what sort of performance we get
from naive ideas."""

#### Libraries
# Standard library
from collections import defaultdict

# My libraries
import mnist_loader

def main():
    training_data, validation_data, test_data = mnist_loader.load_data()
    # training phase: compute the average darknesses for each digit,
    # based on the training data
    avgs = avg_darknesses(training_data)
    # testing phase: see how many of the test images are classified
    # correctly
    num_correct = sum(int(guess_digit(image, avgs) == digit)
                      for image, digit in zip(test_data[0], test_data[1]))
    print("Baseline classifier using average darkness of image.")
    print("{0} of {1} values correct.".format(num_correct, len(test_data[1])))

def avg_darknesses(training_data):
    """ Return a defaultdict whose keys are the digits 0 through 9.
    For each digit we compute a value which is the average darkness of
    training images containing that digit.  The darkness for any
    particular image is just the sum of the darknesses for each pixel."""
    digit_counts = defaultdict(int)
    darknesses = defaultdict(float)
    for image, digit in zip(training_data[0], training_data[1]):
        digit_counts[digit] += 1
        darknesses[digit] += sum(image)
    avgs = defaultdict(float)
    for digit, n in digit_counts.items():
        avgs[digit] = darknesses[digit] / n
    return avgs

def guess_digit(image, avgs):
    """Return the digit whose average darkness in the training data is
    closest to the darkness of ``image``.  Note that ``avgs`` is
    assumed to be a defaultdict whose keys are 0...9, and whose values
    are the corresponding average darknesses across the training data."""
    darkness = sum(image)
    distances = {k: abs(v-darkness) for k, v in avgs.items()}
    return min(distances, key=distances.get)

if __name__ == "__main__":
    main()

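It's not difficult to find other ideas that achieve accuracies in the 20 to 50 percent range. If you work a bit harder you can get above 50 percent. But to get much higher accuracies it helps to use established machine learning algorithms. Let's try one of the best known algorithms, the support vector machine or SVM. If you're not familiar with SVMs, don't worry: we won't need to understand the details of how SVMs work. Instead, we'll use a Python library called scikit-learn.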

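If we run scikit-learn's SVM classifier using the default settings, it gets 9,435 of the 10,000 test images correct. That means the SVM performs roughly as well as our neural networks (94.35 percent, against the 95.42 and 96.59 percent reported above). In later chapters we'll introduce new techniques that enable us to improve our neural networks so that they perform much better than the SVM.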
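For readers who want to see what "default settings" looks like in code, here is a minimal sketch using `sklearn.svm.SVC`. To stay self-contained it uses scikit-learn's small bundled 8x8 digits data set as a stand-in, not the 28x28 MNIST images, and the train/test split is an arbitrary choice made here, so the accuracy it prints is not comparable to the 9,435 figure above. With MNIST, the image and label arrays returned by `mnist_loader.load_data()` would take the place of the stand-in arrays.

```python
from sklearn import datasets, svm

# Stand-in data: scikit-learn's bundled 8x8 digits, NOT the MNIST set.
digits = datasets.load_digits()
X = digits.data / 16.0          # pixel values are 0..16; scale to [0, 1]
y = digits.target

# Assumed split for illustration: first 1,500 samples train, the rest test.
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

clf = svm.SVC()                 # default settings, no tuning
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

num_correct = int((predictions == y_test).sum())
accuracy = num_correct / len(y_test)
print("{0} of {1} values correct.".format(num_correct, len(y_test)))
```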

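However, the story doesn't end there. The 94.35 percent result is for scikit-learn's default SVM settings. SVMs have a number of tunable parameters. If you'd like to know more, see the blog post by Andreas Mueller on the subject. Mueller shows that with some work optimizing the SVM's parameters it's possible to get the performance up above 98.5 percent accuracy. In other words, a well-tuned SVM makes an error on only about one digit in 70. Can neural networks do better?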
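The "one digit in 70" figure follows directly from the accuracy. The sketch below does the conversion for a 10,000-image test set; `one_error_in` is a hypothetical helper name, and 9,850 correct is simply 98.5 percent of 10,000.

```python
def one_error_in(correct, total):
    """Return N such that the classifier errs on roughly one image in N."""
    errors = total - correct
    return total / errors

tuned_svm = one_error_in(9850, 10000)     # 98.5 percent accuracy
default_svm = one_error_in(9435, 10000)   # 94.35 percent accuracy

print(round(tuned_svm))    # 67
print(round(default_svm))  # 18
```

So tuning takes the SVM from roughly one error in 18 images to roughly one in 67, which the text rounds to one in 70.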

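At present, well-designed neural networks outperform every other technique for solving MNIST, including SVMs. The record as of 2013 classified 9,979 of the 10,000 images correctly. This was done by Li Wan, Matthew Zeiler, Sixin Zhang, Yann LeCun, and Rob Fergus. We'll see most of the techniques they used later in the book. At that level the performance is close to human-equivalent, and arguably better, since quite a few of the MNIST images are difficult even for humans to recognize with confidence, for example: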

[Image: examples of MNIST digits that are hard to classify]

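I trust you'll agree that those images are tough to classify! When programming, we usually believe that solving a complicated problem like recognizing the MNIST digits requires a sophisticated algorithm. Yet even the neural networks in the Wan et al. paper just mentioned involve quite simple algorithms, variations on the algorithm we've seen in this chapter. All the complexity is learned, automatically, from the training data.
$\text{sophisticated algorithm} \leq \text{simple learning algorithm} + \text{good training data}$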

Last edited: 2021.11.02 08:29:50
