Li Li: A Theano Tutorial and a Theano Implementation of Convolutional Neural Networks, Part 2

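This series is aimed at deep learning practitioners. Using Image Caption Generation, an interesting and concrete task, it introduces deep learning in an accessible way. The series covers many popular deep learning models, such as CNN, RNN/LSTM, and Attention. This is the 9th article.

Author: Li Li

He currently works at Huanxin (Easemob), an instant-messaging cloud platform and omni-media intelligent customer service platform, where he works on intelligent customer service and chatbots and focuses on using deep learning to improve chatbot performance.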

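Li Li: Understanding Deep Learning through Image Caption Generation (part II)

Li Li: Understanding Deep Learning through Image Caption Generation (part III)

Li Li: Automatic Gradient Computation, Another View of the Backpropagation Algorithm

Li Li: Automatic Gradient Computation, the cs231n notes

Li Li: Automatic Gradient Computation, Implementing a Multi-Layer Neural Network with Automatic Differentiation

Li Li: Convolutional Neural Networks Explained in Detail

Li Li: A Theano Tutorial and a Theano Implementation of Convolutional Neural Networks, Part 1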

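Continuing from the previous article.

7. Implementing a CNN with Theano

We now pick up where the previous article left off and read the code in network3.py to see how a CNN is implemented with Theano.

The complete code is available here.

7.1 The FullyConnectedLayer class

First, let's look at how to implement a fully connected layer with Theano.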

    class FullyConnectedLayer(object):

        def __init__(self, n_in, n_out, activation_fn=sigmoid, p_dropout=0.0):
            self.n_in = n_in
            self.n_out = n_out
            self.activation_fn = activation_fn
            self.p_dropout = p_dropout
            # Initialize weights and biases
            self.w = theano.shared(
                np.asarray(
                    np.random.normal(
                        loc=0.0, scale=np.sqrt(1.0/n_out), size=(n_in, n_out)),
                    dtype=theano.config.floatX),
                name='w', borrow=True)
            self.b = theano.shared(
                np.asarray(np.random.normal(loc=0.0, scale=1.0, size=(n_out,)),
                           dtype=theano.config.floatX),
                name='b', borrow=True)
            self.params = [self.w, self.b]

        def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
            self.inpt = inpt.reshape((mini_batch_size, self.n_in))
            self.output = self.activation_fn(
                (1-self.p_dropout)*T.dot(self.inpt, self.w) + self.b)
            self.y_out = T.argmax(self.output, axis=1)
            self.inpt_dropout = dropout_layer(
                inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)
            self.output_dropout = self.activation_fn(
                T.dot(self.inpt_dropout, self.w) + self.b)

        def accuracy(self, y):
            "Return the accuracy for the mini-batch."
            return T.mean(T.eq(y, self.y_out))

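7.1.1 __init__

The constructor of FullyConnectedLayer mainly defines the shared variables w and b and initializes them randomly. Parameter initialization matters a great deal: it affects how fast the model converges, and even whether it converges at all. Here w is drawn from a Gaussian with mean 0 and standard deviation sqrt(1.0/n_out), while b is drawn from a Gaussian with mean 0 and standard deviation 1.0. Interested readers can find more details here.

The code also uses np.asarray. np.random.normal produces an ndarray of shape (n_in, n_out) whose dtype is float64, but so that it can (possibly) run on the GPU we need theano.config.floatX, hence the call to np.asarray. It differs from np.array in that it reuses the memory of the array passed in whenever possible instead of making a deep copy. When the requested dtype differs from the dtype of the input, as it does for float64 to float32, a converted copy is still made.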
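The copy behaviour of np.asarray can be checked with a small NumPy sketch. The variable names are illustrative and not part of network3.py, and "float32" is only an assumed stand-in for theano.config.floatX.

```python
import numpy as np

# Stand-in for theano.config.floatX (assumed to be "float32" here).
floatX = "float32"

w64 = np.random.normal(loc=0.0, scale=1.0, size=(3, 2))  # dtype is float64

# Same dtype requested: np.asarray hands back the very same array, no copy.
same = np.asarray(w64, dtype="float64")
# np.array copies by default, even when nothing needs converting.
copied = np.array(w64, dtype="float64")
# Different dtype requested: a converted copy cannot be avoided.
w32 = np.asarray(w64, dtype=floatX)

print(same is w64)                             # True
print(copied is w64)                           # False
print(w32.dtype, np.shares_memory(w32, w64))   # float32 False
```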

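The constructor also stores the activation function activation_fn and the dropout probability on self. activation_fn is a function. Readers used to statically typed languages may find this unfamiliar, but it can be thought of as a C function pointer or a lambda in a functional programming language. Finally, __init__ stores the parameters in self.params, so that when many layers are later assembled into one large Network, all parameters can be collected simply by iterating over the params of each layer.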

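7.1.2 set_inpt

The set_inpt function sets the input of this layer and computes its output. The variable is named inpt rather than input because input is a Python built-in function, and the name would be confusing. Note that the input is set in two ways: self.inpt and self.inpt_dropout. The reason is that we need dropout during training. We use a dropout_layer, which randomly sets the outputs of a fraction p_dropout of the neurons to 0. At test time this dropout_layer is not needed, but we must remember to multiply the output by (1-p_dropout): during training a fraction p_dropout of the neurons is dropped at random, while at test time none are dropped, so the output would be larger than during training. Multiplying by (1-p_dropout) mimics the effect of dropping. [There is another way to do dropout: divide the output by (1-p_dropout) during training, so that nothing needs to be multiplied by (1-p_dropout) at prediction time. Interested readers can refer here.]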

    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        self.inpt = inpt.reshape((mini_batch_size, self.n_in))
        self.output = self.activation_fn(
            (1-self.p_dropout)*T.dot(self.inpt, self.w) + self.b)
        self.y_out = T.argmax(self.output, axis=1)
        self.inpt_dropout = dropout_layer(
            inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)
        self.output_dropout = self.activation_fn(
            T.dot(self.inpt_dropout, self.w) + self.b)

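Let's go through it line by line.

1. Reshape inpt

First the input is reshaped to (batch_size, n_in). Why reshape? In a CNN we usually add a fully connected layer after the last convolution-pooling layer, and the output of that layer is a 4-D tensor of shape (batch_size, num_filter, width, height), which has to be reshaped to (batch_size, num_filter * width * height). Of course, when defining the network we must specify n_in = num_filter * width * height, otherwise the shapes will not match.

2. Define output

Next we define self.output. It is an affine transformation in which the weighted input T.dot(self.inpt, self.w) is multiplied by (1-p_dropout), for the reason explained above, followed by the activation function. This is the input/output pair used at prediction time. [Some readers may wonder, as I did on first reading, whether inpt and inpt_dropout are really both passed in when this function is called. In Theano we only "define" symbolic variables, and thereby the computation graph, so nothing is actually computed here. The training function is built from the cost, which uses inpt_dropout and output_dropout, while the Theano function used for testing is accuracy, which uses inpt, output, and y_out.]

3. Define y_out

This computes the final output, that is, the predicted class when this layer is the last layer of the network. ConvPoolLayer does not implement y_out, because a convolutional layer is never used as the output layer of the network. A fully connected layer may serve as the output layer, so argmax is used to pick the index with the largest value as the output. SoftmaxLayer is frequently used as the output layer, so it implements y_out as well.

4. inpt_dropout: reshape first, then add a dropout op, which randomly sets the outputs of some neurons to 0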

    def dropout_layer(layer, p_dropout):
        srng = shared_randomstreams.RandomStreams(
            np.random.RandomState(0).randint(999999))
        mask = srng.binomial(n=1, p=1-p_dropout, size=layer.shape)
        return layer*T.cast(mask, theano.config.floatX)

5. Define output_dropout

This is computed directly from inpt_dropout, without the (1-p_dropout) scaling.
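To see why the test-time path multiplies by (1-p_dropout), here is a minimal NumPy sketch of the two paths. It is not the Theano code: dropout_layer_np is a hypothetical NumPy analogue of dropout_layer, and the inputs and weights are made-up numbers.

```python
import numpy as np

rng = np.random.RandomState(0)
p_dropout = 0.5

def dropout_layer_np(layer, p_dropout, rng):
    """Hypothetical NumPy analogue of dropout_layer:
    zero each entry independently with probability p_dropout."""
    mask = rng.binomial(n=1, p=1 - p_dropout, size=layer.shape)
    return layer * mask.astype(layer.dtype)

inpt = np.ones((10000, 4), dtype="float32")   # a large batch of constant activations
w = np.array([0.1, 0.2, 0.3, 0.4], dtype="float32")

# Training path: drop inputs at random, then take the weighted sum.
train_z = dropout_layer_np(inpt, p_dropout, rng).dot(w)
# Test path: keep every input, but scale the weighted sum by (1 - p_dropout).
test_z = (1 - p_dropout) * inpt.dot(w)

print(train_z.mean())   # close to 0.5 on average
print(test_z.mean())    # about 0.5
```

On average the two paths give the same weighted input, which is what the (1-p_dropout) factor is meant to achieve.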

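The code for ConvPoolLayer and SoftmaxLayer is similar, so we will not go through all of it here. The complete code of network3.py is given below, and interested readers can read it on their own.

Some details are still worth noting. For ConvPoolLayer and SoftmaxLayer, the output has to be computed according to the corresponding formulas. Fortunately, Theano provides built-in ops for convolution, max-pooling, the softmax function, and so on.

When implementing the softmax layer, we have not discussed how to initialize the weights and biases. We discussed earlier how to initialize the parameters of sigmoid layers, but those methods are not necessarily suitable for a softmax layer. Here they are simply initialized to 0. This looks arbitrary, but in practice it turns out not to cause much trouble.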
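As a small illustration of what zero initialization means for a softmax layer, the NumPy sketch below (not the Theano SoftmaxLayer; the sizes and the softmax helper are assumptions for illustration) shows that the layer starts out predicting a uniform distribution over the classes.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

n_in, n_out, batch = 100, 10, 3
w = np.zeros((n_in, n_out))   # zero-initialized weights
b = np.zeros(n_out)           # zero-initialized biases
x = np.random.RandomState(0).rand(batch, n_in)

probs = softmax(x.dot(w) + b)
print(probs[0])   # every class gets probability 0.1
```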

7.2 The ConvPoolLayer class

7.2.1 __init__

    def __init__(self, filter_shape, image_shape, poolsize=(2, 2),
                 activation_fn=sigmoid):
        self.filter_shape = filter_shape
        self.image_shape = image_shape
        self.poolsize = poolsize
        self.activation_fn = activation_fn
        # initialize weights and biases
        n_out = (filter_shape[0]*np.prod(filter_shape[2:])/np.prod(poolsize))
        self.w = theano.shared(
            np.asarray(
                np.random.normal(loc=0, scale=np.sqrt(1.0/n_out), size=filter_shape),
                dtype=theano.config.floatX),
            borrow=True)
        self.b = theano.shared(
            np.asarray(
                np.random.normal(loc=0, scale=1.0, size=(filter_shape[0],)),
                dtype=theano.config.floatX),
            borrow=True)
        self.params = [self.w, self.b]

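First, the parameters.

1. filter_shape (num_filter, input_feature_map, filter_width, filter_height)

These are the filter parameters: the first is the number of filters in this layer, the second is the number of input feature maps, the third is the filter width, and the fourth is the filter height.

2. image_shape (mini_batch, input_feature_map, width, height)

These are the parameters of the input image: the first is the mini-batch size; the second is the number of input feature maps, which must equal the second entry of filter_shape; the third is the width of the input image, and the fourth is its height.

3. poolsize

The width and height of the pooling window, 2*2 by default.

4. activation_fn

The activation function, sigmoid by default.

Apart from saving these parameters, the code defines the shared variables w and b and stores them in self.params.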

7.2.2 set_inpt

    def set_inpt(self, inpt, inpt_dropout, mini_batch_size):
        self.inpt = inpt.reshape(self.image_shape)
        conv_out = conv.conv2d(
            input=self.inpt, filters=self.w, filter_shape=self.filter_shape,
            image_shape=self.image_shape)
        pooled_out = downsample.max_pool_2d(
            input=conv_out, ds=self.poolsize, ignore_border=True)
        self.output = self.activation_fn(
            pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
        self.output_dropout = self.output  # no dropout in the convolutional layers

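Let's go through it line by line.

1. Reshape the input

2. Convolution

The conv2d op provided by Theano computes the convolution.

3. Max-pooling

max_pool_2d, also provided by Theano, defines pooled_out.

4. Apply the activation function

The dimshuffle call deserves attention. pooled_out has shape (batch_size, num_filter, out_width, out_height), and b is a vector of length num_filter. We want broadcasting to add the bias of each filter to every element of the corresponding feature map in pooled_out, so dimshuffle is used to turn b into a tensor of shape (1, num_filter, 1, 1). The argument 'x' of dimshuffle adds a new broadcastable dimension, and the number 0 refers to dimension 0 of the original tensor. dimshuffle('x', 0, 'x', 'x') therefore inserts one dimension in front of the original vector and two dimensions after it, which gives a tensor of shape (1, num_filter, 1, 1).

5. output_dropout

The convolutional layer has no dropout, so output and output_dropout are the same symbolic variable.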
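The effect of dimshuffle('x', 0, 'x', 'x') can be reproduced with NumPy broadcasting. The sketch below is only a NumPy analogue with made-up sizes, not the Theano op itself.

```python
import numpy as np

batch_size, num_filter, out_w, out_h = 2, 3, 4, 4
pooled_out = np.zeros((batch_size, num_filter, out_w, out_h))
b = np.array([1.0, 2.0, 3.0])          # one bias per filter, shape (3,)

# NumPy counterpart of b.dimshuffle('x', 0, 'x', 'x'): shape (1, 3, 1, 1).
b4 = b[np.newaxis, :, np.newaxis, np.newaxis]

out = pooled_out + b4                  # broadcast over batch, width and height
print(b4.shape)          # (1, 3, 1, 1)
print(out[0, :, 0, 0])   # [1. 2. 3.]
```

Every position of feature map k receives the same bias b[k], for every example in the batch.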

7.3 The Network class

7.3.1 __init__

    def __init__(self, layers, mini_batch_size):
        self.layers = layers
        self.mini_batch_size = mini_batch_size
        self.params = [param for layer in self.layers for param in layer.params]
        self.x = T.matrix("x")
        self.y = T.ivector("y")
        init_layer = self.layers[0]
        init_layer.set_inpt(self.x, self.x, self.mini_batch_size)
        for j in xrange(1, len(self.layers)):
            prev_layer, layer = self.layers[j-1], self.layers[j]
            layer.set_inpt(
                prev_layer.output, prev_layer.output_dropout, self.mini_batch_size)
        self.output = self.layers[-1].output
        self.output_dropout = self.layers[-1].output_dropout

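The layers argument holds all the layers of the network.

For example, the code below defines a three-layer network: a convolution-pooling layer, a fully connected layer, and a softmax output layer. The input is a batch of MNIST images of shape mini_batch_size*1*28*28. The output of the convolution is mini_batch_size*20*24*24, and after pooling it is mini_batch_size*20*12*12. A fully connected layer follows, whose input is the pooling output of size 20*12*12 and whose output has size 100. Last comes a softmax layer with 100 inputs and 10 outputs.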

    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2)),
        FullyConnectedLayer(n_in=20*12*12, n_out=100),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
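The shape arithmetic behind these numbers can be sketched in plain Python. conv_pool_output_shape is a hypothetical helper written for this illustration, not a function of network3.py. It assumes a "valid" convolution with stride 1 and non-overlapping pooling that ignores incomplete windows.

```python
def conv_pool_output_shape(image_shape, filter_shape, poolsize):
    """Output shape of a 'valid' convolution followed by non-overlapping max-pooling."""
    batch, in_maps, width, height = image_shape
    num_filter, filter_in_maps, f_width, f_height = filter_shape
    assert in_maps == filter_in_maps, "input feature maps must match"
    # A valid convolution shrinks each side by (filter size - 1).
    conv_w, conv_h = width - f_width + 1, height - f_height + 1
    # ignore_border=True drops any incomplete pooling window, hence floor division.
    return (batch, num_filter, conv_w // poolsize[0], conv_h // poolsize[1])

mini_batch_size = 10
shape = conv_pool_output_shape((mini_batch_size, 1, 28, 28), (20, 1, 5, 5), (2, 2))
n_in = shape[1] * shape[2] * shape[3]   # n_in of the following FullyConnectedLayer
print(shape)  # (10, 20, 12, 12)
print(n_in)   # 2880
```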

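First, layers and mini_batch_size are saved.

The line self.params = [param for layer in ...] gathers the parameters of all layers into one list. The Network.SGD method uses self.params to update all the parameters. self.x = T.matrix("x") and self.y = T.ivector("y") define the Theano symbolic variables x and y, which represent the input and the desired output (the labels) of the whole network.

First we call set_inpt on init_layer: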

    init_layer = self.layers[0]
    init_layer.set_inpt(self.x, self.x, self.mini_batch_size)

This calls set_inpt on the first layer. Both inpt and inpt_dropout are self.x, because the input of the first layer is x in training and in testing alike.

Then, starting from the second layer:

    for j in xrange(1, len(self.layers)):
        prev_layer, layer = self.layers[j-1], self.layers[j]
        layer.set_inpt(
            prev_layer.output, prev_layer.output_dropout, self.mini_batch_size)

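We take the previous layer prev_layer and the current layer layer, then call layer.set_inpt, passing the output and output_dropout of the previous layer as the inpt and inpt_dropout of the current layer.

Finally, the output and output_dropout of the whole network are defined: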

    self.output = self.layers[-1].output
    self.output_dropout = self.layers[-1].output_dropout

7.3.2 The SGD function

    # Note: this is Python 2 code (xrange, integer division with "/").
    def SGD(self, training_data, epochs, mini_batch_size, eta,
            validation_data, test_data, lmbda=0.0):
        """Train the network using mini-batch stochastic gradient descent."""
        training_x, training_y = training_data
        validation_x, validation_y = validation_data
        test_x, test_y = test_data

        # compute number of minibatches for training, validation and testing
        num_training_batches = size(training_data)/mini_batch_size
        num_validation_batches = size(validation_data)/mini_batch_size
        num_test_batches = size(test_data)/mini_batch_size

        # define the (regularized) cost function, symbolic gradients, and updates
        l2_norm_squared = sum([(layer.w**2).sum() for layer in self.layers])
        cost = self.layers[-1].cost(self)+\
               0.5*lmbda*l2_norm_squared/num_training_batches
        grads = T.grad(cost, self.params)
        updates = [(param, param-eta*grad)
                   for param, grad in zip(self.params, grads)]

        # define functions to train a mini-batch, and to compute the
        # accuracy in validation and test mini-batches.
        i = T.lscalar()  # mini-batch index
        train_mb = theano.function(
            [i], cost, updates=updates,
            givens={
                self.x:
                training_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
                self.y:
                training_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
            })
        validate_mb_accuracy = theano.function(
            [i], self.layers[-1].accuracy(self.y),
            givens={
                self.x:
                validation_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
                self.y:
                validation_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
            })
        test_mb_accuracy = theano.function(
            [i], self.layers[-1].accuracy(self.y),
            givens={
                self.x:
                test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
                self.y:
                test_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
            })
        self.test_mb_predictions = theano.function(
            [i], self.layers[-1].y_out,
            givens={
                self.x:
                test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
            })

        # Do the actual training
        best_validation_accuracy = 0.0
        for epoch in xrange(epochs):
            for minibatch_index in xrange(num_training_batches):
                iteration = num_training_batches*epoch+minibatch_index
                if iteration % 1000 == 0:
                    print("Training mini-batch number {0}".format(iteration))
                cost_ij = train_mb(minibatch_index)
                if (iteration+1) % num_training_batches == 0:
                    validation_accuracy = np.mean(
                        [validate_mb_accuracy(j) for j in xrange(num_validation_batches)])
                    print("Epoch {0}: validation accuracy {1:.2%}".format(
                        epoch, validation_accuracy))
                    if validation_accuracy >= best_validation_accuracy:
                        print("This is the best validation accuracy to date.")
                        best_validation_accuracy = validation_accuracy
                        best_iteration = iteration
                        if test_data:
                            test_accuracy = np.mean(
                                [test_mb_accuracy(j) for j in xrange(num_test_batches)])
                            print('The corresponding test accuracy is {0:.2%}'.format(
                                test_accuracy))
        print("Finished training network.")
        print("Best validation accuracy of {0:.2%} obtained at iteration {1}".format(
            best_validation_accuracy, best_iteration))
        print("Corresponding test accuracy of {0:.2%}".format(test_accuracy))

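With the Theano basics covered earlier and the LogisticRegression implementation behind us, reading SGD should be fairly easy.

The code looks long, but its logic is clear and simple. A brief walkthrough follows.

1. Define the cost function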

    l2_norm_squared = sum([(layer.w**2).sum() for layer in self.layers])
    cost = self.layers[-1].cost(self)+\
           0.5*lmbda*l2_norm_squared/num_training_batches

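Besides the cost of the last layer, we also add the L2 regularization term. It is the sum of the squares of all the weights w (no square root is taken), multiplied by 0.5*lmbda and divided by num_training_batches. Note that in self.layers[-1].cost(self) the argument passed is the Network object [the first parameter self of the cost function is the object reference, which the caller does not pass explicitly; here the Network object itself (self) is passed as the net parameter of cost].

Below is the cost function of SoftmaxLayer: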

    def cost(self, net):
        "Return the log-likelihood cost."
        return -T.mean(T.log(self.output_dropout)[T.arange(net.y.shape[0]), net.y])

In fact net is only used to reach net.y, so cost could also be defined as follows:

    def cost(self, y):
        "Return the log-likelihood cost."
        return -T.mean(T.log(self.output_dropout)[T.arange(y.shape[0]), y])

and then called with

    cost = self.layers[-1].cost(self.y)+\
           0.5*lmbda*l2_norm_squared/num_training_batches

Personally I find this clearer.
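The indexing expression in cost and the regularization term can be traced with a small NumPy sketch. The softmax outputs, labels, weights, and hyperparameters below are made-up values for illustration. The code mirrors the Theano expressions but is not the Theano code.

```python
import numpy as np

# Softmax outputs for a mini-batch of 3 examples and 4 classes (made-up numbers).
output = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.2, 0.5, 0.2, 0.1],
                   [0.1, 0.1, 0.1, 0.7]])
y = np.array([0, 1, 3])                       # correct class of each example

# Same indexing trick as T.log(output)[T.arange(y.shape[0]), y]:
# row i, column y[i] picks the probability assigned to the correct class.
picked = output[np.arange(y.shape[0]), y]
log_likelihood_cost = -np.mean(np.log(picked))

# L2 term: sum of squared weights over all layers, with no square root.
weights = [np.ones((2, 2)), 2 * np.ones((3,))]   # stand-ins for layer.w
l2_norm_squared = sum((w ** 2).sum() for w in weights)
lmbda, num_training_batches = 0.1, 5
cost = log_likelihood_cost + 0.5 * lmbda * l2_norm_squared / num_training_batches

print(picked)            # [0.7 0.5 0.7]
print(l2_norm_squared)   # 16.0
```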

2. Define the gradients and updates

    grads = T.grad(cost, self.params)
    updates = [(param, param-eta*grad)
               for param, grad in zip(self.params, grads)]
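The update rule itself is ordinary gradient descent. A minimal NumPy sketch on a toy cost follows. The cost function, the starting values, and the hand-written gradient are assumptions made for illustration, whereas Theano derives the gradients symbolically with T.grad.

```python
import numpy as np

# Toy cost C = sum of w**2 over all parameters, whose gradient is 2*w.
params = [np.array([1.0, -2.0]), np.array([0.5])]
eta = 0.1

def grads_of(params):
    """Hand-written gradient of the toy cost with respect to each parameter."""
    return [2 * p for p in params]

for step in range(3):
    grads = grads_of(params)
    # Same rule as the (param, param - eta*grad) pairs in `updates`.
    params = [p - eta * g for p, g in zip(params, grads)]

print(params[0])  # approximately [0.512, -1.024]
```

Each step multiplies every parameter by (1 - 2*eta) = 0.8, so the parameters shrink towards the minimum at 0.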

3. Define the training function

    i = T.lscalar()  # mini-batch index
    train_mb = theano.function(
        [i], cost, updates=updates,
        givens={
            self.x:
            training_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
            self.y:
            training_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
        })

The input of train_mb is i and its output is cost. The x and y of the batch are specified through givens, exactly as in the LogisticRegression example of the earlier Theano tutorial. The cost uses the output_dropout of the last layer, so every layer follows the inpt_dropout -> output_dropout path of the computation graph.
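The slices that givens substitutes for self.x and self.y are plain index ranges. The NumPy sketch below uses a made-up data set of 25 examples, and get_batch is a hypothetical helper written for this illustration.

```python
import numpy as np

training_x = np.arange(50).reshape(25, 2)   # 25 examples with 2 features each
training_y = np.arange(25)
mini_batch_size = 10
# Only full mini-batches are used, so the last 5 examples are never visited.
num_training_batches = len(training_x) // mini_batch_size

def get_batch(i):
    """Rows that `givens` would substitute for self.x and self.y at index i."""
    lo, hi = i * mini_batch_size, (i + 1) * mini_batch_size
    return training_x[lo:hi], training_y[lo:hi]

x1, y1 = get_batch(1)
print(num_training_batches)  # 2
print(y1)                    # [10 11 12 13 14 15 16 17 18 19]
```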

4. Define the validation and test functions

    validate_mb_accuracy = theano.function(
        [i], self.layers[-1].accuracy(self.y),
        givens={
            self.x:
            validation_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
            self.y:
            validation_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
        })
    test_mb_accuracy = theano.function(
        [i], self.layers[-1].accuracy(self.y),
        givens={
            self.x:
            test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],
            self.y:
            test_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
        })

The output is the accuracy of the last layer, self.layers[-1].accuracy(self.y). accuracy uses the output of the last layer, so every layer follows the inpt -> output path of the computation graph.

5. The prediction function

    self.test_mb_predictions = theano.function(
        [i], self.layers[-1].y_out,
        givens={
            self.x:
            test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size]
        })

The output is the y_out of the last layer, that is, argmax(output) of the softmax.
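How y_out and accuracy relate can be seen in a few lines of NumPy. The output matrix and labels are made-up numbers, and the code mirrors T.argmax(self.output, axis=1) and T.mean(T.eq(y, self.y_out)) rather than calling Theano.

```python
import numpy as np

output = np.array([[0.1, 0.8, 0.1],
                   [0.6, 0.3, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.5, 0.4, 0.1]])
y = np.array([1, 0, 2, 1])                # true labels

y_out = np.argmax(output, axis=1)         # predicted class per example
accuracy = np.mean(y_out == y)            # fraction of correct predictions

print(y_out)     # [1 0 2 0]
print(accuracy)  # 0.75
```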

7.4 Usage

    training_data, validation_data, test_data = network3.load_data_shared()
    mini_batch_size = 10
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2)),
        FullyConnectedLayer(n_in=20*12*12, n_out=100),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
    net.SGD(training_data, 60, mini_batch_size, 0.1,
            validation_data, test_data)

This concludes our introduction to the basics of Theano and to implementing a CNN with it. The next article will show how to implement a CNN yourself in Python (numpy), along with some implementation details and performance optimizations. Most of that material comes from the CS231N slides and assignment2. Stay tuned.

Last edited: 2017.12.06 07:52:16

Copyright belongs to the author. For reprints or content partnerships, please contact the author.
