Paper reading notes: a classic paper, Visualizing and Understanding Convolutional Networks

Reading goals:

How does the paper visualize the network?
What did the authors learn about the network by visualizing it?

I. Introduction

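The related work section introduces two concepts: Visualization and Feature Generalization.
I previously wrote a post on feature visualization, and that approach is mentioned here. Its drawback is that it "requires a careful initialization and does not give any information about the unit's invariances."
Note the term "unit's invariances": literally, the invariance of a unit, which I read as the set of input changes that leave the unit's activation unchanged.
The authors then summarize what characterizes their own visualizations: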

they are not just crops of input images, but rather top-down projections that reveal structures within each patch that stimulate a particular feature map.

Feature generalization specifically refers to:
the generalization ability of convnet features

II. Approach

(1) The model

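The paper uses standard fully supervised convnet models.
Color 2D input -> probabilities over C classes
Each layer consists of:
(i) a convolution layer
(ii) a relu layer: a rectified linear function (relu(x) = max(x, 0))
(iii) [optionally] a max pooling layer
(iv) [optionally] local normalization: a local contrast operation that normalizes the responses across feature maps
The top few layers are conventional fully connected layers, and the final layer is a softmax classifier.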
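
To make the layer structure above concrete, here is a minimal NumPy sketch of one convnet layer (convolution, relu, max pooling) followed by a fully connected softmax classifier. The function names, input size, filter count, and the non-overlapping 2x2 pooling are all assumptions chosen for illustration, and the local contrast normalization step is left out; this is not the architecture from the paper.

```python
import numpy as np

def conv2d_valid(x, w, b):
    """Cross-correlate x (C_in, H, W) with filters w (C_out, C_in, k, k), no padding."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]                 # (C_in, k, k)
            out[:, i, j] = np.tensordot(w, patch, axes=3) + b
    return out

def relu(x):
    return np.maximum(x, 0)

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling on (C, H, W); H and W must be divisible by s."""
    c, h, w = x.shape
    return x.reshape(c, h // s, s, w // s, s).max(axis=(2, 4))

def softmax(z):
    e = np.exp(z - z.max())                                # subtract max for stability
    return e / e.sum()

# Hypothetical toy sizes: 3-channel 10x10 image, 4 filters of size 3x3, C = 5 classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 10, 10))
w = rng.standard_normal((4, 3, 3, 3)) * 0.1
b = np.zeros(4)
w_fc = rng.standard_normal((5, 4 * 4 * 4)) * 0.1

feat = max_pool(relu(conv2d_valid(x, w, b)))               # (4, 8, 8) -> (4, 4, 4)
probs = softmax(w_fc @ feat.ravel())                       # probabilities over C classes
```

The output `probs` is a length-C vector that sums to 1, matching the "color 2D input -> probabilities over C classes" description.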

(2) Training procedure

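Training set: N labeled images {x, y}
Loss function: cross-entropy loss function
It compares the prediction ŷi with the true label yi.
Description of the training procedure:
(This sentence is well written, so I simply copied it over.)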

The parameters of the network (filters in the convolutional layers, weight matrices in the fully-connected layers and biases) are trained by back-propagating the derivative of the loss with respect to the parameters throughout the network, and updating the parameters via stochastic gradient descent.
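
As a rough illustration of the training description above, the sketch below computes a cross-entropy loss and applies stochastic gradient descent updates to a single softmax layer. The helper names (`sgd_step`, `cross_entropy`), the learning rate, and the toy data are assumptions; in a full convnet the same gradient would be back-propagated further through the convolutional layers.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(probs, y):
    """Loss comparing predicted probabilities with the true class index y."""
    return -np.log(probs[y])

def sgd_step(W, b, x, y, lr=0.1):
    """One SGD update on a single example; modifies W and b in place."""
    probs = softmax(W @ x + b)
    loss = cross_entropy(probs, y)
    dlogits = probs.copy()
    dlogits[y] -= 1.0                  # d(loss)/d(logits) = probs - one_hot(y)
    W -= lr * np.outer(dlogits, x)     # d(loss)/dW
    b -= lr * dlogits                  # d(loss)/db
    return loss

# Hypothetical toy problem: 4 input features, 3 classes, one training example.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4)) * 0.01
b = np.zeros(3)
x = rng.standard_normal(4)
y = 2
losses = [sgd_step(W, b, x, y) for _ in range(20)]
```

Repeating the update on the same example drives the loss down, which is the behavior the quoted sentence describes at the scale of a whole network.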

(3) Visualization method

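Goal: understand the feature activity in the intermediate layers.
Summary of the approach: map these activities back to the input pixel space, using a deconvolutional network (deconvnet) to perform the mapping.
A deconvnet can be understood as a convnet run in reverse: it uses the same components (filtering, pooling), but maps features back to pixels.
Details:
A deconvnet is attached to each layer of the convnet.
To examine a given convnet activation, we set all other activations in that layer to 0 and pass the feature maps as input to the attached deconvnet layer.
We then successively:
(i) unpool:
Max pooling is not actually invertible, but we record the location of the maximum within each pooling region in a set of "switch" variables. In the deconvnet, the unpooling operation uses these "switches" to place the reconstructions from the layer above into the appropriate locations.
(ii) rectify: relu
This ensures the feature maps are always positive.
(iii) filter:
This step is the reverse of the convolution, and only approximately inverts it.
It uses the transposed versions of the same filters, applied to the rectified maps rather than to the output of the layer beneath.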

In practice this means flipping each filter vertically and horizontally.
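
The flipping statement can be checked numerically. The sketch below, for a single channel and with hypothetical function names, applies the transpose of a valid cross-correlation by zero-padding and correlating with the filter flipped vertically and horizontally, then verifies the adjoint identity <A x, y> = <x, A^T y>. It illustrates the linear algebra only and is not the authors' code.

```python
import numpy as np

def correlate_valid(x, w):
    """Forward filtering: valid cross-correlation of x (H, W) with w (k, k)."""
    k = w.shape[0]
    h, wd = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h, wd))
    for i in range(h):
        for j in range(wd):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def transposed_filter(y, w):
    """Transpose of correlate_valid(., w): zero-pad by k-1, correlate with the flipped filter."""
    k = w.shape[0]
    flipped = w[::-1, ::-1]            # flip vertically and horizontally
    padded = np.pad(y, k - 1)          # constant zero padding on all sides
    return correlate_valid(padded, flipped)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
w = rng.standard_normal((3, 3))
y = rng.standard_normal((4, 4))        # same shape as correlate_valid(x, w)

lhs = np.sum(correlate_valid(x, w) * y)    # <A x, y>
rhs = np.sum(x * transposed_filter(y, w))  # <x, A^T y>
```

If the two inner products agree, `transposed_filter` really is the transpose of the forward filtering step, and it maps a feature map back to the spatial size of the layer's input.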

These three steps are repeated until input pixel space is reached.
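
The switch mechanism in step (i) can be sketched as follows. This is a minimal single-channel NumPy example with assumed function names and non-overlapping 2x2 pooling regions; it also shows the "keep one activation, zero the rest" idea and the rectify step (ii).

```python
import numpy as np

def max_pool_with_switches(x, s=2):
    """Pool x (H, W) over s x s regions; switches hold the argmax position inside each region."""
    h, w = x.shape
    regions = (x.reshape(h // s, s, w // s, s)
                .transpose(0, 2, 1, 3)
                .reshape(h // s, w // s, s * s))
    return regions.max(axis=2), regions.argmax(axis=2)

def unpool(pooled, switches, s=2):
    """Place each pooled value back at the location recorded by its switch; zeros elsewhere."""
    hp, wp = pooled.shape
    regions = np.zeros((hp, wp, s * s))
    ii, jj = np.indices((hp, wp))
    regions[ii, jj, switches] = pooled
    return (regions.reshape(hp, wp, s, s)
                   .transpose(0, 2, 1, 3)
                   .reshape(hp * s, wp * s))

x = np.array([[1., 2., 0., 0.],
              [3., 4., 0., 5.],
              [0., 0., 7., 0.],
              [6., 0., 0., 0.]])

pooled, switches = max_pool_with_switches(x)      # pooled == [[4, 5], [6, 7]]
full = unpool(pooled, switches)                   # maxima restored, other entries 0

# Keep only the strongest activation, zero the others, then unpool and rectify.
only_max = np.where(pooled == pooled.max(), pooled, 0.0)
recon = np.maximum(unpool(only_max, switches), 0)  # single 7 at position (2, 2)
```

The unpooled map is sparse: only the positions that won the max in the forward pass receive a value, which is why the reconstruction preserves the structure of the stimulus rather than spreading it over the whole region.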

Schematic diagram

Last edited: 2020.08.20 16:32:14
