Some notes prompted by looking into VGG16 and the other networks that ship with Keras.
To view the Keras documentation for VGG16, start with:
from keras.applications.vgg16 import VGG16
Then (in Jupyter Notebook, JupyterLab, or IPython) run:
? VGG16
This displays the usage help for VGG16:
Signature: VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
Docstring:
Instantiates the VGG16 architecture.
Optionally loads weights pre-trained on ImageNet. Note that when using TensorFlow, for best performance you should set `image_data_format='channels_last'` in your Keras config at ~/.keras/keras.json.
In short:
Weights pre-trained on ImageNet can optionally be loaded. When TensorFlow is the backend, set `image_data_format='channels_last'` in keras.json.
The model and the weights are compatible with both TensorFlow and Theano. The data format convention used by the model is the one specified in your Keras config file.
In short:
The model and the weight files work with both the TensorFlow and Theano backends, but the data format convention has to be set in the Keras config file (as above).
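For reference, a typical `~/.keras/keras.json` with this setting might look like the fragment below. The `image_data_format` key comes from the docstring; the other values are the usual defaults and are shown only as an assumption about a standard installation.

```json
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
```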
# Arguments (parameter overview):
include_top: whether to include the 3 fully-connected layers at the top of the network.
weights: one of `None` (random initialization), 'imagenet' (pre-training on ImageNet),
or the path to the weights file to be loaded.
input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
It should have exactly 3 input channels,
and width and height should be no smaller than 48.
E.g. `(200, 200, 3)` would be one valid value.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
# Returns
A Keras model instance.
# Raises
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
File: c:\anaconda3\lib\site-packages\keras-2.1.5-py3.6.egg\keras\applications\vgg16.py
Type: function
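To get a feel for what `include_top` switches on and off, the sketch below counts the parameters of the convolutional part and of the three fully-connected layers. It is plain arithmetic, not a Keras call, and it assumes the standard VGG16 configuration (3x3 convolutions with biases, RGB input, a 224x224 image, 1000 classes); the names `CONV_BLOCKS`, `DENSE_UNITS`, `conv_params` and `dense_params` are made up for this illustration.

```python
# Sketch: count VGG16 parameters with and without the top layers.
# Assumed layout: five conv blocks of 3x3 kernels, then three dense layers.

CONV_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]  # (layers, filters)
DENSE_UNITS = [4096, 4096, 1000]


def conv_params() -> int:
    """Parameters in the convolutional blocks (independent of image size)."""
    total, in_channels = 0, 3  # RGB input
    for num_layers, filters in CONV_BLOCKS:
        for _ in range(num_layers):
            total += 3 * 3 * in_channels * filters + filters  # kernel + bias
            in_channels = filters
    return total


def dense_params(flattened: int = 7 * 7 * 512) -> int:
    """Parameters in the three dense layers for a given flattened input length."""
    total, in_units = 0, flattened
    for units in DENSE_UNITS:
        total += in_units * units + units  # weights + bias
        in_units = units
    return total


print("without top:", conv_params())
print("with top:   ", conv_params() + dense_params())
```

Under these assumptions the fully-connected layers hold the large majority of the parameters, which is why dropping them with `include_top=False` gives a much smaller model.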
- include_top: boolean (True or False)
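Whether to include the fully-connected layers at the top of the network. VGGNet ends with three fully-connected layers, so this option decides whether those three layers are built. Most classification networks end with fully-connected layers; the last one fixes the number of classes and is followed by a probability conversion function (softmax). Softmax turns the raw outputs into per-class probabilities, which the loss is then computed from.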
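Put another way, the top three layers do the classification and the remaining layers do the feature extraction. So with include_top=False the network on its own can only extract features; to train or fine-tune a classifier you have to add your own classification layers on top of it.
- weights: None / 'imagenet' / path (to the weights file)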
None means no weights are given and the network parameters are randomly initialized.
'imagenet' means the weights pre-trained on ImageNet are loaded.
A path means the location of a weights file to load.
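The VGG16 architecture is fixed. The convolution kernels have the same shape whatever the input size, but the weight matrix of the first fully-connected layer is determined by the input size.
If weights = None, the input size can be chosen freely, as long as width and height are no smaller than 48 (the limit given in the docstring); with too small an input, the pooling layers leave the last convolutional block with no output.
If weights = 'imagenet' and include_top = True, the input size must be exactly (224, 224). Using the ImageNet weights means using the input size they were trained with, otherwise the input of the first fully-connected layer does not line up. (For example, if the last convolutional layer originally outputs 300 values and the fully-connected layer has 1000 neurons, the weight matrix there is 300x1000. Another input size is not guaranteed to give 300 convolutional outputs, so the shapes do not match and an error is raised.) With include_top = False there is no fully-connected layer to match, which is why the docstring allows other input shapes in that case.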
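The dependence on input size can be checked with a few lines of arithmetic. This is a minimal sketch, not Keras code: it assumes the standard VGG16 layout (five 2x2 max-pooling stages with stride 2, 512 feature maps after the last block, 4096 units in the first fully-connected layer), and the function names are invented for the example.

```python
# Sketch: how the input size fixes the shape of the first fully-connected layer.
# Assumed layout: 'same'-padded convolutions (no shrinking) and five 2x2 poolings.

NUM_POOL_STAGES = 5
LAST_CONV_CHANNELS = 512
FC1_UNITS = 4096


def conv_output_side(input_side: int) -> int:
    """Spatial side length after the pooling stages (each halves, rounding down)."""
    side = input_side
    for _ in range(NUM_POOL_STAGES):
        side //= 2
    return side


def fc1_weight_shape(input_side: int) -> tuple:
    """Shape of the first dense layer's weight matrix for a square input."""
    side = conv_output_side(input_side)
    flattened = side * side * LAST_CONV_CHANNELS
    return (flattened, FC1_UNITS)


for size in (224, 200, 48, 16):
    print(size, conv_output_side(size), fc1_weight_shape(size))
```

A 224x224 input leaves a 7x7 map and a (25088, 4096) weight matrix, while a 200x200 input leaves 6x6 and would need (18432, 4096), so weights saved for one cannot be loaded into the other. A 16x16 input is pooled down to nothing.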
If weights is a path, the input must match the input used when that weights file was trained.
- input_tensor: the image tensor to use as the model input.
- input_shape: tuple
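If include_top = False (the network is used for feature extraction), the input image size can be specified here. If include_top = True (the network is retrained or fine-tuned together with its classifier), the input size must either lie in the valid range (width and height no smaller than 48) or match the input used when the loaded weights were trained.
- pooling: only takes effect when include_top = False (the network is used for feature extraction).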
(This is purely my own understanding and may be wrong.)
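The output of the last convolutional layer is a 4D tensor, (batch, h', w', M) in the channels_last format, where h' and w' are the spatial size left after the convolutions and poolings and M is the number of feature maps. One way to picture it: the input to a convolutional layer is an (h, w, N) block with N channels. Each kernel spans all N channels and slides over (h, w), producing one 2D map of size (h', w'); M kernels give M such maps, and the batch axis makes the result 4D. Then:
pooling = None: the output features are left untouched and remain 4D.
pooling = 'avg': each (h', w') map is averaged down to a single number, so the output is a 2D tensor of shape (batch, M), as the docstring says.
pooling = 'max': the same, taking the maximum of each map instead of the average.
- classes: the number of classes to train for. Only to be specified when include_top = True and no 'weights' argument is given (that is, when training a new network).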
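The three `pooling` modes can be imitated with NumPy to check the output shapes the docstring describes. This is only a sketch: the random array stands in for the output of the last convolutional block, and its shape (2 images, 7x7 grid, 512 channels, channels_last) is an assumption for illustration.

```python
import numpy as np

# Hypothetical stand-in for the last conv block's output:
# batch of 2 images, 7x7 spatial grid, 512 channels (channels_last layout).
rng = np.random.default_rng(0)
features = rng.random((2, 7, 7, 512))

# pooling=None: the 4D tensor is returned unchanged.
no_pool = features

# pooling='avg': average over the two spatial axes, leaving (batch, channels).
avg_pool = features.mean(axis=(1, 2))

# pooling='max': maximum over the two spatial axes, same output shape.
max_pool = features.max(axis=(1, 2))

print(no_pool.shape, avg_pool.shape, max_pool.shape)
```

Both pooled results have shape (2, 512): one number per feature map per image, which is what makes them convenient as fixed-length feature vectors.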