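Abstract
MobileNet is a lightweight network architecture from Google aimed at mobile and embedded vision applications. Its main contribution is to replace the standard convolution with a depthwise convolution followed by a 1x1 (pointwise) convolution, a pairing known as a depthwise separable convolution, in order to build lightweight deep neural networks. The authors ran extensive experiments on the trade-off between resources and accuracy, and the network performs well against other well-known models on the ImageNet classification task.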
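Background
Common approaches to making networks smaller:
(1) Kernel factorization: replace an N×N kernel with a 1×N kernel and an N×1 kernel
(2) Bottleneck structures, with SqueezeNet as a representative example
(3) Storing weights as low-precision floating-point numbers, for example Deep Compression
(4) Pruning redundant kernels and Huffman coding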
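To make item (1) concrete, here is a minimal sketch that compares the weight count of a full N×N kernel with that of its 1×N plus N×1 factorization, for a single input/output channel pair. The function names are illustrative and do not come from the original article. Note that the factorized pair reproduces a full kernel exactly only when that kernel is rank 1, so in general it is an approximation.

```python
def full_kernel_weights(n):
    """Number of weights in one n x n kernel."""
    return n * n


def factorized_kernel_weights(n):
    """Number of weights in a 1 x n kernel followed by an n x 1 kernel."""
    return 2 * n


# The saving grows with the kernel size.
for n in (3, 5, 7):
    print(n, full_kernel_weights(n), factorized_kernel_weights(n))
```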
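Standard 3D image convolution
A standard 3D image convolution convolves a multi-channel image (assume M channels, M > 1) with N distinct K x K kernels, each of which has one K x K slice per input channel. This is what a convolutional layer in a deep neural network does. The procedure is:
(1) For one kernel, run a 2D convolution against each channel of the image; the M channels yield M results.
(2) Sum these M results element-wise to get a single map.
(3) Repeat steps (1)-(2) for each of the N kernels to get an N-channel result, i.e. the feature map.
The cost of this process, in multiply-adds for an H x W output, is: ((K x K x H x W) x M) x N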
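Steps (1)-(3) can be sketched in plain Python as follows. This is a small illustration under stated assumptions, not the article's implementation: it uses nested lists, stride 1, zero padding that keeps the output at H x W, an odd K, and the cross-correlation form of convolution that deep learning frameworks use. All function names are made up for this example.

```python
def conv2d_single(channel, kernel):
    """Step (1): correlate one H x W channel with one K x K slice (zero padding, stride 1)."""
    h, w, k = len(channel), len(channel[0]), len(kernel)
    pad = k // 2  # assumes K is odd
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # taps outside the image count as zero
                        acc += channel[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def standard_conv(image, kernels):
    """image: M channels of H x W; kernels: N filters, each holding M slices of K x K."""
    feature_map = []
    for filt in kernels:  # step (3): repeat for each of the N kernels
        per_channel = [conv2d_single(ch, sl) for ch, sl in zip(image, filt)]  # step (1)
        h, w = len(per_channel[0]), len(per_channel[0][0])
        summed = [[sum(r[i][j] for r in per_channel) for j in range(w)]
                  for i in range(h)]  # step (2): element-wise sum over the M results
        feature_map.append(summed)
    return feature_map


def standard_conv_cost(k, h, w, m, n):
    """Multiply-add count from the formula ((K x K x H x W) x M) x N."""
    return k * k * h * w * m * n


# Example: M = 2 channels of 4 x 4 ones, N = 3 kernels of 3 x 3 ones.
image = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
kernels = [[[[1.0] * 3 for _ in range(3)] for _ in range(2)] for _ in range(3)]
feature_map = standard_conv(image, kernels)
print(len(feature_map), standard_conv_cost(3, 4, 4, 2, 3))
```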
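Depthwise convolution
A depthwise convolution (depthwise conv) is essentially step (1) of the standard 3D convolution on its own: one K x K filter is convolved with each channel separately, giving M results that are not summed across channels. Its cost is (K x K x H x W) x M.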
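The per-channel behaviour can be sketched as below. This is an illustrative example rather than the article's code: it assumes nested lists, stride 1 and no padding (so each output is (H-K+1) x (W-K+1)), and the name depthwise_conv is hypothetical.

```python
def depthwise_conv(image, filters):
    """image: M channels of H x W; filters: M kernels of K x K, one per channel."""
    out = []
    for channel, kernel in zip(image, filters):
        h, w, k = len(channel), len(channel[0]), len(kernel)
        result = [
            [
                sum(channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(w - k + 1)
            ]
            for i in range(h - k + 1)
        ]
        out.append(result)  # each channel keeps its own result; nothing is summed across channels
    return out


# Two 3 x 3 channels and two 3 x 3 all-ones filters give two 1 x 1 outputs.
image = [[[1] * 3 for _ in range(3)], [[2] * 3 for _ in range(3)]]
filters = [[[1] * 3 for _ in range(3)] for _ in range(2)]
print(depthwise_conv(image, filters))
```

The number of output channels equals the number of input channels, which is why a 1x1 convolution is still needed afterwards to mix information between channels.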
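1x1 convolution
The 1x1 convolution first appeared in Network in Network. It is mainly used to compress channels, i.e. to change the number of channels in a feature map, and was later also used in place of fully connected layers to reduce computation.
For an image with M channels, the cost of N 1x1 convolutions is: ((1 x 1 x H x W) x M) x N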
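A 1x1 convolution amounts to taking, at every pixel, N weighted sums over the M input channels. The sketch below illustrates that view with nested lists; the names pointwise_conv and pointwise_cost are hypothetical and chosen only for this example.

```python
def pointwise_conv(image, weights):
    """image: M channels of H x W; weights: N rows, each with M scalars (one per input channel)."""
    h, w = len(image[0]), len(image[0][0])
    return [
        [[sum(wt * ch[i][j] for wt, ch in zip(row, image)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def pointwise_cost(h, w, m, n):
    """Multiply-add count from the formula ((1 x 1 x H x W) x M) x N."""
    return h * w * m * n


# M = 2 input channels mapped to N = 3 output channels.
image = [[[1, 2], [3, 4]], [[10, 20], [30, 40]]]
weights = [[1, 1], [0, 1], [2, 0]]
print(pointwise_conv(image, weights))
```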
The MobileNet approach
A standard convolution can be replaced by a depthwise convolution followed by a 1x1 convolution, whose combined cost is:
(K x K x H x W) x M + ((1 x 1 x H x W) x M) x N = H x W x (K x K + N) x M
Relative to the standard convolution, the cost ratio is:
(H x W x M x (K x K + N)) / (H x W x K x K x M x N) = 1/N + 1/(K x K)
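The ratio can be checked numerically with exact fractions. The sketch below uses illustrative function names, and the layer size in the example (K=3, a 14 x 14 map, M=N=512) is picked only as a sample input.

```python
from fractions import Fraction


def standard_cost(k, h, w, m, n):
    """Standard convolution: ((K x K x H x W) x M) x N."""
    return k * k * h * w * m * n


def separable_cost(k, h, w, m, n):
    """Depthwise convolution plus 1x1 convolution: H x W x (K x K + N) x M."""
    return k * k * h * w * m + h * w * m * n


def cost_ratio(k, h, w, m, n):
    """Exact ratio of the two costs; it should equal 1/N + 1/(K x K)."""
    return Fraction(separable_cost(k, h, w, m, n), standard_cost(k, h, w, m, n))


ratio = cost_ratio(3, 14, 14, 512, 512)
print(ratio, float(ratio))
```

H, W and M cancel out, so for 3x3 kernels the ratio approaches 1/9 as N grows, which is where most of the saving comes from.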
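Network architecture
MobileNet's structure is very simple. The first layer is a standard 3x3 convolution (stride=2). After that, a depthwise convolution plus a 1x1 convolution (conv_dw + conv1x1) serves as the basic unit, and a number of these units are chained together to form networks of different depths. Each convolution is followed by batch norm and a ReLU activation, and downsampling is done by setting stride=2 in conv_dw. At the end, average pooling takes the place of the large fully connected layers; for the ImageNet classification task, a single fully connected layer (units=1000) and a softmax output the classes and their confidence scores.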
MobileNet structure
--------------------------------------------------
layer    | kh x kw, out, s   | out size
--------------------------------------------------
input image (224 x 224 x 3)
--------------------------------------------------
conv     | 3x3, 32, 2        | 112x112x32
--------------------------------------------------
conv_dw  | 3x3, 32dw, 1      | 112x112x32
conv1x1  | 1x1, 64, 1        | 112x112x64
--------------------------------------------------
conv_dw  | 3x3, 64dw, 2      | 56x56x64
conv1x1  | 1x1, 128, 1       | 56x56x128
--------------------------------------------------
conv_dw  | 3x3, 128dw, 1     | 56x56x128
conv1x1  | 1x1, 128, 1       | 56x56x128
--------------------------------------------------
conv_dw  | 3x3, 128dw, 2     | 28x28x128
conv1x1  | 1x1, 256, 1       | 28x28x256
--------------------------------------------------
conv_dw  | 3x3, 256dw, 1     | 28x28x256
conv1x1  | 1x1, 256, 1       | 28x28x256
--------------------------------------------------
conv_dw  | 3x3, 256dw, 2     | 14x14x256
conv1x1  | 1x1, 512, 1       | 14x14x512
--------------------------------------------------
5x (the following unit is repeated 5 times)
conv_dw  | 3x3, 512dw, 1     | 14x14x512
conv1x1  | 1x1, 512, 1       | 14x14x512
--------------------------------------------------
conv_dw  | 3x3, 512dw, 2     | 7x7x512
conv1x1  | 1x1, 1024, 1      | 7x7x1024
--------------------------------------------------
conv_dw  | 3x3, 1024dw, 1    | 7x7x1024
conv1x1  | 1x1, 1024, 1      | 7x7x1024
--------------------------------------------------
Avg Pool | 7x7, 1            | 1x1x1024
FC       | 1024, 1000        | 1x1x1000
Softmax  | Classifier        | 1x1x1000
--------------------------------------------------
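The output sizes in the table can be traced with a few lines of Python. This is a sketch under one assumption: every convolution uses "SAME" padding, for which the output size is ceil(input size / stride). The UNITS list and trace_shapes name are illustrative, with each depthwise plus 1x1 pair collapsed into one entry.

```python
import math

# (unit name, output channels, stride of the 3x3 conv), read off the structure table.
UNITS = (
    [("conv", 32, 2)]
    + [("dw+1x1", 64, 1), ("dw+1x1", 128, 2), ("dw+1x1", 128, 1),
       ("dw+1x1", 256, 2), ("dw+1x1", 256, 1), ("dw+1x1", 512, 2)]
    + [("dw+1x1", 512, 1)] * 5
    + [("dw+1x1", 1024, 2), ("dw+1x1", 1024, 1)]
)


def trace_shapes(size=224, units=UNITS):
    """Return (name, spatial size, channels) after each unit, assuming SAME padding."""
    shapes = []
    for name, out_channels, stride in units:
        size = math.ceil(size / stride)  # SAME padding: only the stride shrinks the map
        shapes.append((name, size, out_channels))
    return shapes


for name, size, channels in trace_shapes():
    print("%-7s -> %dx%dx%d" % (name, size, size, channels))
```

The five stride-2 steps take the 224 x 224 input down to 7 x 7, which is why the final average pool uses a 7x7 window.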
Code implementation
This article builds the MobileNet structure with the tensorflow.contrib.layers module. For building networks with the tf.nn, tf.layers and similar APIs, see the corresponding code in the VGG network article.
# --------------------------Method 1 --------------------------------------------
import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x
import tensorflow.contrib.layers as tcl
from tensorflow.contrib.framework import arg_scope


class Mobilenet:
    def __init__(self, resolution_inp=224, channel=3, name='mobilenet'):
        self.name = name
        self.channel = channel
        self.resolution_inp = resolution_inp

    def _depthwise_separable_conv(self, x, num_outputs, kernel_size=3, stride=1, scope=None):
        """Basic unit: depthwise conv (conv_dw) followed by a 1x1 conv."""
        with tf.variable_scope(scope, "dw_blk"):
            # num_outputs=None skips the pointwise step, leaving a pure depthwise conv
            dw_conv = tcl.separable_conv2d(x, num_outputs=None,
                                           kernel_size=kernel_size,
                                           stride=stride,
                                           depth_multiplier=1)
            conv_1x1 = tcl.conv2d(dw_conv, num_outputs=num_outputs, kernel_size=1, stride=1)
            return conv_1x1

    def __call__(self, x, dropout=0.5, is_training=True):
        with tf.variable_scope(self.name) as scope:
            with arg_scope([tcl.batch_norm], is_training=is_training, scale=True):
                # every conv below gets batch norm + ReLU and SAME padding
                with arg_scope([tcl.conv2d, tcl.separable_conv2d],
                               activation_fn=tf.nn.relu,
                               normalizer_fn=tcl.batch_norm,
                               padding="SAME"):
                    conv1 = tcl.conv2d(x, 32, kernel_size=3, stride=2)
                    y = self._depthwise_separable_conv(conv1, 64, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 128, 3, stride=2)
                    y = self._depthwise_separable_conv(y, 128, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 256, 3, stride=2)
                    y = self._depthwise_separable_conv(y, 256, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 512, 3, stride=2)
                    # 5 x (512 -> 512) at 14x14
                    y = self._depthwise_separable_conv(y, 512, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 512, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 512, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 512, 3, stride=1)
                    y = self._depthwise_separable_conv(y, 512, 3, stride=1)
                    # the last two units output 1024 channels, as in the structure table
                    y = self._depthwise_separable_conv(y, 1024, 3, stride=2)
                    y = self._depthwise_separable_conv(y, 1024, 3, stride=1)
                    avg_pool = tcl.avg_pool2d(y, 7, stride=1)
                    flatten = tf.layers.flatten(avg_pool)
                    # logits: no activation before the softmax
                    self.fc6 = tf.layers.dense(flatten, units=1000, activation=None)
                    # dropout = tf.nn.dropout(fc6, keep_prob=0.5)
                    predictions = tf.nn.softmax(self.fc6)
                    return predictions
Running
This part of the code has two pieces. The timing function time_tensorflow_run takes a tf.Session, the tensor to evaluate, the corresponding feed dict and an info string to print, and measures the time needed to run that tensor 100 times (mean and standard deviation). The main function run_benchmark builds the Mobilenet model and times both inference (predict) and gradient computation (the backward pass). The full code is as follows:
# -------------------------- Demo and Test --------------------------------------------
import math
import time
from datetime import datetime

import tensorflow as tf  # TensorFlow 1.x

batch_size = 16
num_batches = 100


def time_tensorflow_run(session, target, feed, info_string):
    """
    calculate time for each session run
    :param session: tf.Session
    :param target: operator or tensor to run with the session
    :param feed: feed dict for session
    :param info_string: info message for print
    :return:
    """
    num_steps_burn_in = 10  # warm-up steps
    total_duration = 0.0  # total time
    total_duration_squared = 0.0  # sum of squared times, used for the variance
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target, feed_dict=feed)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:  # only count the time after the warm-up steps
            if not i % 10:
                print('[%s] step %d, duration = %.3f' % (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration
    mn = total_duration / num_batches  # mean time per batch
    vr = total_duration_squared / num_batches - mn * mn  # variance
    sd = math.sqrt(max(vr, 0.0))  # standard deviation (clamped against rounding error)
    print('[%s] %s across %d steps, %.3f +/- %.3f sec/batch' % (datetime.now(), info_string, num_batches, mn, sd))


# test demo
def run_benchmark():
    """
    main function for test or demo
    :return:
    """
    with tf.Graph().as_default():
        image_size = 224  # input image size
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype=tf.float32, stddev=1e-1))
        model = Mobilenet(224, 3)
        prediction = model(images, is_training=True)
        fc = model.fc6
        params = tf.trainable_variables()
        for v in params:
            print(v)
        init = tf.global_variables_initializer()
        print("out shape ", prediction)
        sess = tf.Session()
        print("init...")
        sess.run(init)
        print("predict..")
        writer = tf.summary.FileWriter("./logs")
        writer.add_graph(sess.graph)
        time_tensorflow_run(sess, prediction, {}, "Forward")
        # simulate the training process
        objective = tf.nn.l2_loss(fc)  # a stand-in loss
        grad = tf.gradients(objective, params)  # gradients of the loss w.r.t. all model parameters
        print('grad backward')
        time_tensorflow_run(sess, grad, {}, "Forward-backward")
        writer.close()
        sess.close()


if __name__ == '__main__':
    run_benchmark()
Note: the complete code can be found in the author's GitHub project.
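The statistics in the timing function do not depend on TensorFlow. The sketch below isolates them so they can be tried on any callable: it keeps a running sum and sum of squares, then derives the mean and standard deviation from them. The names timing_stats and time_callable are hypothetical, and time.perf_counter is used here instead of time.time as a design choice for this example.

```python
import math
import time


def timing_stats(durations):
    """Mean and standard deviation from the sum and the sum of squares of the durations."""
    n = len(durations)
    total = sum(durations)
    total_squared = sum(d * d for d in durations)
    mean = total / n
    variance = total_squared / n - mean * mean  # E[x^2] - (E[x])^2
    return mean, math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


def time_callable(fn, num_batches=100, burn_in=10):
    """Call fn repeatedly, discard the warm-up calls, and summarize the rest."""
    durations = []
    for i in range(num_batches + burn_in):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if i >= burn_in:  # only count the calls after the warm-up
            durations.append(elapsed)
    return timing_stats(durations)


mean, sd = time_callable(lambda: sum(range(1000)), num_batches=20, burn_in=5)
print("%.6f +/- %.6f sec/call" % (mean, sd))
```

The warm-up calls are discarded because the first few runs typically include one-off costs such as memory allocation and caching.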
Parameter count
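As a rough cross-check, the weights can be tallied directly from the structure table in the network architecture section. The sketch below is an estimate under stated assumptions, not a measurement from the article's code: it counts only convolution weights plus the final fully connected layer (weights and biases) and leaves out batch norm parameters. The UNITS list and count_weights name are illustrative.

```python
# (input channels, output channels) of each depthwise + 1x1 unit, read off the structure table.
UNITS = (
    [(32, 64), (64, 128), (128, 128), (128, 256), (256, 256), (256, 512)]
    + [(512, 512)] * 5
    + [(512, 1024), (1024, 1024)]
)


def count_weights(k=3, image_channels=3, first_out=32, units=UNITS, fc_in=1024, classes=1000):
    """Count conv and FC weights; batch norm parameters are deliberately excluded."""
    total = k * k * image_channels * first_out  # first standard 3x3 convolution
    for m, n in units:
        total += k * k * m  # depthwise: one K x K filter per input channel
        total += m * n      # 1x1 convolution: M x N weights
    total += fc_in * classes + classes  # fully connected weights plus biases
    return total


print(count_weights())
```

Under these assumptions the total comes to about 4.2 million, with the 1x1 convolutions and the fully connected layer accounting for nearly all of it.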
Time efficiency
References
https://blog.csdn.net/wfei101/article/details/78310226
https://blog.csdn.net/u013709270/article/details/78722985
1x1 convolution