The deep residual shrinkage network is in fact a kind of convolutional neural network, a variant of the deep residual network (ResNet). Its core idea is that, when features are being learned with deep learning, removing redundant information is very important, because raw data often contain a great deal of information that is irrelevant to the task at hand; soft thresholding is a highly flexible way of removing this redundant information.
1. Deep Residual Networks
Let us start with the deep residual network. The figure below shows the basic building block of a deep residual network, which consists of several nonlinear layers (the residual path) and a cross-layer identity connection. The identity connection is the core of the deep residual network and one of the guarantees of its excellent performance.
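In equation form, the block computes y = F(x) + x, where F(x) is the output of the residual path. As a minimal sketch in the same Keras functional style as the code below (the layer widths here are illustrative, not taken from the figure):

import keras
from keras.layers import Input, Conv2D, Activation

x = Input(shape=(28, 28, 8))
shortcut = x                              # cross-layer identity connection
fx = Conv2D(8, 3, padding='same')(x)      # residual path: nonlinear layers
fx = Activation('relu')(fx)
fx = Conv2D(8, 3, padding='same')(fx)
y = keras.layers.add([fx, shortcut])      # y = F(x) + x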
2. Deep Residual Shrinkage Networks
A deep residual shrinkage network is a deep residual network whose residual paths are subjected to shrinkage, where "shrinkage" refers to soft thresholding.
Soft thresholding is the core step of many signal denoising methods. It sets features that are close to zero (that is, whose absolute value is below some threshold τ) to zero; in other words, features within the interval [-τ, τ] are set to zero, and the remaining features, those farther from zero, are also shrunk toward zero.
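Written as a formula, soft thresholding maps a feature x to sign(x) · max(|x| − τ, 0). A minimal NumPy sketch (for illustration only; the Keras code below builds the same operation out of layers):

import numpy as np

def soft_threshold(x, tau):
    # Zero out features whose absolute value is below tau,
    # and shrink the remaining features toward zero by tau
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

print(soft_threshold(np.array([-1.5, -0.2, 0.0, 0.3, 2.0]), 0.5))
# -> [-1.  -0.   0.   0.   1.5]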
If this is considered together with the bias b of the preceding convolutional layer, the interval that is set to zero becomes [-τ+b, τ+b]. Since both τ and b are parameters that can be learned automatically, soft thresholding can, from this point of view, set the features in an arbitrary interval to zero. It is therefore a more flexible way of deleting features within a particular value range, and can also be understood as a more flexible nonlinear mapping.
From another perspective, the preceding two convolutional layers, two batch normalization layers, and two activation functions transform the features of redundant information into values close to zero, and transform useful features into values far from zero. A set of thresholds is then learned automatically, and soft thresholding is used to remove the redundant features while keeping the useful ones.
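Concretely, in the code below each threshold is derived from the feature map itself: with A denoting the global average of the absolute feature map |x|, a small fully connected sub-network outputs a sigmoid-gated scaling coefficient α in (0, 1), and the threshold is set to τ = α × A, one per channel. This keeps every threshold positive and smaller than the average absolute value of the features, so soft thresholding never zeroes out an entire feature map.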
By stacking a number of these basic modules, a complete deep residual shrinkage network is obtained, as shown in the figure below:
3. Image Recognition and Keras Programming
Although the deep residual shrinkage network was originally applied to fault diagnosis based on vibration signals, it is in fact a general-purpose feature learning method, and it is likely to be of some use in many tasks (computer vision, speech, text).
Below is the code for MNIST handwritten digit recognition based on a deep residual shrinkage network (the code is quite simple and is provided for reference only):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Dec 28 23:24:05 2019
Implemented using TensorFlow 1.0.1 and Keras 2.2.1
M. Zhao, S. Zhong, X. Fu, et al., Deep Residual Shrinkage Networks for Fault Diagnosis,
IEEE Transactions on Industrial Informatics, 2019, DOI: 10.1109/TII.2019.2943898
@author: me
"""
from __future__ import print_function
import keras
import numpy as np
from keras.datasets import mnist
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.layers.core import Lambda
K.set_learning_phase(1)
# Input image dimensions
img_rows, img_cols = 28, 28
# The data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
# Noised data: add uniform random noise in [0, 0.5)
# (use x_train.shape so that this also works for channels_first)
x_train = x_train.astype('float32') / 255. + 0.5*np.random.random(x_train.shape)
x_test = x_test.astype('float32') / 255. + 0.5*np.random.random(x_test.shape)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
def abs_backend(inputs):
    return K.abs(inputs)

def expand_dim_backend(inputs):
    return K.expand_dims(K.expand_dims(inputs, 1), 1)

def sign_backend(inputs):
    return K.sign(inputs)

def pad_backend(inputs, in_channels, out_channels):
    # Zero-pad the channel dimension of a (batch, height, width, channels)
    # tensor: expand to 5-D so that spatial_3d_padding can pad the channel
    # axis, then squeeze the helper axis away again
    pad_dim = (out_channels - in_channels) // 2
    inputs = K.expand_dims(inputs, -1)
    inputs = K.spatial_3d_padding(inputs, padding=((0, 0), (0, 0), (pad_dim, pad_dim)))
    return K.squeeze(inputs, -1)
# Residual Shrinkage Block
def residual_shrinkage_block(incoming, nb_blocks, out_channels, downsample=False,
                             downsample_strides=2):
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]  # assumes channels_last
    for i in range(nb_blocks):
        identity = residual
        if not downsample:
            downsample_strides = 1
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides),
                          padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        # Calculate the global mean of the absolute feature map
        residual_abs = Lambda(abs_backend)(residual)
        abs_mean = GlobalAveragePooling2D()(residual_abs)
        # Calculate scaling coefficients in (0, 1) with a small FC sub-network
        scales = Dense(out_channels, activation=None, kernel_initializer='he_normal',
                       kernel_regularizer=l2(1e-4))(abs_mean)
        scales = BatchNormalization()(scales)
        scales = Activation('relu')(scales)
        scales = Dense(out_channels, activation='sigmoid', kernel_regularizer=l2(1e-4))(scales)
        scales = Lambda(expand_dim_backend)(scales)
        # Calculate thresholds: tau = mean(|x|) * sigmoid(...), one per channel
        thres = keras.layers.multiply([abs_mean, scales])
        # Soft thresholding: sign(x) * max(|x| - tau, 0)
        sub = keras.layers.subtract([residual_abs, thres])
        zeros = keras.layers.subtract([sub, sub])
        n_sub = keras.layers.maximum([sub, zeros])
        residual = keras.layers.multiply([Lambda(sign_backend)(residual), n_sub])
        # Downsampling (it is important to use a pool size of (1, 1))
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1, 1), strides=(2, 2))(identity)
        # Zero-padding to match channels (it is important to use zero padding
        # rather than a 1x1 convolution)
        if in_channels != out_channels:
            identity = Lambda(pad_backend, arguments={'in_channels': in_channels,
                                                      'out_channels': out_channels})(identity)
        residual = keras.layers.add([residual, identity])
    return residual
# define and train a model
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(inputs)
net = residual_shrinkage_block(net, 1, 8, downsample=True)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=5, verbose=1, validation_data=(x_test, y_test))
# get results
K.set_learning_phase(0)
DRSN_train_score = model.evaluate(x_train, y_train, batch_size=100, verbose=0)
print('Train loss:', DRSN_train_score[0])
print('Train accuracy:', DRSN_train_score[1])
DRSN_test_score = model.evaluate(x_test, y_test, batch_size=100, verbose=0)
print('Test loss:', DRSN_test_score[0])
print('Test accuracy:', DRSN_test_score[1])
For ease of comparison, the code for the ordinary deep residual network is as follows:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Dec 28 23:19:03 2019
Implemented using TensorFlow 1.0 and Keras 2.2.1
K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, CVPR, 2016.
@author: me
"""
from __future__ import print_function
import numpy as np
import keras
from keras.datasets import mnist
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.layers.core import Lambda
K.set_learning_phase(1)
# input image dimensions
img_rows, img_cols = 28, 28
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
# Noised data: add uniform random noise in [0, 0.5)
# (use x_train.shape so that this also works for channels_first)
x_train = x_train.astype('float32') / 255. + 0.5*np.random.random(x_train.shape)
x_test = x_test.astype('float32') / 255. + 0.5*np.random.random(x_test.shape)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
def pad_backend(inputs, in_channels, out_channels):
    # Zero-pad the channel dimension: expand to 5-D so that
    # spatial_3d_padding can pad the channel axis, then squeeze back
    pad_dim = (out_channels - in_channels) // 2
    inputs = K.expand_dims(inputs, -1)
    inputs = K.spatial_3d_padding(inputs, padding=((0, 0), (0, 0), (pad_dim, pad_dim)))
    return K.squeeze(inputs, -1)
def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2):
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]  # assumes channels_last
    for i in range(nb_blocks):
        identity = residual
        if not downsample:
            downsample_strides = 1
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides),
                          padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)
        # Downsampling (it is important to use a pool size of (1, 1))
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1, 1), strides=(2, 2))(identity)
        # Zero-padding to match channels (it is important to use zero padding
        # rather than a 1x1 convolution)
        if in_channels != out_channels:
            identity = Lambda(pad_backend, arguments={'in_channels': in_channels,
                                                      'out_channels': out_channels})(identity)
        residual = keras.layers.add([residual, identity])
    return residual
# define and train a model
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(inputs)
net = residual_block(net, 1, 8, downsample=True)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=5, verbose=1, validation_data=(x_test, y_test))
# get results
K.set_learning_phase(0)
resnet_train_score = model.evaluate(x_train, y_train, batch_size=100, verbose=0)
print('Train loss:', resnet_train_score[0])
print('Train accuracy:', resnet_train_score[1])
resnet_test_score = model.evaluate(x_test, y_test, batch_size=100, verbose=0)
print('Test loss:', resnet_test_score[0])
print('Test accuracy:', resnet_test_score[1])
Notes:
(1) The structure of a deep residual shrinkage network is more complex than that of an ordinary deep residual network, so it may be harder to train.
(2) The program stacks only one basic block; on more complex datasets, the number of blocks can be increased appropriately (see the first sketch after these notes).
(3) If you run into the TypeError: softmax() got an unexpected keyword argument 'axis', open tensorflow_backend.py and change the first axis in return tf.nn.softmax(x, axis=axis) to dim (see the second snippet below).
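For note (2), a hypothetical deeper variant (illustrative only, not from the original post; the block counts and channel widths are made up) could replace the single block-stacking call like this:

net = residual_shrinkage_block(net, 2, 8, downsample=True)
net = residual_shrinkage_block(net, 2, 16, downsample=True)
net = residual_shrinkage_block(net, 2, 32, downsample=True)

Note that widening the channels (8 → 16 → 32) relies on the zero-padding shortcut in pad_backend.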
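For note (3), the patched function in keras/backend/tensorflow_backend.py would end up looking like this (a sketch assuming the Keras 2.2-era softmax wrapper; this applies to older TensorFlow versions whose tf.nn.softmax takes dim rather than axis):

def softmax(x, axis=-1):
    return tf.nn.softmax(x, dim=axis)  # was: return tf.nn.softmax(x, axis=axis)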
Reposted from:
https://blog.csdn.net/zmh1250329863/article/details/103761091
References:
M. Zhao, S. Zhong, X. Fu, et al., Deep Residual Shrinkage Networks for Fault Diagnosis, IEEE Transactions on Industrial Informatics, 2019, DOI: 10.1109/TII.2019.2943898.