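Today I listened to a junior labmate's presentation on how heatmaps work. I had wanted to learn this for a long time but never got around to it, so here is a record of what I learned today.
Thanks to my labmate for sharing!
Paper link: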
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization | SpringerLink
Code:
GitHub - jacobgil/pytorch-grad-cam: PyTorch implementation of Grad-CAM
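Grad-CAM (Gradient-weighted Class Activation Mapping) generates a class-activation heatmap for an input image. The heatmap is a 2D grid of scores tied to one specific output class, and each position in the grid indicates how important that location is for the class. For an image that is fed into a CNN and classified as "dog", the technique shows, as a heatmap, how strongly each location in the image relates to the "dog" class. This helps reveal which local region of the original image led the CNN to its final classification decision.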
Related figure
Core formulas
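For reference, the two formulas from the Grad-CAM paper can be written as follows. Here $A^k$ is the $k$-th feature map of the chosen convolutional layer, $y^c$ is the score for class $c$, and $Z$ is the number of spatial positions in one feature map (14 × 14 = 196 for `block5_conv3` of VGG16 with a 224 × 224 input).

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k}
\qquad
L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c A^k \right)
```

Two things to keep in mind when comparing with the code in the steps below: the paper defines $y^c$ as the score before the softmax, whereas the code differentiates `model.output`, which for Keras VGG16 is the softmax output; and the code takes the mean over channels rather than the sum, which only differs by a constant factor that disappears after normalization.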
Steps
- Model input
from keras.applications.vgg16 import VGG16
# Note: in earlier experiments we always discarded the top classifier (include_top=False); here the classifier is kept
model = VGG16(weights='imagenet')
print("Model loaded successfully")
- Data input
from keras import backend as K
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np
# The local path to our target image
img_path = '/home/som/lab/rongqian/hangtian/grad-cam/data/timgqqq.jpg'
# `img` is a PIL image of size 224x224
img = image.load_img(img_path, target_size=(224, 224))
# Step 1, convert: `x0` is a float32 NumPy array of shape (224, 224, 3)
x0 = image.img_to_array(img)
# Step 2, expand: we add a dimension to transform our array into a "batch"
# of size (1, 224, 224, 3)
x1 = np.expand_dims(x0, axis=0)
# Step 3, normalize: finally we preprocess the batch
# (this does channel-wise color normalization)
x = preprocess_input(x1)
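To make the "channel-wise color normalization" concrete, here is a minimal NumPy sketch of what I understand `preprocess_input` to do for VGG16: flip RGB to BGR and subtract the per-channel ImageNet means, with no scaling. The helper name `vgg16_preprocess_sketch` and the random stand-in image are my own, for illustration only; in real code, keep using `preprocess_input`.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (the "caffe"-style preprocessing used by VGG16).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_sketch(batch_rgb):
    """Illustrative stand-in for preprocess_input on a (N, H, W, 3) RGB batch."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]            # reverse the channel axis: RGB -> BGR
    return batch_bgr - IMAGENET_BGR_MEAN    # zero-center each channel, no rescaling

# Stand-in for a loaded image: random RGB values in [0, 255].
fake_img = np.random.default_rng(0).uniform(0, 255, size=(224, 224, 3)).astype(np.float32)
fake_batch = np.expand_dims(fake_img, axis=0)   # shape (1, 224, 224, 3)
processed = vgg16_preprocess_sketch(fake_batch)
print(processed.shape)
```

Because the means are subtracted without rescaling, the processed values are no longer in [0, 255] and can be negative.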
- Prediction output
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
num = np.argmax(preds)  # index of the highest-scoring class
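As a small illustration of how the top-3 list relates to `np.argmax`, the sketch below picks the top-k class indices from a made-up prediction vector. The helper `top_k_indices` and the four-class `fake_preds` array are hypothetical; `decode_predictions` additionally maps such indices to ImageNet labels.

```python
import numpy as np

def top_k_indices(preds, k=3):
    """Return the indices of the k largest scores in a (1, num_classes) array."""
    scores = preds[0]
    return np.argsort(scores)[::-1][:k]   # sort ascending, reverse, keep the first k

fake_preds = np.array([[0.05, 0.60, 0.10, 0.25]])   # made-up scores for 4 classes
top3 = top_k_indices(fake_preds, k=3)
print(top3, int(np.argmax(fake_preds)))
```

The first entry of the top-k list is always the same index that `np.argmax` returns.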
- Compute the heatmap matrix
african_elephant_output = model.output[:, num]  # predicted score for class `num`, shape: (batch_size,)
last_conv_layer = model.get_layer('block5_conv3')  # last convolutional layer; its activation output has shape (batch_size, 14, 14, 512)
# Gradient of the class score with respect to the last conv layer's output, shape (batch_size, 14, 14, 512)
# Note: K.gradients only works in graph mode (e.g. Keras on a TF 1.x backend)
grads = K.gradients(african_elephant_output, last_conv_layer.output)[0]
# Average the gradients per channel, i.e. take the mean of each 14 x 14 map: (batch_size, 14, 14, 512) ----> (512,)
pooled_grads = K.mean(grads, axis=(0, 1, 2))
print('pooled_grads:', pooled_grads.shape)
# Build a function from the model input to the pooled gradients and the last conv layer's output
iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
# Run it on the real input to get concrete values
pooled_grads_value, conv_layer_output_value = iterate([x])
print(pooled_grads_value.shape, conv_layer_output_value.shape)  # (512,) (14, 14, 512)
## Multiply by the gradients
# We multiply each channel in the feature map array
# by "how important this channel is" with regard to the elephant class.
# This captures how much each location in the last conv layer's output matters for the classification decision.
for i in range(len(pooled_grads_value)):
    conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
# The channel-wise mean of the resulting feature map
# is our heatmap of class activation
heatmap = np.mean(conv_layer_output_value, axis=-1)  # shape: 14 x 14
# ReLU
heatmap = np.maximum(heatmap, 0)
# Normalize to [0, 1]
heatmap /= np.max(heatmap)  # shape: 14 x 14
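The per-channel loop above can also be written with NumPy broadcasting. The following is a minimal sketch on random stand-in arrays, not on real VGG16 outputs. The function name `grad_cam_heatmap` and the `eps` guard against an all-zero heatmap are my own additions for illustration.

```python
import numpy as np

def grad_cam_heatmap(conv_output, pooled_grads, eps=1e-8):
    """Combine (H, W, C) activations with (C,) channel weights into an (H, W) heatmap."""
    weighted = conv_output * pooled_grads     # broadcasts the weights over the channel axis
    heatmap = weighted.mean(axis=-1)          # channel-wise mean -> (H, W)
    heatmap = np.maximum(heatmap, 0)          # ReLU: keep only positive evidence
    peak = heatmap.max()
    # Normalize to [0, 1]; skip the division if the map is all zeros.
    return heatmap / peak if peak > eps else heatmap

rng = np.random.default_rng(0)
fake_conv = rng.normal(size=(14, 14, 512)).astype(np.float32)   # stand-in activations
fake_grads = rng.normal(size=(512,)).astype(np.float32)         # stand-in pooled gradients
heatmap_demo = grad_cam_heatmap(fake_conv, fake_grads)
print(heatmap_demo.shape, heatmap_demo.min(), heatmap_demo.max())
```

Unlike the in-place loop, this version leaves the activation array unchanged, so it can be called again with the weights for a different class.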
- 4. Plot the heatmap
import matplotlib.pyplot as plt
plt.matshow(heatmap)
plt.show()
- 5. Overlay the heatmap on the original image
# Read the original image
import cv2
test = cv2.imread("/home/som/lab/rongqian/hangtian/grad-cam/data/timgqqq.jpg")
# heatmap holds floats in [0, 1]. Note the argument order: cv2.resize(img, (width, height))
# Resize the heatmap to match the original image
heatmap_test = cv2.resize(heatmap, (test.shape[1], test.shape[0]))
# Visualize the resized heatmap
plt.matshow(heatmap_test)
plt.show()
# Convert the heatmap to unsigned uint8 values in [0, 255]
heatmap_test = np.uint8(255 * heatmap_test)
# Apply the JET colormap to the heatmap
heatmap_test = cv2.applyColorMap(heatmap_test, cv2.COLORMAP_JET)
# Overlay the heatmap on the original image; 0.5 is the heatmap intensity. Some values exceed [0, 255], so clip them before visualizing
superimposed_img_test = heatmap_test * 0.5 + test
superimposed_img_test = np.clip(superimposed_img_test, 0, 255)
print(np.max(superimposed_img_test), superimposed_img_test.shape)
superimposed_img_test = superimposed_img_test.astype(np.uint8)  # required, otherwise imshow shows a white screen
# cv2.imread returns a uint8 array, but after the arithmetic above the array has become float.
# imshow expects float images to lie in [0, 1] and displays every element above 1 as white (255).
# A float array that has not been normalized is not guaranteed to stay within [0, 1], hence the white screen.
cv2.imshow('1', superimposed_img_test)
cv2.waitKey(0)
cv2.imwrite('a.jpg', superimposed_img_test)  # save to disk
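The blend, clip, and cast sequence above can be wrapped in one small helper. This is a NumPy-only sketch on random stand-in arrays; the name `overlay_heatmap` and the `alpha` parameter are my own, and the inputs are assumed to be same-sized (H, W, 3) uint8 arrays such as the output of `cv2.imread` and `cv2.applyColorMap`.

```python
import numpy as np

def overlay_heatmap(image_bgr, heatmap_color, alpha=0.5):
    """Blend a colorized heatmap onto an image; both are (H, W, 3) uint8 arrays."""
    blended = heatmap_color.astype(np.float32) * alpha + image_bgr.astype(np.float32)
    blended = np.clip(blended, 0, 255)    # keep values inside the displayable range
    return blended.astype(np.uint8)       # back to 8-bit so imshow/imwrite treat it as an image

rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)   # stand-in photo
fake_heat = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)    # stand-in colorized heatmap
result = overlay_heatmap(fake_image, fake_heat)
print(result.dtype, result.shape, result.max())
```

Clipping before the cast matters: the sum can reach 255 + 0.5 × 255, and casting out-of-range floats straight to uint8 does not saturate.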