Structured Learning
Introduction
What kind of problem counts as structured learning? One where both the input and the output are structured objects. A structured object can be any of the following data structures:
- Sequences
- Lists
- Trees
- Bounding boxes
In the machine learning tasks we studied before, the input and output usually do not share a structure: in a CNN, for example, we feed in an image and get back class probabilities. In structured learning, both the input and the output are structured objects.
Applications of Structured Learning
- Speech recognition
- Machine translation
- Object detection
- Syntactic parsing
- Text summarization
- Text retrieval
Model Structure
We write the input as X and the output as Y; uppercase letters denote structured objects, such as matrices.
During training we learn a score F(x, y) that measures how well an output y matches an input x. Given an input x, inference enumerates every candidate y in Y and returns the one with the highest score.
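A minimal sketch of this evaluate-and-search idea, assuming a generic scoring function; the names F, predict, and the brute-force loop over candidates are illustrative placeholders, not something defined in this notebook:
# Illustrative sketch of the unified structured-learning framework:
# training learns a score F(x, y); inference searches all candidate
# structures y and returns the one with the highest score.
# `F` and `candidate_ys` are placeholders for this example.
def predict(x, candidate_ys, F):
    best_y, best_score = None, float("-inf")
    for y in candidate_ys:          # exhaustive search over the output space Y
        score = F(x, y)
        if score > best_score:
            best_y, best_score = y, score
    return best_y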
Object Detection
In object detection, the input is an image and the output is a bounding box: a rectangle given by its top-left and bottom-right corner points.
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt
import glob
%matplotlib inline

# Load the image and convert from OpenCV's BGR order to RGB for matplotlib
img = cv.imread("images/naruto_bouding_box.jpg")
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
print(img.shape)

# Draw the correct bounding box (top-left (100, 80), bottom-right (350, 300)) in green
cv.rectangle(img, (100, 80), (350, 300), (0, 255, 0), 10)
plt.imshow(img)
(393, 699, 3)
In this Naruto detection task, the input is the image img together with the bounding box ((100, 80), (350, 300)).
# Reload the image and draw an incorrect candidate box
# (top-left (530, 150), bottom-right (580, 220)) in blue
img = cv.imread("images/naruto_bouding_box.jpg")
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
print(img.shape)
cv.rectangle(img, (530, 150), (580, 220), (0, 0, 255), 10)
plt.imshow(img)
(393, 699, 3)
# Reload the clean image and crop the region inside the correct box:
# rows 80:300, columns 100:350
img = cv.imread("images/naruto_bouding_box.jpg")
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
print(img.shape)
roi_img = img[80:300, 100:350, :]
print(roi_img.shape)
plt.imshow(roi_img)
(393, 699, 3)
(220, 250, 3)
# Flatten the ROI's pixels and plot a 10-bin histogram as a simple feature
roi_img_hist = roi_img.reshape((220 * 250 * 3, 1))
print(roi_img_hist.shape)
roi_img_n, bins, patches = plt.hist(roi_img_hist, 10, facecolor='blue', alpha=0.5)
print(roi_img_n)
plt.show()
(165000, 1)
[13130. 7950. 8218. 10143. 11940. 18027. 25068. 37002. 10166. 23356.]
If we extracted the same histogram feature from the blue rectangle instead, the resulting score would be low, because that bounding box is in the wrong position.
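One way to make this concrete is to score a candidate box by how similar its pixel histogram is to a reference histogram of the target. This is only a sketch of that idea: box_histogram, score_box, and the histogram-intersection score are assumptions, not part of the original notebook (it reuses img and np from the cells above).
# Sketch of a histogram-based scoring function F(x, y).
# Assumption: the score is the histogram intersection between the candidate
# box's normalised pixel histogram and a reference histogram of the target.
def box_histogram(image, box, bins=10):
    # box is (x1, y1, x2, y2): top-left and bottom-right corners
    x1, y1, x2, y2 = box
    roi = image[y1:y2, x1:x2, :]
    hist, _ = np.histogram(roi.ravel(), bins=bins, range=(0, 256))
    return hist / hist.sum()          # normalise so boxes of different sizes compare fairly

def score_box(image, box, ref_hist, bins=10):
    # Histogram intersection: 1.0 for identical distributions, lower otherwise
    return np.minimum(box_histogram(image, box, bins), ref_hist).sum()

ref_hist = box_histogram(img, (100, 80, 350, 300))       # correct box as reference
print(score_box(img, (100, 80, 350, 300), ref_hist))     # high score (1.0 by construction)
print(score_box(img, (530, 150, 580, 220), ref_hist))    # lower score for the blue box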
# Store both boxes as (x1, y1, x2, y2) rows: top-left and bottom-right corners
bounding_boxes = np.array([
    [100, 80, 350, 300],
    [530, 150, 580, 220]
])
for rect in bounding_boxes:
    print(type(rect))
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
# Load a second test image
img_001 = cv.imread("images/naruto_001.jpeg")
img_001 = cv.cvtColor(img_001, cv.COLOR_BGR2RGB)
print(img_001.shape)
plt.imshow(img_001)
(436, 474, 3)
# Three candidate bounding boxes for naruto_001.jpeg,
# stored as (row_start, row_end, col_start, col_end)
y_001 = np.array([
    [50, 250, 160, 300],
    [100, 150, 100, 150],
    [300, 400, 120, 200]
])

roi_img_001_dict = {}
for idx in range(len(y_001)):
    print(idx)
    print(y_001[idx])
    plt.subplot(1, len(y_001), idx + 1)
    # Crop the candidate region and keep a flattened copy for the histogram later
    roi_img_ = img_001[y_001[idx][0]:y_001[idx][1], y_001[idx][2]:y_001[idx][3], :]
    roi_img_001_dict[idx] = roi_img_.reshape(
        (y_001[idx][1] - y_001[idx][0]) * (y_001[idx][3] - y_001[idx][2]) * 3)
    plt.imshow(roi_img_)
0
[ 50 250 160 300]
1
[100 150 100 150]
2
[300 400 120 200]
print(type(roi_img_001_dict))
plt.figure(figsize=(16, 5))
# Plot a 10-bin histogram of the pixel values for each candidate region
for key in roi_img_001_dict.keys():
    plt.subplot(1, 3, key + 1)
    roi_img_001_n, bins, patches = plt.hist(roi_img_001_dict[key], 10, facecolor='blue', alpha=0.5)
<class 'dict'>
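With one histogram per candidate region, inference is just the arg-max over candidates from the framework above. A minimal sketch, assuming histogram intersection against a reference histogram built from the labelled crop of the first image; normalised_hist and the scoring rule are assumptions, not part of the original notebook:
# Hypothetical inference step: pick the candidate box whose histogram is
# closest to the reference histogram (the labelled crop from the first image).
# This mirrors choosing the y that maximises F(x, y).
def normalised_hist(flat_pixels, bins=10):
    hist, _ = np.histogram(flat_pixels, bins=bins, range=(0, 256))
    return hist / hist.sum()

ref_hist_001 = normalised_hist(roi_img_hist)   # reference: flattened ROI of the correct box above

scores = {}
for key, flat_roi in roi_img_001_dict.items():
    cand_hist = normalised_hist(flat_roi)
    scores[key] = np.minimum(cand_hist, ref_hist_001).sum()   # histogram intersection

best_idx = max(scores, key=scores.get)
print(scores)
print("best candidate box:", y_001[best_idx])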
def generate_img_data(path):
    # List the .jpg files under the given directory
    print(glob.glob(path + "/*.jpg"))

generate_img_data("./source")
['./source/naruto_bouding_box.jpg']
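If we later want to batch-load training images rather than just list their paths, the helper could be extended along these lines. This is a sketch only; load_images is a hypothetical name not in the original notebook:
# Hypothetical extension: load every .jpg/.jpeg under `path` into RGB arrays.
def load_images(path):
    images = []
    for pattern in ("/*.jpg", "/*.jpeg"):
        for file_path in glob.glob(path + pattern):
            im = cv.imread(file_path)
            if im is None:          # skip unreadable files
                continue
            images.append(cv.cvtColor(im, cv.COLOR_BGR2RGB))
    return images

imgs = load_images("./source")
print(len(imgs))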
Finally, we hope you will follow our WeChat official account.