With the rapid development of computer technology and the large-scale deployment of video capture devices, computer vision research is attracting more and more attention. Object tracking, one of the key technologies in computer vision, is widely applied in military guidance, visual navigation, robotics, intelligent transportation, public security, and other fields. In this article I combine three algorithms to implement multi-vehicle tracking.
GitHub: https://github.com/xiaochus/Vehicle_Tracking
PS: The code excerpts in this article may be incomplete; refer to the files on GitHub for the full code structure.
Algorithms
- 目標(biāo)檢測: MOG2
- 目標(biāo)跟蹤: KCF
- 物體分類: CNN
環(huán)境
- Python 3.6
- OpenCV 3.2 + contrib
- Tensorflow-gpu 1.0
- Keras 1.2
目標(biāo)檢測
We use MOG2 to detect objects in the current frame. MOG2 is a background subtraction algorithm that was introduced in a previous article. The detection code is shown below:
import sys
import copy
import argparse
import cv2
import numpy as np
from keras.models import load_model
from utils.entity import Entity
# "video" (the input file path) and "iou" (the overlap threshold used below) are
# parsed from command-line arguments; see track.py in the repository
camera = cv2.VideoCapture(video)
res, frame = camera.read()
y_size = frame.shape[0]
x_size = frame.shape[1]
# 導(dǎo)入CNN分類模型
model = load_model('model//weights.h5')
bs = cv2.createBackgroundSubtractorMOG2(detectShadows=True)  # create the MOG2 background subtractor
history = 20     # number of frames used to train MOG2
frames = 0       # current frame count
counter = 0      # id to assign to the next tracked object
track_list = []  # objects currently being tracked (Entity instances)
cv2.namedWindow("detection", cv2.WINDOW_NORMAL)
while True:
res, frame = camera.read()
if not res:
break
    # use the first "history" frames to train MOG2
fg_mask = bs.apply(frame)
if frames < history:
frames += 1
continue
    # threshold out shadow pixels, then erode and dilate the mask to remove noise and merge blobs
th = cv2.threshold(fg_mask.copy(), 244, 255, cv2.THRESH_BINARY)[1]
th = cv2.erode(th, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)), iterations=2)
dilated = cv2.dilate(th, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (8, 3)), iterations=2)
    # find object locations as external contours (OpenCV 3.x findContours returns 3 values)
    image, contours, hier = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
When iterating over the detection boxes, we use an area of 3000 pixels as a threshold and discard boxes that are too small, which reduces how often the classification model has to run.
After extracting a candidate region, we resize it and scale its pixel values to [0, 1], then feed it into the CNN model to decide whether it is a vehicle. If it is, we compare it against the objects already in the tracking list. The comparison uses IoU (intersection over union), i.e. the degree of overlap: only when the candidate overlaps very little with every object in the list is it added to the tracking list.
for c in contours:
x, y, w, h = cv2.boundingRect(c)
if cv2.contourArea(c) > 3000:
            # extract the candidate region
img = frame[y: y + h, x: x + w, :]
rimg = cv2.resize(img, (64, 64), interpolation=cv2.INTER_CUBIC)
image_data = np.array(rimg, dtype='float32')
image_data /= 255.
roi = np.expand_dims(image_data, axis=0)
            # classify the region
            flag = model.predict(roi)
            if flag[0][0] > 0.5:  # vehicle check; which class maps to 1 depends on flow_from_directory's alphabetical folder order, so verify against your own training run
e = Entity(counter, (x, y, w, h), frame)
                # skip detections that duplicate an object we already track
if track_list:
count = 0
num = len(track_list)
for p in track_list:
if overlap((x, y, w, h), p.windows) < iou:
count += 1
if count == num:
track_list.append(e)
else:
track_list.append(e)
counter += 1
The rectangle-overlap (IoU) function:
def overlap(box1, box2):
"""
檢查兩個(gè)矩形框的重疊程度.
"""
endx = max(box1[0] + box1[2], box2[0] + box2[2])
startx = min(box1[0], box2[0])
width = box1[2] + box2[2] - (endx - startx)
endy = max(box1[1] + box1[3], box2[1] + box2[3])
starty = min(box1[1], box2[1])
height = box1[3] + box2[3] - (endy - starty)
if (width <= 0 or height <= 0):
return 0
else:
Area = width * height
Area1 = box1[2] * box1[3]
Area2 = box2[2] * box2[3]
ratio = Area / (Area1 + Area2 - Area)
return ratio
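As a quick sanity check: two 10×10 boxes offset by (5, 5) intersect in a 5×5 region, so the IoU is 25 / (100 + 100 - 25) ≈ 0.143:

print(overlap((0, 0, 10, 10), (5, 5, 10, 10)))  # 0.14285714285714285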
After handling new objects, we process the objects already in the tracking list. If an object's center comes close to the frame border, it is removed from the list; otherwise we call its update method, which runs its tracker on the latest frame.
if track_list:
tlist = copy.copy(track_list)
for e in tlist:
x, y = e.center
if 10 < x < x_size - 10 and 10 < y < y_size - 10:
e.update(frame)
else:
track_list.remove(e)
目標(biāo)跟蹤
KCF (kernelized correlation filters) is a discriminative tracking method. Methods of this kind train an object detector during tracking, use that detector to check whether the predicted location in the next frame actually contains the target, and then use the new detection result to update the training set and hence the detector itself. When training the detector, the target region is usually taken as the positive sample and the regions around the target as negative samples; naturally, the closer a region is to the target, the more likely it is to be positive.
We define an Entity class that is instantiated once for every detected object. On instantiation, the object initializes a KCF tracker, which takes a frame and the target's bounding box. Feeding the latest frame in through the update() method lets the KCF tracker work out where the target sits in the current frame.
# coding:utf8
import cv2
import numpy as np
class Entity(object):
def __init__(self, vid, windows, frame):
self.vid = vid
self.windows = windows
self.center = self._set_center(windows)
self.trajectory = [self.center]
self.tracker = self._init_tracker(windows, frame)
def _set_center(self, windows):
x, y, w, h = windows
x = (2 * x + w) / 2
y = (2 * y + h) / 2
center = np.array([np.float32(x), np.float32(y)], np.float32)
return center
def _init_tracker(self, windows, frame):
"""
        Initialize the KCF tracker.
"""
x, y, w, h = windows
        tracker = cv2.Tracker_create('KCF')  # OpenCV 3.2 API; OpenCV 3.3+ uses cv2.TrackerKCF_create() instead
tracker.init(frame, (x, y, w, h))
return tracker
def update(self, frame):
"""
        Update the target's position from the latest frame.
"""
        ok, new_box = self.tracker.update(frame)
if ok:
x, y, w, h = int(new_box[0]), int(new_box[1]), int(new_box[2]), int(new_box[3])
self.center = self._set_center((x, y, w, h))
self.windows = (x, y, w, h)
self.trajectory.append(self.center)
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.putText(frame, "vehicle", (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1, cv2.LINE_AA)
cv2.polylines(frame, [np.int32(self.trajectory)], 0, (0, 0, 255))
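A minimal usage sketch of the class (the file name and initial box here are illustrative, not values from the repository):

import cv2
from utils.entity import Entity

camera = cv2.VideoCapture('video/car.flv')
ok, frame = camera.read()
e = Entity(0, (100, 80, 60, 40), frame)  # id, (x, y, w, h) on the first frame
while ok:
    ok, frame = camera.read()
    if not ok:
        break
    e.update(frame)  # KCF re-locates the box and draws it onto the frame
    cv2.imshow('tracking', frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break
camera.release()
cv2.destroyAllWindows()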
分類數(shù)據(jù)
We train the classification model on MIT's vehicle and pedestrian data. The images come in ppm format, sized 128×128 and 128×64 respectively. To make the data uniform and match the CNN model's input, we preprocess the images to a common size.
# coding: utf8
import os
import cv2


def convert(src, dst):
    """Resize every image under src to 64x64 and save it as a jpg under dst."""
    for root, dirs, files in os.walk(src):
        for f in files:
            n = f.split('.')[0]
            image = cv2.imread(os.path.join(root, f))
            resized_image = cv2.resize(image, (64, 64), interpolation=cv2.INTER_CUBIC)
            cv2.imwrite(os.path.join(dst, n + '.jpg'), resized_image)


def main():
    convert('cars128x128', 'data/train/cars')
    convert('pedestrians128x64', 'data/train/pedestrians')


if __name__ == '__main__':
    main()
點(diǎn)擊下載原始數(shù)據(jù)與處理好的數(shù)據(jù)置媳。
CNN Classification Model
We use a three-layer CNN as the classification model; each convolutional block consists of a convolution layer, a batch-normalization layer, a LeakyReLU activation, and a max-pooling layer. The network is defined with Keras as follows:
# coding: utf8
from keras.models import Sequential
from keras.regularizers import l2
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import LeakyReLU
def cnn_net(size):
model = Sequential()
    # Keras 1.x API: Convolution2D(nb_filter, rows, cols) with border_mode / W_regularizer
    model.add(Convolution2D(16, 5, 5, W_regularizer=l2(5e-4), border_mode='same', input_shape=(size, size, 3)))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 5, 5, W_regularizer=l2(5e-4), border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3, W_regularizer=l2(5e-4), border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
return model
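For the 64×64 crops produced by the preprocessing step, building the model is then just:

model = cnn_net(64)
model.summary()  # prints the layer-by-layer structure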
數(shù)據(jù)增強(qiáng)與模型訓(xùn)練
Keras provides data-augmentation facilities that enlarge the dataset and help prevent overfitting. ImageDataGenerator turns the raw images in a directory into a generator the model can consume.
import os
import sys
import argparse
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping
from keras.utils.visualize_util import plot
def data_process(size, batch_size_train, batch_size_val):
    # project root: the parent directory of this script
    path = os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir))
datagen1 = ImageDataGenerator(
rescale=1. / 255,
shear_range=0.2,
zoom_range=0.2,
rotation_range=90,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
datagen2 = ImageDataGenerator(rescale=1. / 255)
train_generator = datagen1.flow_from_directory(
path + '//data//train',
target_size=(size, size),
batch_size=batch_size_train,
class_mode='binary')
validation_generator = datagen2.flow_from_directory(
path + '//data//validation',
target_size=(size, size),
batch_size=batch_size_val,
class_mode='binary')
return train_generator, validation_generator
模型訓(xùn)練函數(shù)如下所示杰扫,我們在訓(xùn)練的過程中引入了EarlyStopping,保證模型在準(zhǔn)確度不再上升的時(shí)候自動結(jié)束訓(xùn)練膘掰。
def train(model, epochs, batch_size_train, batch_size_val, size):
train_generator, validation_generator = data_process(size, batch_size_train, batch_size_val)
earlyStopping = EarlyStopping(monitor='val_loss', patience=50, verbose=1, mode='auto')
hist = model.fit_generator(
train_generator,
nb_epoch=epochs,
        samples_per_epoch=batch_size_train,  # Keras 1.x argument: samples drawn per epoch, so each "epoch" here is one batch
validation_data=validation_generator,
nb_val_samples=batch_size_val,
callbacks=[earlyStopping])
df = pd.DataFrame.from_dict(hist.history)
df.to_csv('hist.csv', encoding='utf-8', index=False)
model.save('weights.h5')
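A minimal entry point tying the model and the training function together might look like this sketch (the hyperparameter values are illustrative, not necessarily what the repository uses):

if __name__ == '__main__':
    model = cnn_net(64)  # assumes cnn_net from the previous section is importable here
    train(model, epochs=1000, batch_size_train=256, batch_size_val=64, size=64)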
結(jié)果
運(yùn)行下列命令涉波,即可對video文件夾中保存的文件進(jìn)行處理。
python track.py --file "car.flv"
In the output video, each tracked vehicle is drawn with a green bounding box labeled "vehicle", and its trajectory is traced in red.