1: Project modules: detection and counting
The detection module uses darkflow (darkflow = darknet + tensorflow).
darknet is essentially the YOLO module.
YOLO official site — YOLO is used here to detect and recognize people; darknet is written in C, darkflow in Python.
Counting is mainly handled by the deep_sort/sort module.
The difference between these two is explained clearly by the author:
deep_sort is built upon sort; it uses deep encoders to build features for each detected box and match it with its corresponding box in the next frame.
In other words: deep_sort uses deep encoders to create an appearance feature for each detected box and compares those boxes between frames.
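The frame-to-frame matching idea can be illustrated with a toy sketch. This is my own simplification, not the actual deep_sort code (the real tracker uses a Kalman-gated matching cascade, not a plain greedy match); the `cosine_distance` helper and the feature vectors below are made up for illustration:

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

# Hypothetical appearance features of the detected boxes in two consecutive frames.
prev = np.array([[1.0, 0.0], [0.0, 1.0]])   # frame t
curr = np.array([[0.0, 0.9], [0.9, 0.1]])   # frame t+1
cost = cosine_distance(prev, curr)
# Greedy nearest-neighbour match: each previous box takes its cheapest current box.
matches = cost.argmin(axis=1)
print(matches)  # previous box 0 -> current box 1, previous box 1 -> current box 0
```

A low cost means "looks like the same person", so consistent IDs can be carried from one frame to the next.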
2: Code analysis
run.py
FLAGS = argHandler()
FLAGS.setDefaults()
# input file; use "camera" for a webcam
FLAGS.demo = "test.avi" # video file to use, or if camera just put "camera"
# YOLO config file containing the network model; for faster detection use tiny-yolo.cfg
FLAGS.model = "darkflow/cfg/yolo.cfg" # tensorflow model
# trained YOLO weights file
FLAGS.load = "darkflow/bin/yolo.weights" # tensorflow weights
# detection confidence threshold, as defined by YOLO
FLAGS.threshold = 0.25 # threshold of detection confidence (detect if confidence > threshold)
# whether to use the GPU
FLAGS.gpu = 0 # how much of the GPU to use (between 0 and 1); 0 means use the CPU
# enable tracking
FLAGS.track = True # whether to activate tracking or not
# the object class to track
FLAGS.trackObj = "person" # the object to be tracked
# whether to save the result as a video
FLAGS.saveVideo = True # whether to save the video or not
# whether to enable MOG background subtraction; when enabled, motion is also judged from
# background differences, which mainly helps on low-resolution input
FLAGS.BK_MOG = False # activate background subtraction using cv2 MOG subtraction,
# to help in the worst-case scenario when YOLO cannot predict (it detects movement; not ideal, but better than nothing)
# helps only when the number of detections < 5, as it is still better than no detection
# which tracking algorithm to use
FLAGS.tracker = "deep_sort" # deep_sort or sort (NOTE: deep_sort is only trained for people detection)
# frames to skip between detections
FLAGS.skip = 3 # how many frames to skip between each detection to speed up the network
# write a csv file
FLAGS.csv = True # whether to write a csv file or not (only when tracking is set to True)
# display the tracked boxes
FLAGS.display = True # display the tracking or not
tfnet = TFNet(FLAGS)
# start detection, analysis and counting
tfnet.camera()
exit('Demo stopped, exit.')
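The effect of FLAGS.skip can be shown with a tiny sketch (my own simplification, not darkflow's actual loop; the function name is made up): a fresh YOLO detection runs only on every (skip + 1)-th frame, and the tracker coasts on the frames in between.

```python
def frames_to_detect(total_frames, skip):
    """Indices of frames that get a fresh detection when `skip` frames
    are dropped between detections (a simplification of FLAGS.skip)."""
    return [i for i in range(total_frames) if i % (skip + 1) == 0]

print(frames_to_detect(10, 3))  # [0, 4, 8]
```

So with skip = 3, only a quarter of the frames pay the full network cost, which is where the speed-up comes from.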
Run this file and the pipeline starts working.
Once the video has been processed, a csv file is generated, as shown in the figure.
The first column is the frame index;
the second column (the red box in the figure) is the ID assigned to each detected person.
On top of the original code I wrote an extra script, GetCsvColumn.py, which extracts these IDs, de-duplicates them, and shows the total in the top-left corner as "gong count:".
The last four columns are x, y, w, h.
How to display the count on the UI
I am not very fluent in Python, so bear with me;
modify predict.py:
if self.FLAGS.display:
    cv2.rectangle(imgcv, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])),
                  (0, 255, 0), thick // 3)
    # cv2.putText(imgcv, id_num, (int(bbox[0]), int(bbox[1]) - 12), 0, 1e-3 * h, (255, 0, 255), thick // 6)
    # show person id
    cv2.putText(imgcv, str(update_csv(int(id_num))), (int(bbox[0]), int(bbox[1]) - 12), 0, 1e-3 * h, (255, 0, 255), thick // 6)
    # set font
    font = cv2.FONT_HERSHEY_TRIPLEX
    # count the person
    mycount = update_csv(0)
    # show on the UI; (0, 0, 255) is red
    cv2.putText(imgcv, 'gong count: ' + str(mycount), (10, 70), 0, 1e-3 * h, (0, 0, 255), thick // 6)
Define an update_csv method that returns either the current person's ID rank or the total number of people:
def update_csv(count):
    # text mode for Python 3's csv module ('rb' only works under Python 2)
    with open(csvfilename, 'r', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        column = [row['track_id'] for row in reader]
    blist = list(set(column))
    clist = sorted([int(i) for i in blist])
    if count == 0:
        return len(clist)
    else:
        return clist.index(count) + 1
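The same logic can be exercised without the CSV file; a minimal stand-alone restatement (my own, with a made-up function name) makes the two return modes explicit:

```python
def count_or_rank(track_ids, query):
    """query == 0: number of distinct ids seen so far;
    otherwise: 1-based rank of `query` among the sorted distinct ids."""
    ids = sorted(set(track_ids))
    if query == 0:
        return len(ids)
    return ids.index(query) + 1

print(count_or_rank([3, 1, 3, 7, 1], 0))  # 3 distinct people so far
print(count_or_rank([3, 1, 3, 7, 1], 7))  # id 7 is the 3rd distinct id -> 3
```

Note that update_csv re-reads the whole CSV on every call, once per box per frame; caching the set of seen IDs would be an easy optimization.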
With that, the project runs and displays the results.
In my view, darkflow is essentially the YOLO C code ported to Python.
The darkflow module is worth a closer look.
There is not much room for optimization here: the model is already trained and retraining it is unrealistic. Judging from the generated video, pedestrian detection is actually quite good; the accuracy losses come from other factors (e.g. low resolution, overlapping people).
deep_sort module
It contains a resources folder holding the trained network weight files;
these can simply be downloaded, since a third party has already trained them.
I have not read some parts in detail; please share your findings once you have.
The first few files need not be studied; kalman_filter uses OpenCV's Kalman filter.
Reference 1
Reference 2
The later files are worth checking for optimization opportunities.
For example, the distance method used in nn_matching.py:
def distance(self, features, targets):
    """Compute distance between features and targets.

    Parameters
    ----------
    features : ndarray
        An NxM matrix of N features of dimensionality M.
    targets : List[int]
        A list of targets to match the given `features` against.

    Returns
    -------
    ndarray
        Returns a cost matrix of shape (len(targets), len(features)), where
        element (i, j) contains the closest squared distance between
        `targets[i]` and `features[j]`.
    """
    cost_matrix = np.zeros((len(targets), len(features)))
    for i, target in enumerate(targets):
        cost_matrix[i, :] = self._metric(self.samples[target], features)
    return cost_matrix
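As a rough illustration of what this cost matrix looks like, here is a stand-in for `self._metric` using nearest-neighbour squared Euclidean distance (my own sketch; the helper name and the sample data are made up, and the real class keeps a per-target gallery of samples updated over time):

```python
import numpy as np

def nn_sq_euclidean(samples, features):
    """Smallest squared Euclidean distance from each feature to any sample."""
    d = ((samples[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    return d.min(axis=0)

# Hypothetical feature galleries for two tracked targets, and two new detections.
samples = {1: np.array([[0.0, 0.0]]), 2: np.array([[1.0, 1.0]])}
features = np.array([[0.0, 0.1], [1.0, 0.9]])
cost = np.stack([nn_sq_euclidean(samples[t], features) for t in (1, 2)])
print(cost.shape)  # (2, 2): len(targets) x len(features)
```

Each row of `cost` says how well every new detection matches one tracked target, which is exactly the input the assignment step needs.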
3: Some optimization ideas
The main remaining problem is accuracy:
- When two people walk side by side, they tend to be detected as a single person. This is a weakness of YOLO itself and has no workaround for now.
- When a person's angle in the video changes, for example turning around or showing their side, they get detected as multiple people (the ID keeps increasing).
Appendix: installing OBS on macOS
On macOS the dmg package downloaded from the official OBS site currently cannot be installed; OBS has to be built manually.
Official wiki
After git cloning obs-studio, first run:
git submodule update --init --recursive