Before we start, let me list a few concepts (if anything here is wrong, please point it out in the comments):
Let us first consider a scenario: suppose we use yolov2 to detect cats in a dataset. The dataset is labelled, i.e. it has ground truth. Normally a prediction contains the cat's confidence and its location, and each instance is classified as positive or negative [this decision is based on IOU alone, not on confidence; confidence is used when computing AP, not for deciding whether an instance is a TP or something else]. See Figure 1 below.
- Take the TP in the top-left corner as an example: the "P" is the classifier's predicted label, in this case "positive". We then compare that prediction against the ground-truth label; if the comparison says the prediction is correct, the prefix is "T" (true). If the comparison says the prediction is wrong, the prefix is "F" (false), and the sample becomes an FP. TN and FN follow by analogy.
- Another way to put it:
- TP: a detected sample that is correct; its IOU with a ground-truth box is above the threshold
- FP: a detected sample that is wrong; its IOU is below the threshold
- FN: a ground-truth instance that was not detected
- TN: a negative that is correctly not detected, i.e. background the detector rightly ignores; such samples are not counted in the AP computation. Consider an image where you detect 3 cats (all genuinely cats) but only 2 were labelled: the extra cat has no ground-truth box to match, so under standard VOC evaluation it is actually scored as an FP rather than a TN.
PS: the required threshold differs between datasets. VOC2007 uses an IOU threshold of 50%, while COCO averages over IOU thresholds from 0.50 to 0.95 in steps of 0.05.
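Deciding TP versus FP therefore comes down to comparing IOU against the threshold. A minimal sketch of that check (the corner-coordinate box format and the helper names are illustrative assumptions, not from any particular framework):

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_tp(det_box, gt_box, threshold=0.5):
    """VOC2007-style decision: the detection is a TP if IOU >= threshold, else an FP."""
    return iou(det_box, gt_box) >= threshold
```

For example, two 10x10 boxes overlapping by half horizontally have IOU 50/150 = 1/3, which falls below the 0.5 threshold, so that detection would be scored as an FP.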
This post mainly follows the GitHub project Object-Detection-Metrics; if you want a deeper understanding of the terms above, go read it yourself.
Precision:
TP/(TP+FP) --------------> TP/(all detected instances)
Recall:
TP/(TP+FN) --------------> TP/(all positive instances in the ground truth)
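Since AP is built from these two quantities, it helps to see how the running precision/recall pairs are produced: sort the detections by confidence, mark each as TP or FP, and accumulate. A small sketch (the `(confidence, is_tp)` input format is my own assumption):

```python
def pr_points(detections, num_gt):
    """detections: list of (confidence, is_tp) pairs; num_gt: total ground-truth boxes.
    Returns the (precision, recall) pair after each detection, in confidence order."""
    tp = fp = 0
    points = []
    for _, hit in sorted(detections, key=lambda d: d[0], reverse=True):
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / num_gt))
    return points
```

Note that recall can only stay flat or rise as we move down the ranked list, while precision can jump up and down; this is what gives the PR curve its sawtooth shape.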
AP:
Here I mainly explain the parts of the above GitHub project that are a bit hard to follow:
- The project describes two ways of computing AP on the VOC datasets: 11-point interpolation and every-point (all-point) interpolation; from 2010 onwards the every-point method is used. The 11-point method is the confusing one, so let me walk through it:
Suppose we run detection on 5 images containing 15 ground-truth instances, and the detector returns 24 instances. The table below lists the 24 detections.
[Table: the 24 detections sorted by confidence]
For each interpolation point on the PR curve, the value is the maximum precision over all recall values greater than or equal to that point. For example, at the 0.0 point the precision is 1.0; at the 0.1 point it is 0.666; at the 0.2 point it is 0.4285 (the 0.3 and 0.4 points also take the value 0.4285). From the 0.5 point onwards the precision is 0. So the AP is:
AP = (1/11) x (1 + 0.666 + 0.4285 + 0.4285 + 0.4285 + 0 + 0 + 0 + 0 + 0 + 0) ≈ 26.84%
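The 11-point interpolation rule can be written down directly: for each of the 11 recall points, take the maximum precision among all measured samples whose recall is at least that value, then average. A sketch (it assumes `recalls` and `precisions` are parallel lists of measured points, e.g. from a running precision/recall computation):

```python
def eleven_point_ap(recalls, precisions):
    """11-point interpolated AP: average, over recall points 0.0, 0.1, ..., 1.0,
    of the maximum precision among samples whose recall >= that point."""
    ap = 0.0
    for t in (i / 10 for i in range(11)):
        # Interpolated precision at recall point t
        p_interp = max((p for r, p in zip(recalls, precisions) if r >= t),
                       default=0.0)
        ap += p_interp / 11
    return ap
```

The `default=0.0` handles recall points the detector never reaches, which is exactly why the 0.5 point and beyond contribute 0 in the example above.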
mAP:
All the APs computed above are for a single class. What if the detection task has to detect not only cats but also dogs, chickens, and ducks? That is where mAP comes in: sum the AP of every class and divide by the number of classes (the 4 in the original formula is the number of classes):
mAP = (AP_cat + AP_dog + AP_chicken + AP_duck) / 4
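As a sketch, with hypothetical per-class AP values (the numbers below are made up for illustration):

```python
def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Hypothetical AP values for the four classes in the example above
aps = {'cat': 0.80, 'dog': 0.60, 'chicken': 0.40, 'duck': 0.20}
print(mean_average_precision(aps))
```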
Project practice (computing mAP and the PR curve)
The code in this post is based on the GitHub project above, with some modifications for the yolov3 network.
- First run the detector on the test set; yolov3 writes the detection results into a txt file. See the previous post for the exact commands.
- The evaluation code is:
###########################################################################################
# #
# This sample shows how to evaluate object detections applying the following metrics: #
# * Precision x Recall curve ----> used by VOC PASCAL 2012 #
# * Average Precision (AP) ----> used by VOC PASCAL 2012 #
# #
# Developed by: Rafael Padilla (rafael.padilla@smt.ufrj.br) #
# SMT - Signal Multimedia and Telecommunications Lab #
# COPPE - Universidade Federal do Rio de Janeiro #
# Last modification: May 24th 2018 #
###########################################################################################
import _init_paths
from BoundingBox import BoundingBox
from BoundingBoxes import BoundingBoxes
from Evaluator import *
from utils import *
dt_path='/home/longmao/workspace/compute MAP/Object-Detection-Metrics/' \
'samples/yolov3_compute_mAP/carplate.txt'
gt_path='/home/longmao/darknet/VOCdevkit/VOC2007/ImageSets/Main/test.txt'
def getBoundingBoxes(dt_path, gt_path):
    """Read txt files containing bounding boxes (ground truth and detections)."""
    import os
    # Class representing all bounding boxes (ground truths and detections)
    allBoundingBoxes = BoundingBoxes()
    # Read the ground truths from the YOLO label files.
    # Each line of a label file is "class_id x y width height", where (x, y) is
    # the box centre and (width, height) the box size, all relative to the
    # image dimensions.
    label_path = '/home/longmao/darknet/VOCdevkit/VOC2007/labels'
    with open(gt_path, 'r') as file_para:
        files = file_para.readlines()
    for f in files:
        f = f.strip()
        # The class name is taken from the detection file name
        # (a single-class setup: here "carplate")
        idClass = os.path.splitext(os.path.basename(dt_path))[0]
        nameOfImage = f
        with open(os.path.join(label_path, f) + '.txt', 'r') as a:
            b = a.readlines()
        for c in b:
            splitLine = c.strip().split()
            x = float(splitLine[1])  # relative centre x
            y = float(splitLine[2])  # relative centre y
            w = float(splitLine[3])  # relative width
            h = float(splitLine[4])  # relative height
            bb = BoundingBox(
                nameOfImage,
                idClass,
                x,
                y,
                w,
                h,
                CoordinatesType.Relative,
                imgSize=(1920, 1080),
                bbType=BBType.GroundTruth,
                format=BBFormat.XYWH)
            allBoundingBoxes.addBoundingBox(bb)
    # Read the detections from the single txt file written by yolov3.
    # Each line is "image_name confidence x1 y1 x2 y2", where (x1, y1) is the
    # top-left and (x2, y2) the bottom-right corner in absolute pixel
    # coordinates, and confidence (from 0 to 1) is the detection score.
    with open(dt_path, 'r') as files_para:
        files = files_para.readlines()
    idClass = os.path.splitext(os.path.basename(dt_path))[0]
    for f in files:
        f = f.strip()
        splitLine = f.split(' ')
        nameOfImage = splitLine[0]        # image name
        confidence = float(splitLine[1])  # detection confidence
        x = float(splitLine[2])  # top-left x
        y = float(splitLine[3])  # top-left y
        w = float(splitLine[4])  # bottom-right x (XYX2Y2 format)
        h = float(splitLine[5])  # bottom-right y (XYX2Y2 format)
        bb = BoundingBox(
            nameOfImage,
            idClass,
            x,
            y,
            w,
            h,
            CoordinatesType.Absolute, (1920, 1080),
            BBType.Detected,
            confidence,
            format=BBFormat.XYX2Y2)
        allBoundingBoxes.addBoundingBox(bb)
    return allBoundingBoxes
# getBoundingBoxes(dt_path,gt_path=gt_path)
def createImages(dictGroundTruth, dictDetected):
    """Create representative images with bounding boxes."""
    import numpy as np
    import cv2
    # Define image size
    width = 200
    height = 200
    # Loop through the dictionary with ground-truth detections
    for key in dictGroundTruth:
        image = np.zeros((height, width, 3), np.uint8)
        gt_boundingboxes = dictGroundTruth[key]
        image = gt_boundingboxes.drawAllBoundingBoxes(image)
        detection_boundingboxes = dictDetected[key]
        image = detection_boundingboxes.drawAllBoundingBoxes(image)
        # Show detections together with their ground truth
        cv2.imshow(key, image)
        cv2.waitKey()
# Read txt files containing bounding boxes (ground truth and detections)
boundingboxes = getBoundingBoxes(dt_path,gt_path)
# Uncomment the line below to generate images based on the bounding boxes
# createImages(dictGroundTruth, dictDetected)
# Create an evaluator object in order to obtain the metrics
evaluator = Evaluator()
##############################################################
# VOC PASCAL Metrics
##############################################################
# Plot Precision x Recall curve
evaluator.PlotPrecisionRecallCurve(
    boundingboxes,  # Object containing all bounding boxes (ground truths and detections)
    IOUThreshold=0.3,  # IOU threshold
    method=MethodAveragePrecision.EveryPointInterpolation,  # As the official matlab code
    showAP=True,  # Show Average Precision in the title of the plot
    showInterpolatedPrecision=True)  # Plot the interpolated precision curve
# Get metrics with PASCAL VOC metrics
metricsPerClass = evaluator.GetPascalVOCMetrics(
    boundingboxes,  # Object containing all bounding boxes (ground truths and detections)
    IOUThreshold=0.3,  # IOU threshold
    method=MethodAveragePrecision.EveryPointInterpolation)  # As the official matlab code
print("Average precision values per class:\n")
# Loop through classes to obtain their metrics
for mc in metricsPerClass:
    # Get the metric values for each class
    c = mc['class']
    precision = mc['precision']
    recall = mc['recall']
    average_precision = mc['AP']
    ipre = mc['interpolated precision']
    irec = mc['interpolated recall']
    # Print AP per class
    print('%s: %f' % (c, average_precision))
- Different networks may emit test results differently; some produce one txt file per image, but this post assumes all detection results go into a single txt file. One situation to watch for: if an image yields no detections at all, simply skip it and write nothing for it into the txt file.