Translation of Andrew Ng's Deep Learning course assignment: Autonomous driving - Car detection

2 - YOLO

2.1 Model Details

First things to know:

The input is a batch of images of shape (m, 608, 608, 3)
The output is a list of bounding boxes along with the recognized classes. Each bounding box is represented by 6 numbers (pc, bx, by, bh, bw, c) as explained above. If you expand c into an 80-dimensional vector, each bounding box is then represented by 85 numbers.

We will use 5 anchor boxes. So you can think of the YOLO architecture as the following: IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).

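The input is a batch of images of shape (m, 608, 608, 3).

m is the number of examples; each image has a resolution of 608x608 with 3 channels (RGB).

The output is a list of bounding boxes with class labels. Each bounding box is described by 6 elements; if you expand c into an 80-dimensional vector, each bounding box is described by 85 numbers.

The 6 elements are the probability pc that an object is present, the x and y coordinates of the object's center (bx, by), the height and width of the bounding box (bh, bw), and the class code (c).

We use 5 anchor boxes, so the final output of the YOLO network has shape (m, 19, 19, 5, 85).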
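To make these shapes concrete, here is a small NumPy sketch that builds a random stand-in for the network output and picks out one box. The batch size, the random data and the variable names are assumptions for illustration; the ordering of the 85 numbers follows (pc, bx, by, bh, bw, c1..c80) as described in the text, and a real model file may store them differently.

```python
import numpy as np

m = 2  # hypothetical batch size
rng = np.random.default_rng(0)

# Random stand-in for the DEEP CNN output: (m, 19, 19, 5, 85)
encoding = rng.random((m, 19, 19, 5, 85))

# One box: image 0, grid cell (row 7, column 3), anchor 1
box = encoding[0, 7, 3, 1]
pc = box[0]                 # probability that some object is present
bx, by, bh, bw = box[1:5]   # center and size of the bounding box
class_probs = box[5:]       # 80 class probabilities (the expanded c)

boxes_per_image = 19 * 19 * 5
print(encoding.shape, class_probs.shape, boxes_per_image)
```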

Now, for each box (of each cell) we will compute the following elementwise product and extract a probability that the box contains a certain class.

Figure 4 : Find the class detected by each box

Here's one way to visualize what YOLO is predicting on an image:

In short: multiply the probability pc by the 80 class probabilities to get a score for each class, i.e. how likely the algorithm thinks an object of that class is present in the box.
(bx, by, bh, bw) locate the box, and the value of c identifies the object type.
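The product in Figure 4 can be sketched for a single box as follows. The numbers are made up for illustration and are not taken from the course notebook.

```python
import numpy as np

pc = 0.60                    # hypothetical confidence that the box holds an object
class_probs = np.zeros(80)   # hypothetical class probabilities for this box
class_probs[2] = 0.73
class_probs[5] = 0.20
class_probs[7] = 0.07

scores = pc * class_probs            # elementwise product, shape (80,)
best_class = int(np.argmax(scores))  # index of the most likely class
best_score = float(scores[best_class])
print(best_class, round(best_score, 3))
```

The box is then reported as class `best_class` with score `best_score`.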

For each of the 19x19 grid cells, find the maximum of the probability scores (taking a max across both the 5 anchor boxes and across different classes).
Color that grid cell according to what object that grid cell considers the most likely.

Doing this results in this picture:

Figure 5 : Each of the 19x19 grid cells colored according to which class has the largest predicted probability in that cell.
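The per-cell maximum used for this coloring might be computed as below. This is a sketch on random scores, assuming a score tensor of shape (19, 19, 5, 80) with anchors on axis 2 and classes on axis 3; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
box_scores = rng.random((19, 19, 5, 80))  # stand-in for pc * class probabilities

# Merge the anchor and class axes, then take the max within each cell.
flat = box_scores.reshape(19, 19, 5 * 80)
cell_best_score = flat.max(axis=-1)          # (19, 19)
# Flat index = anchor * 80 + class, so the remainder is the class index.
cell_best_class = flat.argmax(axis=-1) % 80  # (19, 19)
```

`cell_best_class` is what would be mapped to a color for each grid cell.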

Note that this visualization isn't a core part of the YOLO algorithm itself for making predictions; it's just a nice way of visualizing an intermediate result of the algorithm.
Another way to visualize YOLO's output is to plot the bounding boxes that it outputs. Doing that results in a visualization like this:

Figure 6 : Each cell gives you 5 boxes. In total, the model predicts: 19x19x5 = 1805 boxes just by looking once at the image (one forward pass through the network)! Different colors denote different classes.

In the figure above, we plotted only boxes that the model had assigned a high probability to, but this is still too many boxes. You'd like to filter the algorithm's output down to a much smaller number of detected objects. To do so, you'll use non-max suppression. Specifically, you'll carry out these steps:

Get rid of boxes with a low score (meaning, the box is not very confident about detecting a class)
Select only one box when several boxes overlap with each other and detect the same object.

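To summarize, there are two ways to visualize the anchor boxes:

First, for the 5 candidate boxes in each of the 19x19 grid cells, color the cell according to the highest-scoring one.
Second, draw the bounding box of every detected object.

With the second approach, even though only the high-probability boxes are drawn, there are still too many boxes, so we go on to do the following:

Discard the boxes with a low score.
When several boxes overlap and mark the same object, keep only one.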

2.2 - Filtering with a threshold on class scores

You are going to apply a first filter by thresholding. You would like to get rid of any box for which the class "score" is less than a chosen threshold.

The model gives you a total of 19x19x5x85 numbers, with each box described by 85 numbers. It'll be convenient to rearrange the (19,19,5,85) (or (19,19,425)) dimensional tensor into the following variables:

box_confidence: tensor of shape (19 x 19, 5, 1) containing pc (confidence probability that there's some object) for each of the 5 boxes predicted in each of the 19x19 cells.
boxes: tensor of shape (19 x 19, 5, 4) containing (b_x, b_y, b_h, b_w) for each of the 5 boxes per cell.
box_class_probs: tensor of shape (19 x 19, 5, 80) containing the detection probabilities (c_1, c_2, ... c_{80}) for each of the 80 classes for each of the 5 boxes per cell.
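One way this rearrangement could look in NumPy is sketched below. It assumes the 425 channels are laid out anchor by anchor, each as (pc, bx, by, bh, bw, c1..c80); that layout is an assumption based on the ordering used in this text, not something read from the model file. The 4-D shapes match the ones in the yolo_filter_boxes docstring.

```python
import numpy as np

rng = np.random.default_rng(2)
raw = rng.random((19, 19, 425))       # stand-in for one image's output

per_box = raw.reshape(19, 19, 5, 85)  # 425 = 5 anchors x 85 numbers
box_confidence = per_box[..., 0:1]    # (19, 19, 5, 1)  pc
boxes = per_box[..., 1:5]             # (19, 19, 5, 4)  b_x, b_y, b_h, b_w
box_class_probs = per_box[..., 5:]    # (19, 19, 5, 80) c_1 .. c_80
```

Slicing with `0:1` rather than `0` keeps the trailing axis of size 1, which is what lets box_confidence broadcast against box_class_probs later.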

Exercise: Implement yolo_filter_boxes().

Compute box scores by doing the elementwise product as described in Figure 4. The following code may help you choose the right operator:

import numpy as np
a = np.random.randn(19*19, 5, 1)
b = np.random.randn(19*19, 5, 80)
c = a * b # shape of c will be (19*19, 5, 80)

For each box, find:
- the index of the class with the maximum box score (be careful with what axis you choose; consider using axis=-1)
- the corresponding box score (be careful with what axis you choose; consider using axis=-1)
Create a mask by using a threshold. As a reminder: ([0.9, 0.3, 0.4, 0.5, 0.1] < 0.4) returns: [False, True, False, False, True]. The mask should be True for the boxes you want to keep.
Use TensorFlow to apply the mask to box_class_scores, boxes and box_classes to filter out the boxes we don't want. You should be left with just the subset of boxes you want to keep.

Reminder: to call a Keras function, you should use K.function(...).
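The mask reminder above can be tried out in NumPy. Note that the comparison needs an array: comparing a plain Python list with a float raises a TypeError. The class indices below are hypothetical.

```python
import numpy as np

scores = np.array([0.9, 0.3, 0.4, 0.5, 0.1])
classes = np.array([2, 7, 7, 0, 5])  # hypothetical class index for each box

drop = scores < 0.4    # True where the score is below the threshold
keep = scores >= 0.4   # the mask should be True for the boxes to keep

kept_scores = scores[keep]    # boolean indexing keeps only the True positions
kept_classes = classes[keep]
print(drop.tolist(), kept_scores.tolist(), kept_classes.tolist())
```

Applying the same mask to every per-box array keeps scores, boxes and classes aligned with each other.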
In other words:

box_confidence: each of the 19x19 cells produces 5 anchor boxes, and each anchor box has one pc.
boxes: each of the 19x19 cells produces 5 anchor boxes; this holds the bounding box (b_x, b_y, b_h, b_w) of each one. The meaning of the 4 parameters was explained above.
box_class_probs: for each of the 19x19x5 anchor boxes, the probabilities of the 80 classes.

Implementing yolo_filter_boxes():

Implement the operation in Figure 4 with a plain multiplication, box_confidence * box_class_probs. The size-1 last dimension of box_confidence is broadcast automatically, so the result is a tensor of shape (19, 19, 5, 80).
For each anchor box (19x19x5 of them), find:
- the index of the class with the highest score (the maximum over the 80 classes)
- the score of that class
Create a mask that is True for the anchor boxes you want to keep.
Use TensorFlow to apply the mask to box_class_scores, boxes and box_classes, filtering out the boxes we don't want and leaving only the subset you want to keep.
Note: to call a Keras function, use K.function(...).

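# Imports assumed from the course notebook (Keras 2.x with a TensorFlow backend)
import tensorflow as tf
from keras import backend as K

# GRADED FUNCTION: yolo_filter_boxes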

def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6):
    """Filters YOLO boxes by thresholding on object and class confidence.
    
    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1)
    boxes -- tensor of shape (19, 19, 5, 4)
    box_class_probs -- tensor of shape (19, 19, 5, 80)
    threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    
    Returns:
    scores -- tensor of shape (None,), containing the class probability score for selected boxes
    boxes -- tensor of shape (None, 4), containing (b_x, b_y, b_h, b_w) coordinates of selected boxes
    classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes
    
    Note: "None" is here because you don't know the exact number of selected boxes, as it depends on the threshold. 
    For example, the actual output size of scores would be (10,) if there are 10 boxes.
    """
    
    # Step 1: Compute box scores
    ### START CODE HERE ### (≈ 1 line) compute the per-class scores
    box_scores = box_confidence * box_class_probs
    ### END CODE HERE ###
    
    # Step 2: Find the box_classes thanks to the max box_scores, keep track of the corresponding score
    ### START CODE HERE ### (≈ 2 lines)
    # Index of the highest-scoring class for each box, shape (19, 19, 5)
    box_classes = K.argmax(box_scores, axis=-1)
    # Score of that highest-scoring class, shape (19, 19, 5)
    box_class_scores = K.max(box_scores, axis=-1, keepdims=False)
    ### END CODE HERE ###
    
    # Step 3: Create a filtering mask based on "box_class_scores" by using "threshold". The mask should have the
    # same dimension as box_class_scores, and be True for the boxes you want to keep (with probability >= threshold)
    ### START CODE HERE ### (≈ 1 line)
    # Build the mask: True where the score is at least threshold
    filtering_mask = box_class_scores >= threshold
    ### END CODE HERE ###
    
    # Step 4: Apply the mask to scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines) keep the top score, bounding box and class of every box that passes the mask
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)
    ### END CODE HERE ###
    
    return scores, boxes, classes
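To check the shapes without a TensorFlow session, the same four steps can be mirrored in NumPy. This is only an illustrative analogue: `filter_boxes_np` and the random inputs are hypothetical and not part of the assignment.

```python
import numpy as np

def filter_boxes_np(box_confidence, boxes, box_class_probs, threshold=0.6):
    """NumPy analogue of the four steps above (illustrative only)."""
    box_scores = box_confidence * box_class_probs  # (19, 19, 5, 80)
    box_classes = box_scores.argmax(axis=-1)       # (19, 19, 5)
    box_class_scores = box_scores.max(axis=-1)     # (19, 19, 5)
    mask = box_class_scores >= threshold           # (19, 19, 5), True = keep
    # Boolean indexing flattens the three leading axes into one.
    return box_class_scores[mask], boxes[mask], box_classes[mask]

rng = np.random.default_rng(3)
conf = rng.random((19, 19, 5, 1))
xywh = rng.random((19, 19, 5, 4))
probs = rng.random((19, 19, 5, 80))

scores, kept_boxes, classes = filter_boxes_np(conf, xywh, probs, threshold=0.6)
print(scores.shape, kept_boxes.shape, classes.shape)
```

The leading dimension of all three results is the number of boxes that passed the threshold, which is the "None" dimension mentioned in the docstring.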

2.3 - Non-max suppression

Even after filtering by thresholding over the class scores, you still end up with a lot of overlapping boxes. A second filter for selecting the right boxes is called non-maximum suppression (NMS).

Figure 7 : In this example, the model has predicted 3 cars, but it's actually 3 predictions of the same car. Running non-max suppression (NMS) will select only the most accurate (highest probability) one of the 3 boxes.

Non-max suppression uses the very important function called "Intersection over Union", or IoU.

Figure 8 : Definition of "Intersection over Union".

Exercise: Implement iou(). Some hints:

In this exercise only, we define a box using its two corners (upper left and lower right): (x1, y1, x2, y2) rather than the midpoint and height/width.
To calculate the area of a rectangle you need to multiply its height (y2 - y1) by its width (x2 - x1).
You'll also need to find the coordinates (xi1, yi1, xi2, yi2) of the intersection of two boxes. Remember that:
- xi1 = maximum of the x1 coordinates of the two boxes
- yi1 = maximum of the y1 coordinates of the two boxes
- xi2 = minimum of the x2 coordinates of the two boxes
- yi2 = minimum of the y2 coordinates of the two boxes
In order to compute the intersection area, you need to make sure the height and width of the intersection are positive, otherwise the intersection area should be zero. Use max(height, 0) and max(width, 0).

In this code, we use the convention that (0,0) is the top-left corner of an image, (1,0) is the upper-right corner, and (1,1) the lower-right corner.
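Following the hints above, one possible implementation of iou() is sketched here. It takes boxes as plain (x1, y1, x2, y2) tuples and does not guard against two zero-area boxes, which would give a zero union.

```python
def iou(box1, box2):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    # Clamp width and height at 0 so disjoint boxes give zero area
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)

    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area

print(iou((2, 1, 4, 3), (1, 2, 3, 4)))
```

For these two boxes the intersection has area 1 and the union has area 4 + 4 - 1 = 7, so the result is 1/7.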

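Andrew Ng explains non-max suppression very clearly in the course, so here is just a brief summary:
Even after filtering the class scores with the mask, you still have many overlapping bounding boxes (see Figure 7). The next filter used to select the right boxes is called non-max suppression (NMS).
Non-max suppression relies on a very important function, Intersection over Union (IoU), shown in Figure 8.
Exercise: implement the iou() function.

In this exercise only, a bounding box is defined by two corners (x1, y1, x2, y2) rather than by its midpoint and width/height.
To compute the area of a rectangle, multiply its height (y2 - y1) by its width (x2 - x1).

In this code, (0,0) is the top-left corner of the image and (1,1) is the lower-right corner.

You also need to find the corner coordinates (xi1, yi1, xi2, yi2) of the intersection of the two boxes.
- xi1 = maximum of the x1 coordinates (upper-left corners) of the two boxes
- yi1 = maximum of the y1 coordinates (upper-left corners) of the two boxes
- xi2 = minimum of the x2 coordinates (lower-right corners) of the two boxes
- yi2 = minimum of the y2 coordinates (lower-right corners) of the two boxes
To compute the intersection area, make sure the width and height of the intersection are positive; otherwise the intersection area is zero. Use max(height, 0) and max(width, 0).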
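With IoU available, a greedy non-max suppression pass could look roughly like the sketch below. The text only states that NMS keeps the highest-probability box among overlapping ones; the greedy loop, the 0.5 IoU threshold, the `nms` name and the sample boxes are assumptions for illustration, and the sketch ignores class labels. A compact IoU helper is included so the block runs on its own.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes; returns 0.0 for an empty union."""
    inter = (max(min(a[2], b[2]) - max(a[0], b[0]), 0)
             * max(min(a[3], b[3]) - max(a[1], b[1]), 0))
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring box still in play
        keep.append(best)
        # Drop every remaining box that overlaps the chosen one too much
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# Three overlapping guesses for one car plus one separate object
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 1, 10, 11), (50, 50, 60, 60)]
scores = [0.6, 0.9, 0.7, 0.8]
print(nms(boxes, scores))
```

Here boxes 0 and 2 overlap box 1 with IoU of about 0.68 and 0.82, so only box 1 (score 0.9) and the distant box 3 are kept.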

Last edited: 2018.05.13 11:41:28
