ECCV 2016 person re-identification, paper 3 in the series
Current person reID work falls mainly into two directions:
- extracting and coding local invariant features to represent the visual appearance of a person
- learning a discriminative distance metric so that the distance between features of the same person becomes smaller
In this paper, the authors focus on the first direction. For features, they use human attributes, which they regard as mid-level; compared with low-level visual features such as color and LBP, attributes cope better with variations in illumination, viewpoint, pose, and other factors. However, manually annotating attributes is far more expensive than conventional labeling, so low-level features still dominate in practice.
The advantages of the proposed method are:
- we propose a three-stage semi-supervised deep attribute learning algorithm, which makes learning a large set of human attributes from a limited number of labeled attribute data possible
- deep attributes achieve promising performance and generalization ability on four person ReID datasets
- deep attributes release the previous dependencies on local features, thus making the person ReID system more robust and efficient
Using a dCNN to predict human attributes and then applying them to person reID is the original contribution here.
Main idea
The authors' goal is to learn an attribute detector O that maps an input image I to a binary attribute vector A_I, i.e. A_I = O(I). For example, if the attribute set is [long hair, female, wearing glasses], then a long-haired woman wearing glasses is labeled [1, 1, 1], while a short-haired woman without glasses is labeled [0, 1, 0].
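A dCNN detector typically outputs a per-attribute score; the binary vector A_I described above is obtained by thresholding those scores. A minimal sketch (the attribute names and the 0.5 threshold are illustrative assumptions, not from the paper):

```python
import numpy as np

# Hypothetical attribute set, matching the example above
ATTRIBUTES = ["long_hair", "female", "glasses"]

def binarize_attributes(scores, threshold=0.5):
    """Map per-attribute scores in [0, 1] to a binary attribute vector A_I."""
    return (np.asarray(scores) >= threshold).astype(int).tolist()

# long hair, female, wearing glasses
print(binarize_attributes([0.9, 0.8, 0.7]))  # [1, 1, 1]
# short hair, female, no glasses
print(binarize_attributes([0.2, 0.8, 0.1]))  # [0, 1, 0]
```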
As mentioned above, this is a three-stage semi-supervised deep attribute learning algorithm, so the method is presented in three parts.
Step 1
AlexNet is trained on the scarce attribute-labeled data: there are N samples in total, and the n-th sample t_n has attribute label A_n. This yields the first-stage attribute detector O1.
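Since each label A_n is a binary vector, Stage 1 is a multi-label classification problem. A minimal numpy sketch of a per-sample loss, assuming sigmoid cross-entropy (a standard choice for binary attribute vectors; the paper's exact Stage-1 loss may differ in details such as weighting):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attribute_loss(logits, labels):
    """Sigmoid cross-entropy summed over the K binary attributes of one
    sample t_n with ground-truth vector A_n (an assumed formulation)."""
    p = sigmoid(np.asarray(logits, dtype=float))
    y = np.asarray(labels, dtype=float)
    eps = 1e-12  # guard against log(0)
    return float(-np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

# Confident, correct predictions give near-zero loss
print(attribute_loss([10.0, -10.0, 10.0], [1, 0, 1]))
```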
Step 2
We denote the dataset with person ID labels as U = {u1, u2, ..., uM }, where M is the number of samples and each sample has a person ID label l, e.g., the m-th instance um has person ID lm.
可以理解為矾睦,這部分?jǐn)?shù)據(jù)是沒(méi)有attribute label標(biāo)注的晦款,但是有ID標(biāo)注,就是標(biāo)準(zhǔn)的用于re ID 的數(shù)據(jù)枚冗,比如這一個(gè)sequence是誰(shuí)缓溅,那個(gè)sequence是誰(shuí) 。選一個(gè)sequence 作為anchor 一個(gè)sequence作為positive 再一個(gè)sequence作為negative a triplet [u(a),u(p),u(n)] is constructed 然后拿第一步里訓(xùn)練好的模型O1分別預(yù)測(cè)這三個(gè)sequence 赁温,最小化anchor和positive之間的distance 最大化anchor 和 negative之間的distance 坛怪。
The objective is a triplet loss over the predicted attribute vectors: the anchor's attributes are pulled toward the positive's and pushed away from the negative's.
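The paper's exact formulation is not reproduced here; a standard margin-based triplet loss consistent with the description (the margin m and the squared Euclidean distance are assumptions) has the form:

```latex
\mathcal{L}\big(u^{a}, u^{p}, u^{n}\big) =
  \max\Big(0,\;
    \big\|O(u^{a}) - O(u^{p})\big\|_2^2
    - \big\|O(u^{a}) - O(u^{n})\big\|_2^2
    + m
  \Big)
```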
However, the person ID label is not strong enough to train the dCNN with accurate attributes. Without proper constraints, the above loss function may generate meaningless attribute labels and easily overfit the training dataset U. For example, imposing a large number of meaningless attributes on two samples of a person may decrease the distance between their attribute labels, but does not help to improve the discriminative power of the dCNN. Therefore, the authors add several regularization terms to the original loss function.
This yields the second-stage attribute detector O2.
Step 3
O2 is run over the Step 2 dataset to detect attributes, producing a new dataset with attribute annotations. This dataset is then combined with the Step 1 dataset, and O2 is fine-tuned on the union.
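Step 3 is essentially a pseudo-labeling round. A sketch of the flow, where `detector` stands in for O2 and everything else (the toy detector, the threshold) is an illustrative assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pseudo_label(detector, images, threshold=0.5):
    """Run the Stage-2 detector over the ID-labeled set U and keep its
    thresholded predictions as attribute pseudo-labels.
    `detector` is any callable image -> attribute logits (an assumption)."""
    return [(img, (sigmoid(detector(img)) >= threshold).astype(int))
            for img in images]

# Toy detector with fixed logits per "image", standing in for O2
toy_detector = lambda img: np.array([3.0, -3.0, 1.5])
labeled_u = pseudo_label(toy_detector, ["img_a", "img_b"])

# Merge with the original attribute-labeled set from Step 1,
# then fine-tune O2 on the combined data
labeled_t = [("img_t", np.array([1, 1, 0]))]
combined = labeled_t + labeled_u
print(len(combined))  # 3
```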