一 Preface
Please do not repost without permission, thanks~~
I recently wanted to understand how long-tailed recognition handles imbalanced datasets, so I organized and recorded the most useful material I found.
二 Overview
2.1 The basic problem
Most of the benchmarks we use are class-balanced (every class has the same number of labeled samples), but in the natural world objects are far more likely to follow a class-imbalanced distribution: common classes have many samples while rare classes have few. The figure below gives a more intuitive illustration.
Long-tailed recognition addresses the recognition problem when the data exhibits this kind of long-tailed distribution.
2.2 Recommended resources
Here are two links I found very helpful:
A good Chinese blog:
https://zhuanlan.zhihu.com/p/158638078
Long-tailed paper list:
https://github.com/zwzhang121/Awesome-of-Long-Tailed-Recognition
三 Typical paper list
Based on the four main categories of methods (re-sampling, re-weighting, transfer learning, others), and weighing the resources above together with citation counts and code availability, I compiled the following list for anyone who needs it~
3.1 re-sampling
These methods achieve balance by adjusting how often samples are drawn, and can be further divided into under-sampling of head classes and over-sampling of tail classes.
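The idea above can be illustrated with a minimal class-balanced sampling sketch (my own illustrative code, not taken from any of the papers below): first pick a class uniformly at random, then pick an instance of that class, which simultaneously under-samples head classes and over-samples tail classes.

```python
import random
from collections import Counter

def class_balanced_sample(labels, n_samples, seed=0):
    """Draw n_samples indices so that every class is equally likely.

    labels: list mapping each sample index to its class label.
    Returns a list of sampled indices (tail samples repeat; this is
    the over-sampling side of the trade-off).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, c in enumerate(labels):
        by_class.setdefault(c, []).append(idx)
    classes = sorted(by_class)
    # Two-stage draw: uniform over classes, then uniform within the class.
    return [rng.choice(by_class[rng.choice(classes)])
            for _ in range(n_samples)]

# A long-tailed toy set: 90 samples of class 0, only 10 of class 1.
labels = [0] * 90 + [1] * 10
picked = class_balanced_sample(labels, 10000)
counts = Counter(labels[i] for i in picked)
# Each class is now drawn roughly half of the time.
```

In practice the same effect is usually obtained with a weighted sampler (e.g. per-sample weights inversely proportional to class frequency) rather than an explicit two-stage loop.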
paper: Decoupling Representation and Classifier for Long-Tailed Recognition, ICLR 2020 (star 300+)
- paper:https://arxiv.org/abs/1910.09217
- code: https://github.com/facebookresearch/classifier-balancing
paper: BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition, CVPR 2020 (star 300+)
3.2 re-weighting
These methods act mainly on the classification loss, weighting the loss per class or per sample.
paper: Class-Balanced Loss Based on Effective Number of Samples, CVPR 2019 (star 300+)
paper: Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss, NeurIPS 2019 (star 300+)
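As a concrete example of re-weighting, the Class-Balanced Loss paper (Cui et al., CVPR 2019) weights each class by the inverse of its "effective number of samples" E_n = (1 - β^n) / (1 - β). The sketch below computes such per-class weights; the normalization (weights summing to the number of classes) is a common convention, and details may differ from any particular implementation.

```python
def class_balanced_weights(counts, beta=0.9999):
    """Per-class loss weights from the effective number of samples.

    counts: number of training samples per class.
    For each class, E_n = (1 - beta**n) / (1 - beta); the weight is
    1 / E_n, normalized so the weights sum to the number of classes.
    """
    eff = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    inv = [1.0 / e for e in eff]
    total = sum(inv)
    return [len(counts) * w / total for w in inv]

# Head class (1000 samples) gets a small weight, tail class (10) a large one.
weights = class_balanced_weights([1000, 100, 10])
```

These weights are then plugged into the classification loss (e.g. as the per-class weight argument of a standard cross-entropy loss).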
3.3 transfer learning
These methods aim to transfer knowledge from head classes to tail classes.
paper: Large-Scale Long-Tailed Recognition in an Open World, CVPR 2019 (star 500+)
paper: Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective, CVPR 2020
paper: Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification, ECCV 2020
3.4 others
paper: Long-tailed Recognition by Routing Diverse Distribution-Aware Experts, arXiv 2020
paper: ResLT: Residual Learning for Long-tailed Recognition, arXiv 2021
- https://arxiv.org/abs/2101.10633
- https://github.com/FPNAS/ResLT (not yet released as of 2021-02-24)