Long time no see! The month is almost over already (and I'm happily looking forward to the May Day holiday, hehe).
My recent updates have all been around fairly new 2020/2021 domain adaptation papers (from arXiv).
Most of them don't have write-ups online yet, so these notes record only my own understanding; if anything is off, feel free to message me.
Paper:
《A Transductive Multi-Head Model for Cross-Domain Few-Shot Learning》
論文地址:https://arxiv.org/abs/2006.11384v1
論文代碼:https://github.com/leezhp1994/TMHFS
This post only records my personal notes from reading the paper; I won't expand on a full translation or the code. See the links above for details.
Background
The background (motivation) of the cross-domain few-shot problem has come up several times in earlier paper-reading posts, so I won't go over it in detail again; just a few quoted lines.
The main challenge of cross-domain few-shot learning lies in the cross-domain divergences in both the input data space and the output label space.
Work
In this paper, we present a new method, Transductive Multi-Head Few-Shot learning (TMHFS), to address the cross-domain few-shot learning challenge.
For the cross-domain problem, the authors propose TMHFS (Transductive Multi-Head Few-Shot learning), a transductive multi-head few-shot learning model.
TMHFS is based on the Meta-Confidence Transduction (MCT) and Dense Feature-Matching Networks (DFMN) methods. It extends the transductive model by adding an instance-wise global classification network based on the semantic information, after the common feature embedding network as a new prediction “head”.
Multi-head: in other words, the base of the whole model is MCT and DFMN; on top of that, an instance-wise global classification network based on semantic information is added after the shared feature embedding network, as a new prediction head ("two heads become three").
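To make "two heads become three" concrete, here is a minimal PyTorch sketch of the layout, assuming a generic convolutional backbone; all names (TMHFSSketch, global_head, etc.) are mine, not from the official repo:

```python
import torch.nn as nn

class TMHFSSketch(nn.Module):
    """Shared embedding f_theta feeding three prediction heads (a sketch)."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_global_classes: int):
        super().__init__()
        self.embedding = backbone                  # shared feature extractor f_theta
        # Head 3: the new instance-wise global classifier over all Cg classes.
        self.global_head = nn.Linear(feat_dim, num_global_classes)
        # Heads 1 (MCT) and 2 (DFMN) are distance-based: they compare embeddings
        # against class prototypes and add no extra parameters here (see below).

    def forward(self, x):
        feat_map = self.embedding(x)               # assume (B, D, H, W) conv features
        z = feat_map.mean(dim=(-2, -1))            # pooled instance embedding (B, D)
        return feat_map, z, self.global_head(z)
```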
Model
Problem Statement
In cross-domain few-shot learning setting, we have a source domain S = {Xs, Ys} from a total Cg classes and a target domain T = {Xt, Yt} from a set of totally different classes. (This is the cross-domain problem definition.)
The two domains have different marginal distributions in the input feature space, and disjoint output class sets.
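Restated compactly (my notation; just a formalization of the two quoted sentences):

$$
S = \{X_s, Y_s\}, \qquad T = \{X_t, Y_t\}, \qquad
P_s(X) \neq P_t(X), \qquad \mathcal{Y}_s \cap \mathcal{Y}_t = \emptyset,
$$

where the source label set $\mathcal{Y}_s$ contains $C_g$ classes in total.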
As shown in the model figure (Figure 1 of the paper), the whole model involves three stages and three heads.
The three stages:
Train: as the arrows in the figure show, all three heads are used during training; that is, the embedding network is trained with MCT (a distance-based, meta-trained instance classifier), DFMN (a pixel-wise classifier), and the semantic classifier based on global information.
Fine-tuning: only the semantic global classifier and the support instances from the target domain are used to fine-tune the model.
Test: the MCT part, i.e., the meta-trained instance classifier, is used together with the fine-tuned embedding network to predict the labels of the query set.
The three heads:
MCT:
See this paper: "Transductive few-shot learning with meta-learned confidence"
https://arxiv.org/pdf/2002.12017.pdf
The MCT uses a distance-based prototype classifier to make predictions for the query instances.
(使用基于距離的原型分類器對查詢實例進(jìn)行預(yù)測,感覺有點是基于原型網(wǎng)絡(luò)的基礎(chǔ))
DFMN:
used solely in the training stage (the formulas look somewhat like metric learning, but this seems to be standard in transductive learning; I still need to read more about this later.)
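My rough reading of "dense feature matching" as a pixel-wise classifier is sketched below; this is only a guess at the shape of the computation, not verified against the paper's equations, and all names are mine:

```python
import torch
import torch.nn.functional as F

def dense_matching_loss(feat_map, prototypes, labels):
    """Classify every spatial position of the feature map against class prototypes.

    feat_map:   (B, D, H, W) output of the shared embedding network
    prototypes: (C, D), one prototype per class
    labels:     (B,) ground-truth class index per image
    """
    B, D, H, W = feat_map.shape
    pixels = feat_map.permute(0, 2, 3, 1).reshape(B * H * W, D)  # one row per position
    logits = -torch.cdist(pixels, prototypes).pow(2)             # distance-based logits
    dense_labels = labels.repeat_interleave(H * W)               # same label at every position
    return F.cross_entropy(logits, dense_labels)
```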
Note that the two base structures above share the same feature extraction network.
a new global instance-wise prediction head
對于這個預(yù)測頭黍析,我們考慮了所有Cg類上的全局分類問題卖怜。如圖1所示,支持集和查詢集都用于作為該分支的訓(xùn)練輸入橄仍,例如:Loss
Training stage:
The purpose of training is to pre-train an embedding model fθ (i.e., the feature extractor) in the source domain.
Fine-tuning stage:
Given a few-shot learning task in the target domain, we fine-tune the embedding model fθ on the support set by using only the instance-wise prediction head fδ, aiming to adapt fθ to the target domain data.
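Putting the training and fine-tuning stages together as pseudocode-like Python (a sketch only: mct_loss, dfmn_loss, and global_loss are hypothetical callables standing in for the three heads' objectives, not functions from the repo):

```python
import torch

def pretrain(model, source_episodes, mct_loss, dfmn_loss, global_loss, lr=1e-3):
    """Training stage: all three heads jointly supervise the shared embedding."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for episode in source_episodes:
        loss = mct_loss(model, episode) + dfmn_loss(model, episode) + global_loss(model, episode)
        opt.zero_grad(); loss.backward(); opt.step()

def finetune(model, support_set, global_loss, steps=100, lr=1e-2):
    """Fine-tuning stage: only the global head f_delta and the target support set."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(steps):
        loss = global_loss(model, support_set)
        opt.zero_grad(); loss.backward(); opt.step()

# Test stage: predict query labels with the meta-trained MCT head
# (see prototype_predict above) on top of the fine-tuned embedding.
```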
Experiments
Overall, the whole model is an ensemble (multi-head) built on transductive learning.
The experiments show it performs quite well, too.
Ending~
Hoping April's good luck carries over. Keep it up, May!