1 intro
code: https://github.com/ChengHan111/E2VPT
task: parameter-efficient learning
method: effective and efficient visual prompt tuning (E^2VPT)
three types of existing parameter-efficient learning methods (a minimal sketch of all three follows this list):
partial tuning: fine-tune only part of the backbone, e.g., the classification head or the last few layers
extra module: insert learnable bias terms or additional adapter modules into the frozen backbone
prompt tuning: prepend learnable prompt tokens to the input, without changing or fine-tuning the backbone
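A minimal PyTorch sketch of the three paradigms on a toy frozen backbone; the backbone, sizes, and variable names below are illustrative assumptions, not code from the E^2VPT repository:

```python
import torch
import torch.nn as nn

# Toy frozen backbone standing in for a pretrained ViT (hypothetical sizes/names).
embed_dim, num_classes, seq_len, batch = 768, 100, 197, 4
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=embed_dim, nhead=12, batch_first=True),
    num_layers=2,
)
head = nn.Linear(embed_dim, num_classes)
for p in backbone.parameters():              # pretrained weights stay frozen
    p.requires_grad = False

# (a) Partial tuning: only the classification head remains trainable.
partial_params = list(head.parameters())

# (b) Extra module: define a small bottleneck adapter (in practice inserted
#     inside each block) and train only it plus the head.
adapter = nn.Sequential(nn.Linear(embed_dim, 64), nn.GELU(), nn.Linear(64, embed_dim))
extra_params = list(adapter.parameters()) + list(head.parameters())

# (c) Prompt tuning: prepend learnable prompt tokens to the input sequence;
#     the backbone itself is never modified.
num_prompts = 10
prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
x = torch.randn(batch, seq_len, embed_dim)                     # patch embeddings
x_prompted = torch.cat([prompts.expand(batch, -1, -1), x], dim=1)
feats = backbone(x_prompted)                  # (batch, num_prompts + seq_len, dim)
logits = head(feats[:, num_prompts])          # read out the first original token
prompt_params = [prompts] + list(head.parameters())
```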
limitations of existing work:
1) existing methods do not touch the key-value computation at the core of the Transformer's self-attention;
2) existing methods are still not aggressive enough in cutting the number of tunable parameters and the computation cost
2 this paper
main idea:
1) prompt: add learnable visual prompt tokens to the input, and additionally insert learnable tokens into key-value prompts inside self-attention (see the attention sketch after this list)
2) prune: reduce the number of learnable parameters by pruning unnecessary prompt tokens (see the pruning sketch after this list)
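The key-value prompt component can be sketched as follows: small learnable tokens are concatenated to the keys and values (not the queries) of a frozen self-attention layer, so every query can additionally attend to them. The class name, per-head parameterization, and sizes are assumptions for illustration, not the repository's implementation:

```python
import torch
import torch.nn as nn

class KVPromptAttention(nn.Module):
    """Self-attention with learnable key/value prompts (illustrative sketch).

    Learnable prompt tokens are concatenated to the keys and values only,
    so every query can also attend to them; the pretrained q/k/v and output
    projections stay frozen.
    """
    def __init__(self, dim, num_heads=12, num_kv_prompts=5):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)          # pretrained, frozen
        self.proj = nn.Linear(dim, dim)             # pretrained, frozen
        for p in [*self.qkv.parameters(), *self.proj.parameters()]:
            p.requires_grad = False
        # The only new parameters: per-head key and value prompts.
        self.k_prompt = nn.Parameter(torch.randn(num_heads, num_kv_prompts, self.head_dim) * 0.02)
        self.v_prompt = nn.Parameter(torch.randn(num_heads, num_kv_prompts, self.head_dim) * 0.02)

    def forward(self, x):
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)        # each: (B, heads, N, head_dim)
        k = torch.cat([self.k_prompt.expand(B, -1, -1, -1), k], dim=2)
        v = torch.cat([self.v_prompt.expand(B, -1, -1, -1), v], dim=2)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v              # (B, heads, N, head_dim)
        return self.proj(out.transpose(1, 2).reshape(B, N, C))

x = torch.randn(2, 197, 768)
print(KVPromptAttention(768)(x).shape)              # torch.Size([2, 197, 768])
```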
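The pruning step can be sketched as scoring each prompt token (e.g., by the gradient magnitude with respect to a per-token mask) and keeping only the top-scoring tokens before re-tuning. The paper's actual procedure also prunes at the segment level and rewinds the surviving prompts; the helper below is a simplified, hypothetical illustration:

```python
import torch

def prune_prompts(prompts, score, keep_ratio=0.5):
    """Keep only the highest-scoring prompt tokens (illustrative sketch).

    prompts: (num_prompts, dim) learnable tokens.
    score:   (num_prompts,) importance values, e.g. |d loss / d mask|
             for a multiplicative per-token mask variable.
    """
    k = max(1, int(keep_ratio * prompts.shape[0]))
    keep = score.topk(k).indices.sort().values      # indices of surviving tokens
    return prompts[keep].clone(), keep

# Hypothetical usage: importance taken from the gradient w.r.t. a per-token mask.
prompts = torch.nn.Parameter(torch.randn(10, 768) * 0.02)
mask = torch.ones(10, requires_grad=True)
loss = ((mask.unsqueeze(-1) * prompts).sum(dim=0) ** 2).mean()   # stand-in loss
loss.backward()
pruned, kept = prune_prompts(prompts.detach(), mask.grad.abs(), keep_ratio=0.5)
print(pruned.shape, kept.tolist())                  # torch.Size([5, 768]) ...
```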
- Approach: perform efficient tuning on both the visual prompts and the key-value prompts;
Compared baselines & experiments (results figure omitted)