- Attention
- Multi-Head Attention: by analogy with using multiple convolution kernels, the attention computation is repeated several times in parallel and the results are concatenated, so the model attends to the input from multiple representation subspaces at once (see the sketch after the links below).
- Self Attention: attention computed within a single sequence, so that each position attends to every other position of the same input; references:
  - https://blog.csdn.net/malefactor/article/details/50583474
  - https://blog.csdn.net/malefactor/article/details/78767781
  - https://blog.csdn.net/malefactor/article/details/50550211
  - https://spaces.ac.cn/archives/4823
  - https://zhuanlan.zhihu.com/p/53682800
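
The Transformer formulation from *Attention Is All You Need* makes both ideas concrete. Below is a minimal NumPy sketch of multi-head self-attention, assuming a single unbatched input `X` of shape `(seq_len, d_model)`; the function names and the weight matrices `W_q`, `W_k`, `W_v`, `W_o` are illustrative placeholders, not from any particular library.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per head.
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)   # (heads, seq, seq)
    return softmax(scores, axis=-1) @ V                # (heads, seq, d_head)

def multi_head_self_attention(X, W_q, W_k, W_v, W_o, num_heads):
    # Self-attention: Q, K, V are all projections of the same input X.
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    Q, K, V = X @ W_q, X @ W_k, X @ W_v

    # Split each projection into heads -- the "multiple kernels" analogy:
    # each head attends to the input in its own subspace.
    def split(M):
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    heads = scaled_dot_product_attention(split(Q), split(K), split(V))
    # Concatenate the per-head results and mix them with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ W_o

# Usage with random illustrative weights.
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 5, 16, 4
X = rng.standard_normal((seq_len, d_model))
W_q, W_k, W_v, W_o = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4))
print(multi_head_self_attention(X, W_q, W_k, W_v, W_o, num_heads).shape)  # (5, 16)
```

Splitting `d_model` across `num_heads` smaller subspaces keeps the total cost close to single-head attention while letting each head learn a different attention pattern before the results are concatenated.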