Introduction VC aims to convert the non-linguistic information of the sp...
Introduction The ASR system can be categorized into three classes by its ou...
Background Automatic Speech Recognition (ASR) uses both acoustic model (...
Introduction In the previous articles, we learned that the CTC loss makes...
Introduction Keyword Spotting (KWS) aims at detecting predefined key-wor...
Multi-headed Attention A single attention head may place most of its weight in one spot and fail to extract rich information, so several heads need to be combined. Fu...
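Attention Mechanism RNN Encoder-Decoder Model In paper [1], the attention mechanism is developed from the RNN encoder-decoder model. In the RNN encoder-decoder model, the encoder input sequence, is the encoder RNN at...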
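Background In handwriting recognition and speech recognition, the input data and the recognized output differ in length, and both lengths vary. Training a neural network directly requires pre-segmentation and adjustment to obtain the correspondence between them, which is hard to do. CTC...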
Network Architecture It can be divided into 3 parts: Head, Region Proposal Network (RPN), Classification Network R...