PLATINUM: A method to extract excitation signals for voice synthesis system

Roughly two types of systems for voice synthesis have been proposed. One is based on the time domain pitch synchronous overlap-add (TD-PSOLA [2]), which synthesizes a voice using the short time waveform directly extracted from the input signal. The other is based on a vocoder [3], which analyzes a voice in terms of its pitch (fundamental frequency; F0) and timbre (spectral envelope) and synthesizes it with the estimated parameters.

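TD-PSOLA synthesizes speech directly from waveform segments extracted from the audio; a vocoder analyzes the pitch (fundamental frequency) and timbre (spectral envelope) of the speech and synthesizes it from the estimated parameters.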

TD-PSOLA and vocoders have trade-offs. TD-PSOLA synthesizes voice with better quality than vocoders; however, vocoders can manipulate pitch and voice timbre independently.

TD-PSOLA gives better synthesis quality, but a vocoder can control pitch and timbre independently.

STRAIGHT [7] and TANDEM-STRAIGHT [8] have been proposed to solve this problem. They use pitch synchronous analysis [9] to improve the estimation performance of the spectral envelope.

What is pitch synchronous analysis? To be continued.

Furthermore, aperiodicity is used as the parameter to represent not only the periodic signal but also the aperiodic signal.

Aperiodicity (AP) can represent both the periodic signal and the aperiodic signal.

In STRAIGHT and TANDEM-STRAIGHT, the aperiodicity is defined as the spectrum used to synthesize both periodic and aperiodic signals. The periodic and the aperiodic spectra are calculated using the spectral envelope and aperiodicity, and the periodic and aperiodic signals are calculated individually.

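AP is a spectrum from which both the periodic and the aperiodic signals can be generated. The periodic and aperiodic spectra are computed from the spectral envelope and AP, and the periodic and aperiodic signals are computed independently of each other.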

This approach cannot represent the phase of the input voice because the periodic signal is calculated as the minimum phase response, and the vocal tract response $h(t)$ generally includes not only the minimum phase response but also the maximum phase response. To accurately synthesize a voice, it is essential to extract the phase of the input voice. We used a waveform-based parameter as a new parameter instead of aperiodicity.

To be continued.
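To make the phase problem concrete, here is a minimal sketch of how a minimum phase response can be built from a magnitude spectrum by folding the real cepstrum. This is a standard homomorphic technique, not code from the paper; the function name `minimum_phase_spectrum` and the two-tap toy filters are assumptions chosen for illustration. It shows that a maximum-phase signal and its minimum-phase counterpart share the same magnitude, so a method that only keeps the magnitude cannot recover the original phase.

```python
import numpy as np


def minimum_phase_spectrum(magnitude):
    """Minimum-phase spectrum with the given two-sided magnitude (even length).

    Hypothetical helper: uses the folded real cepstrum, a textbook method.
    """
    n = len(magnitude)
    log_mag = np.log(np.maximum(magnitude, 1e-12))
    cepstrum = np.fft.ifft(log_mag).real
    # Keep quefrency 0 and n/2, double the positive side, zero the negative side.
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:n // 2] = 2.0
    fold[n // 2] = 1.0
    return np.exp(np.fft.fft(cepstrum * fold))


n_fft = 1024
h_min = np.zeros(n_fft)
h_min[:2] = [1.0, 0.5]   # toy minimum-phase response (zero inside the unit circle)
h_max = np.zeros(n_fft)
h_max[:2] = [0.5, 1.0]   # toy maximum-phase response with the same magnitude

mag_max = np.abs(np.fft.fft(h_max))
rebuilt = np.fft.ifft(minimum_phase_spectrum(mag_max)).real

# The rebuilt response starts like h_min, not like the h_max we started from.
print(np.round(rebuilt[:2], 4))
```

Starting from the maximum-phase input, the reconstruction lands on the minimum-phase version, which is the loss of phase information the excerpt describes.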

PLATINUM extracts the waveform-based parameter to reconstruct the input voice.

**PLATINUM extracts a waveform-based parameter and uses it to reconstruct the input voice.**

The proposed system equals vocoder-based systems except that it uses the excitation signal instead of aperiodicity, which therefore suggests that it is possible for the proposed system to independently manipulate the F0 and spectral envelope like vocoder-based systems.

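The system is equivalent to vocoder-based systems, except that it replaces AP with the excitation signal. This suggests that the system can control pitch (fundamental frequency) and timbre (spectral envelope) independently.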

The observed spectrum $Y(\omega)$ is defined as the product of the spectral envelope $H(\omega)$ and the target spectrum $X(\omega)$ for reconstructing the waveform. The target spectrum is therefore given by

$$X(\omega) = \frac{Y(\omega)}{H(\omega)}.$$

Since the phase of $H(\omega)$ for vocoder-based systems is generally the minimum phase, the maximum phase of the input voice is included in $X(\omega)$. The power of $X(\omega)$ is nearly flat, provided that the spectral envelope is accurately estimated. If $H(\omega)$ does not include any zeros, the inverse spectrum can be calculated reliably.

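The observed spectrum Y is the product of the spectral envelope H and the target spectrum X used to reconstruct the waveform, so X is obtained as Y divided by H. Because the phase of the spectral envelope is the minimum phase, the maximum phase of the input signal ends up in X. The power of X is nearly flat as long as the spectral envelope is estimated accurately. If H contains no zeros, its inverse can be computed reliably.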
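The relation between Y, H, and X can be sketched with a toy example. This is only an illustration under stated assumptions: the function name `target_spectrum`, the `floor` threshold, and the two-tap signals are hypothetical, and the minimum-phase envelope is given in closed form rather than estimated from a real voice.

```python
import numpy as np


def target_spectrum(observed, envelope, floor=1e-10):
    """Return X = Y / H, refusing to divide when H has (near-)zeros."""
    if np.any(np.abs(envelope) < floor):
        raise ValueError("envelope has (near-)zeros; Y / H is unreliable")
    return observed / envelope


n_fft = 1024
y = np.zeros(n_fft)
y[:2] = [0.5, 1.0]   # toy maximum-phase "input voice" (assumption)
h = np.zeros(n_fft)
h[:2] = [1.0, 0.5]   # minimum-phase envelope with the same magnitude (assumption)

Y = np.fft.fft(y)
H = np.fft.fft(h)
X = target_spectrum(Y, H)

# |X| is flat because H matches |Y| exactly; all the phase of y that H
# cannot carry is now stored in X.
flat_power = np.allclose(np.abs(X), 1.0)

# Multiplying back reproduces the input waveform, phase included.
y_rebuilt = np.fft.ifft(H * X).real
print(flat_power, np.allclose(y_rebuilt, y))
```

In this toy case X is an all-pass spectrum: it has unit power at every frequency and carries only the phase that the minimum-phase envelope leaves out.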

To estimate $X(\omega)$, determining the temporal positions for windowing is an important problem. PLATINUM uses the F0 contour and the waveform. First, the voiced section is estimated based on the F0 contour, and the temporal position with the maximum value of $y(t)^2$ is then extracted as the basic temporal position. The other positions are automatically calculated based on the basic position and the F0 contour.

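When estimating X, the key problem is determining where to place the windows. PLATINUM uses the F0 contour and the waveform. First, the voiced section is estimated from the F0 contour. Then the time position where y(t)^2 reaches its maximum is taken as the basic temporal position, and the remaining positions are computed automatically from the basic position and the F0 contour.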
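The positioning step above can be sketched roughly as follows. The excerpt does not say exactly how the "other positions" are derived, so stepping forward and backward from the basic position by one local pitch period (fs / F0) is an assumption, as are the names `voiced_sections` and `pitch_marks` and the convention that unvoiced frames have F0 = 0.

```python
import numpy as np


def voiced_sections(f0):
    """Return (start_frame, end_frame_exclusive) for each run of f0 > 0."""
    voiced = np.concatenate(([False], np.asarray(f0) > 0, [False]))
    edges = np.flatnonzero(voiced[1:] != voiced[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]


def pitch_marks(y, f0, fs, frame_period):
    """Windowing positions (sample indices) for every voiced section.

    frame_period is the F0 frame spacing in seconds (hypothetical parameter).
    """
    y = np.asarray(y, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    hop = int(round(frame_period * fs))
    marks = []
    for start_f, end_f in voiced_sections(f0):
        lo, hi = start_f * hop, min(end_f * hop, len(y))
        if lo >= hi:
            continue
        # Basic temporal position: maximum of y(t)^2 inside the voiced section.
        base = lo + int(np.argmax(y[lo:hi] ** 2))
        section = [base]
        for direction in (1, -1):
            pos = base
            while True:
                # Step by one pitch period read from the F0 contour at pos.
                step = max(1, int(round(fs / f0[pos // hop])))
                pos += direction * step
                if pos < lo or pos >= hi:
                    break
                section.append(pos)
        marks.extend(sorted(section))
    return marks


# Toy input: 100 Hz pulse train at 16 kHz, voiced only in frames 10..49.
fs, frame_period = 16000, 0.005
f0 = np.concatenate((np.zeros(10), np.full(40, 100.0), np.zeros(10)))
y = np.zeros(len(f0) * 80)
y[900:4000:160] = 1.0
y[1860] = 2.0  # strongest pulse becomes the basic position
marks = pitch_marks(y, f0, fs, frame_period)
print(marks[:3], len(marks))
```

For real speech the F0 contour changes over time, so the step size varies from one position to the next; the toy contour is constant only to keep the result easy to check.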

Summary

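F0 (fundamental frequency) represents how high or low the pitch is: female voices tend to be higher and male voices lower. sp (spectral envelope) represents timbre: a guitar and a piano, for example, have different timbres. ap (aperiodicity) describes how the voice divides into periodic and aperiodic components, as stated in the excerpts above, rather than what is being said; the four tones of pinyin, as in "你好嗎", are carried mainly by the F0 contour. Replacing ap with an extracted excitation signal gives better results.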
