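As my years in Android development have added up, my work has gradually moved from application-layer development to the Android Framework layer. I knew from the start that the Android body of knowledge is huge, but only when you move from the Application layer toward the Framework layer do you realize how little you actually understood. I used to deal mostly with Activity and Fragment and rarely with Service and Broadcast, and I focused on layouts, animation, network requests and the like. An application developer will later care about architecture, performance optimization, Hybrid and so on, but once you touch Framework-layer modules you find the knowledge points are deeply intertwined. Today's topic, Android TTS, is a good example.
Without further ado, here is a diagram of the outline for this post:
I was once inspired by an article on how to explain a technical topic well. It suggests splitting the explanation into two parts, the external application dimension and the internal design dimension; covering both is usually enough to explain a topic thoroughly. I apply the same approach to this write-up.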
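External application dimension
What is TTS
In Android, TTS stands for Text to Speech. The name says what it solves: it is a service that converts text into speech, so you hand it a piece of text and the Android system reads it aloud. Today this is used mostly in voice assistant apps, and many phone vendors build a text-to-speech service into their systems that can read the text on the current page. Reading apps offer it too: WeChat Reading (WeRead) can read the current page aloud, which is handy whenever holding the phone and looking at the screen is inconvenient.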
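TTS technical specification
The work is done by the TextToSpeech class. The steps for using it are:
1. Create a TextToSpeech object, passing an OnInitListener that reports whether initialization succeeded.
2. Set the language and country (Locale) that TextToSpeech should use, and check the return value to see whether the engine supports that language and country.
3. Call speak() or synthesizeToFile().
4. Shut TTS down to release its resources.
XML file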
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="match_parent">
<ScrollView
android:layout_width="match_parent"
android:layout_height="match_parent">
<LinearLayout
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="vertical">
<EditText
android:id="@+id/edit_text1"
android:layout_width="match_parent"
android:layout_height="wrap_content"
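android:text="Hangzhou has a history of more than 2,200 years since the Qin dynasty established it as a county seat, and it was the capital of the Wuyue Kingdom and of the Southern Song dynasty. Famed for its scenery, it is known as “paradise on earth”. Thanks to the Beijing-Hangzhou Grand Canal and its trading port, together with its well-developed silk and grain industries, Hangzhou was historically an important commercial hub." />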
<Button
android:id="@+id/btn_tts1"
android:layout_width="150dp"
android:layout_height="60dp"
android:layout_marginTop="10dp"
android:text="TTS1" />
<EditText
android:id="@+id/edit_text2"
android:layout_width="match_parent"
android:layout_height="wrap_content"
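android:text="Yili publicly reports former founder Zheng Junhuai: several senior officials acted as protective umbrellas. Beijing Youth Daily, 2018-10-24 12:01:46. On the morning of October 24, Yili published a letter on its official corporate website publicly reporting Zheng Junhuai and others. The letter claims that Zheng, having failed to obtain a huge sum of criminal proceeds, used a former deputy procurator-general of the Supreme Procuratorate and others to apply pressure and spread rumors against Yili over a long period, that several provincial-level and bureau-level officials shielded him, erased a criminal record involving 240 million yuan and arranged a sham sentence reduction, and that nobody has dared to handle the matter for 14 years." />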
<Button
android:id="@+id/btn_tts2"
android:layout_width="150dp"
android:layout_height="60dp"
android:layout_marginTop="10dp"
android:text="TTS2" />
<Button
android:id="@+id/btn_cycle"
android:layout_width="150dp"
android:layout_height="60dp"
android:layout_marginTop="10dp"
android:text="Cycle TTS" />
<Button
android:id="@+id/btn_second"
android:layout_width="150dp"
android:layout_height="60dp"
android:layout_marginTop="10dp"
android:text="Second TTS" />
</LinearLayout>
</ScrollView>
</RelativeLayout>
Activity file
public class TtsMainActivity extends AppCompatActivity implements View.OnClickListener,TextToSpeech.OnInitListener {
private static final String TAG = TtsMainActivity.class.getSimpleName();
private static final int THREADNUM = 100; // number of threads used by the cycle test
private EditText mTestEt1;
private EditText mTestEt2;
private TextToSpeech mTTS; // the TTS instance
private XKAudioPolicyManager mXKAudioPolicyManager;
private HashMap<String, String> mParams = null; // parameters passed to speak()
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
mTestEt1 = (EditText) findViewById(R.id.edit_text1);
mTestEt2 = (EditText) findViewById(R.id.edit_text2);
findViewById(R.id.btn_tts1).setOnClickListener(this);
findViewById(R.id.btn_tts2).setOnClickListener(this);
findViewById(R.id.btn_cycle).setOnClickListener(this);
findViewById(R.id.btn_second).setOnClickListener(this);
init();
}
private void init(){
mTTS = new TextToSpeech(this.getApplicationContext(),this);
mXKAudioPolicyManager = XKAudioPolicyManager.getInstance(this.getApplication());
mParams = new HashMap<>();
mParams.put(TextToSpeech.Engine.KEY_PARAM_STREAM, "3"); // audio stream type; 3 is AudioManager.STREAM_MUSIC
}
@Override
public void onInit(int status) {
if (status == TextToSpeech.SUCCESS) {
int result = mTTS.setLanguage(Locale.ENGLISH);
if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
Toast.makeText(this, "Language data missing or language not supported", Toast.LENGTH_SHORT).show();
}
}
}
@Override
public void onClick(View v) {
int id = v.getId();
switch (id){
case R.id.btn_tts1:
TtsPlay1();
break;
case R.id.btn_tts2:
TtsPlay2();
break;
case R.id.btn_second:
TtsSecond();
break;
case R.id.btn_cycle:
TtsCycle();
break;
default:
break;
}
}
private void TtsPlay1(){
if (mTTS != null && !mTTS.isSpeaking() && mXKAudioPolicyManager.requestAudioSource()) {
//mTTS.setOnUtteranceProgressListener(new ttsPlayOne());
String text1 = mTestEt1.getText().toString();
Log.d(TAG, "TtsPlay1-----------text to speak: " + text1);
// Speak. The three-argument speak() was added in API level 4; the four-argument overload was added in API level 21.
mTTS.speak(text1, TextToSpeech.QUEUE_FLUSH, mParams);
}
}
private void TtsPlay2(){
if (mTTS != null && !mTTS.isSpeaking() && mXKAudioPolicyManager.requestAudioSource()) {
//mTTS.setOnUtteranceProgressListener(new ttsPlaySecond());
String text2 = mTestEt2.getText().toString();
Log.d(TAG, "TtsPlay2-----------text to speak: " + text2);
// Set the pitch: higher values sound sharper (more female), lower values sound more male; 1.0 is normal.
mTTS.setPitch(0.8f);
// Set the speech rate; 1.0 is the normal rate.
mTTS.setSpeechRate(1f);
// Speak. The three-argument speak() was added in API level 4; the four-argument overload was added in API level 21.
mTTS.speak(text2, TextToSpeech.QUEUE_FLUSH, mParams);
}
}
private void TtsSecond(){
Intent intent = new Intent(TtsMainActivity.this,TtsSecondAcitivity.class);
startActivity(intent);
}
private void TtsCycle(){
long millis1 = System.currentTimeMillis();
for (int i = 0; i < THREADNUM; i++) {
Thread tempThread = new Thread(new MyRunnable(i, THREADNUM));
tempThread.setName("thread-" + i);
tempThread.start();
}
long millis2 = System.currentTimeMillis();
Log.d(TAG, "Cycle test: time spent starting threads (ms): " + (millis2 - millis1));
}
@Override
protected void onStart() {
super.onStart();
}
@Override
protected void onStop() {
super.onStop();
}
@Override
protected void onDestroy() {
super.onDestroy();
shutDown();
}
private void shutDown(){
if(mTTS != null){
mTTS.stop();
mTTS.shutdown();
}
if(mXKAudioPolicyManager != null){
mXKAudioPolicyManager.releaseAudioSource();
}
}
/**
* Custom Runnable used by the cycle test
* */
class MyRunnable implements Runnable {
private int i; // index of this thread
private int threadNum; // total number of threads created
public MyRunnable(int i, int threadNum) {
this.i = i;
this.threadNum = threadNum;
}
@Override
public void run() {
runOnUiThread(new Runnable() {
@Override
public void run() {
Log.d(TAG, "Running on the main thread, index: " + i + ", total threads: " + threadNum);
if(i % 2 == 0){
Log.d(TAG, "TtsPlay1 index:" + i);
TtsPlay1();
}
else{
Log.d(TAG, "TtsPlay2 index:" + i);
TtsPlay2();
}
}
});
// Sleep on the worker thread. Sleeping inside runOnUiThread() would block the main thread.
try {
Thread.sleep(10000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
public class ttsPlayOne extends UtteranceProgressListener{
@Override
public void onStart(String utteranceId) {
Log.d(TAG, "ttsPlayOne-----------onStart");
}
@Override
public void onDone(String utteranceId) {
Log.d(TAG, "ttsPlayOne-----------onDone");
}
@Override
public void onError(String utteranceId) {
Log.d(TAG, "ttsPlayOne-----------onError");
}
}
public class ttsPlaySecond extends UtteranceProgressListener{
@Override
public void onStart(String utteranceId) {
Log.d(TAG, "ttsPlaySecond-----------onStart");
}
@Override
public void onDone(String utteranceId) {
Log.d(TAG, "ttsPlaySecond-----------onDone");
}
@Override
public void onError(String utteranceId) {
Log.d(TAG, "ttsPlaySecond-----------onError");
}
}
}
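Add the permissions (speak() itself needs none; they matter when synthesizeToFile() writes to external storage):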
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"></uses-permission>
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"></uses-permission>
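TTS best practices
The product I work on at my company is a voice assistant, so I run into TTS playback problems and pitfalls regularly. The common ones are:
- The TTS engine bundled with the system (Pico) does not support Chinese. To speak Chinese you need a third-party engine, such as iFlytek or Baidu.
- After switching to an engine that supports Chinese, text that mixes in English can be handled poorly: some third-party engines spell English words out letter by letter, and some cannot pronounce English at all. Test the engine with mixed-language input.
- When setting TTS parameters, watch the upper limits for speech rate, pitch and tone. Some engines take values in the range 0-100 and others in the range 0-10, so set each parameter according to the value range the specific engine defines.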
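The range mismatch in the last point can be absorbed by one small conversion helper. The following is a minimal sketch, not any engine's real API. It assumes the app keeps Android-style multipliers (1.0 means normal, as with setSpeechRate and setPitch) and that each engine documents an integer range plus a "normal" value. The class name TtsParamMapper, the method toEngineScale and the sample ranges are all hypothetical.

```java
public class TtsParamMapper {

    /**
     * Maps an Android-style multiplier (1.0f = normal) onto an engine-specific
     * integer scale [min, max] whose "normal" setting is `normal`.
     * The bounds are hypothetical; take the real ones from the engine's documentation.
     */
    public static int toEngineScale(float multiplier, int min, int normal, int max) {
        if (multiplier <= 0f) {
            throw new IllegalArgumentException("multiplier must be > 0");
        }
        long scaled = Math.round((double) multiplier * normal);
        // Clamp so that an out-of-range request never reaches the engine.
        return (int) Math.max(min, Math.min(max, scaled));
    }

    public static void main(String[] args) {
        // Engine A (assumed): rate in 0-100, normal = 50.
        System.out.println(toEngineScale(1.0f, 0, 50, 100));
        // Same engine, 3x speed: exceeds the range and is clamped to the maximum.
        System.out.println(toEngineScale(3.0f, 0, 50, 100));
        // Engine B (assumed): pitch in 0-10, normal = 5.
        System.out.println(toEngineScale(0.8f, 0, 5, 10));
    }
}
```

Keeping one multiplier in the app and converting at the engine boundary means that switching engines changes only the three range constants, not the call sites.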
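Usage trends
With the arrival of the Internet of Things and the growing number of IoT devices, voice-assistant style applications will grow too, because voice is a good entry point. Devices are gradually moving from having a display to going without one: many smart devices need no screen, only the ability to recognize speech and play sound. As these applications grow, TTS APIs will be called more often, and I expect Google to keep improving this area.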
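Internal design dimension
Looking at a technology from the outside mostly means getting familiar with its API and with the problems met in real projects, then distilling better practices from them. After the external view, we need to understand how the internals are designed; knowing how something is actually implemented is a basic skill for a developer.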
Goal
The goal of Android TTS is to turn text into spoken audio. How is that implemented? We start the analysis from the constructor of the TextToSpeech class.
The analysis is based mainly on the Android 6.0 source. The relevant classes and interface files are located at:
frameworks/base/core/java/android/speech/tts/TextToSpeech.java
frameworks/base/core/java/android/speech/tts/TextToSpeechService.java
external/svox/pico/src/com/svox/pico/PicoService.java
external/svox/pico/compat/src/com/android/tts/compat/CompatTtsService.java
external/svox/pico/compat/src/com/android/tts/compat/SynthProxy.java
external/svox/pico/compat/jni/com_android_tts_compat_SynthProxy.cpp
external/svox/pico/tts/com_svox_picottsengine.cpp
How it works
Initialization: start with the TextToSpeech class, which has to be initialized before use. It has three constructors, and the one that is ultimately invoked is:
/**
* Used by the framework to instantiate TextToSpeech objects with a supplied
* package name, instead of using {@link android.content.Context#getPackageName()}
*
* @hide
*/
public TextToSpeech(Context context, OnInitListener listener, String engine,
String packageName, boolean useFallback) {
mContext = context;
mInitListener = listener;
mRequestedEngine = engine;
mUseFallback = useFallback;
mEarcons = new HashMap<String, Uri>();
mUtterances = new HashMap<CharSequence, Uri>();
mUtteranceProgressListener = null;
mEnginesHelper = new TtsEngines(mContext);
initTts();
}
The constructor calls initTts(). Here is what that method does:
private int initTts() {
// Step 1: Try connecting to the engine that was requested.
if (mRequestedEngine != null) {
if (mEnginesHelper.isEngineInstalled(mRequestedEngine)) {
if (connectToEngine(mRequestedEngine)) {
mCurrentEngine = mRequestedEngine;
return SUCCESS;
} else if (!mUseFallback) {
mCurrentEngine = null;
dispatchOnInit(ERROR);
return ERROR;
}
} else if (!mUseFallback) {
Log.i(TAG, "Requested engine not installed: " + mRequestedEngine);
mCurrentEngine = null;
dispatchOnInit(ERROR);
return ERROR;
}
}
// Step 2: Try connecting to the user's default engine.
final String defaultEngine = getDefaultEngine();
if (defaultEngine != null && !defaultEngine.equals(mRequestedEngine)) {
if (connectToEngine(defaultEngine)) {
mCurrentEngine = defaultEngine;
return SUCCESS;
}
}
// Step 3: Try connecting to the highest ranked engine in the
// system.
final String highestRanked = mEnginesHelper.getHighestRankedEngineName();
if (highestRanked != null && !highestRanked.equals(mRequestedEngine) &&
!highestRanked.equals(defaultEngine)) {
if (connectToEngine(highestRanked)) {
mCurrentEngine = highestRanked;
return SUCCESS;
}
}
// NOTE: The API currently does not allow the caller to query whether
// they are actually connected to any engine. This might fail for various
// reasons like if the user disables all her TTS engines.
mCurrentEngine = null;
dispatchOnInit(ERROR);
return ERROR;
}
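The order in which initTts() tries engines can be modelled outside Android. The sketch below is an illustration only: the class EngineFallback, the method pickEngine, the installed set and the canConnect predicate (which stands in for connectToEngine) are hypothetical, and com.example.tts and com.missing.tts are made-up package names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

public class EngineFallback {

    /** Returns the engine that was connected, or null when initialization fails. */
    public static String pickEngine(String requested, boolean useFallback,
                                    String defaultEngine, String highestRanked,
                                    Set<String> installed, Predicate<String> canConnect) {
        // Step 1: the engine the caller asked for has the highest priority.
        if (requested != null) {
            if (installed.contains(requested)) {
                if (canConnect.test(requested)) {
                    return requested;
                } else if (!useFallback) {
                    return null;
                }
            } else if (!useFallback) {
                return null;
            }
        }
        // Step 2: the user's default engine.
        if (defaultEngine != null && !defaultEngine.equals(requested)
                && canConnect.test(defaultEngine)) {
            return defaultEngine;
        }
        // Step 3: the highest ranked engine is the last resort.
        if (highestRanked != null && !highestRanked.equals(requested)
                && !highestRanked.equals(defaultEngine)
                && canConnect.test(highestRanked)) {
            return highestRanked;
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> installed = new HashSet<>(Arrays.asList("com.svox.pico", "com.example.tts"));
        Predicate<String> alwaysConnects = engine -> true;
        // The requested engine is missing and fallback is allowed, so the default is used.
        System.out.println(pickEngine("com.missing.tts", true,
                "com.svox.pico", "com.example.tts", installed, alwaysConnects));
    }
}
```

With useFallback set to false, the same call returns null, which corresponds to initTts() dispatching ERROR without trying any other engine.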
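This part is interesting. initTts() tries engines in a fixed order. It first connects to the engine the caller requested (this is the hook that lets an app choose a custom TTS engine instead of the system default). If that engine is not installed or the connection fails, and fallback is allowed, it tries the user's default engine, and finally the highest-ranked engine on the system. So the requested engine has the highest priority, the default engine comes next, and the highest-ranked engine is the last resort. When mUseFallback is false, a failure in the first step ends initialization with ERROR. The connectToEngine method looks like this: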
private boolean connectToEngine(String engine) {
Connection connection = new Connection();
Intent intent = new Intent(Engine.INTENT_ACTION_TTS_SERVICE);
intent.setPackage(engine);
boolean bound = mContext.bindService(intent, connection, Context.BIND_AUTO_CREATE);
if (!bound) {
Log.e(TAG, "Failed to bind to " + engine);
return false;
} else {
Log.i(TAG, "Sucessfully bound to " + engine);
mConnectingServiceConnection = connection;
return true;
}
}
Engine.INTENT_ACTION_TTS_SERVICE has the value "android.intent.action.TTS_SERVICE", so the bind targets the service in the given engine package whose intent filter declares that action. The AndroidManifest.xml under external/svox/pico contains:
<service android:name=".PicoService"
android:label="@string/app_name">
<intent-filter>
<action android:name="android.intent.action.TTS_SERVICE" />
<category android:name="android.intent.category.DEFAULT" />
</intent-filter>
<meta-data android:name="android.speech.tts" android:resource="@xml/tts_engine" />
</service>
The service the system binds to by default is PicoService, which extends CompatTtsService. Its code is:
public class PicoService extends CompatTtsService {
private static final String TAG = "PicoService";
@Override
protected String getSoFilename() {
return "libttspico.so";
}
}
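Now look at CompatTtsService. It is an abstract class whose parent is TextToSpeechService, and it holds a SynthProxy member that is responsible for calling into the C++ layer of the TTS engine, as shown in the figure:
The onCreate() method of CompatTtsService mainly initializes SynthProxy: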
@Override
public void onCreate() {
if (DBG) Log.d(TAG, "onCreate()");
String soFilename = getSoFilename();
if (mNativeSynth != null) {
mNativeSynth.stopSync();
mNativeSynth.shutdown();
mNativeSynth = null;
}
// Load the engineConfig from the plugin if it has any special configuration
// to be loaded. By convention, if an engine wants the TTS framework to pass
// in any configuration, it must put it into its content provider which has the URI:
// content://<packageName>.providers.SettingsProvider
// That content provider must provide a Cursor which returns the String that
// is to be passed back to the native .so file for the plugin when getString(0) is
// called on it.
// Note that the TTS framework does not care what this String data is: it is something
// that comes from the engine plugin and is consumed only by the engine plugin itself.
String engineConfig = "";
Cursor c = getContentResolver().query(Uri.parse("content://" + getPackageName()
+ ".providers.SettingsProvider"), null, null, null, null);
if (c != null){
c.moveToFirst();
engineConfig = c.getString(0);
c.close();
}
mNativeSynth = new SynthProxy(soFilename, engineConfig);
// mNativeSynth is used by TextToSpeechService#onCreate so it must be set prior
// to that call.
// getContentResolver() is also moved prior to super.onCreate(), and it works
// because the super method don't sets a field or value that affects getContentResolver();
// (including the content resolver itself).
super.onCreate();
}
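Next, look at the SynthProxy constructor. The Java side does little by itself: the class has a static initializer that loads the ttscompat native library, so SynthProxy is only a proxy and the real work is done by native C++ methods.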
/**
* Constructor; pass the location of the native TTS .so to use.
*/
public SynthProxy(String nativeSoLib, String engineConfig) {
boolean applyFilter = shouldApplyAudioFilter(nativeSoLib);
Log.v(TAG, "About to load "+ nativeSoLib + ", applyFilter=" + applyFilter);
mJniData = native_setup(nativeSoLib, engineConfig);
if (mJniData == 0) {
throw new RuntimeException("Failed to load " + nativeSoLib);
}
native_setLowShelf(applyFilter, PICO_FILTER_GAIN, PICO_FILTER_LOWSHELF_ATTENUATION,
PICO_FILTER_TRANSITION_FREQ, PICO_FILTER_SHELF_SLOPE);
}
As the constructor shows, it calls native_setup to initialize the engine, and native_setup is implemented in the C++ layer (com_android_tts_compat_SynthProxy.cpp).
The key statement there is engine->funcs->init(engine, __ttsSynthDoneCB, engConfigString);. The init function it reaches is in com_svox_picottsengine.cpp:
/* Google Engine API function implementations */
/** init
* Allocates Pico memory block and initializes the Pico system.
* synthDoneCBPtr - Pointer to callback function which will receive generated samples
* config - the engine configuration parameters, here only contains the non-system path
* for the lingware location
* return tts_result
*/
tts_result TtsEngine::init( synthDoneCB_t synthDoneCBPtr, const char *config )
{
if (synthDoneCBPtr == NULL) {
ALOGE("Callback pointer is NULL");
return TTS_FAILURE;
}
picoMemArea = malloc( PICO_MEM_SIZE );
if (!picoMemArea) {
ALOGE("Failed to allocate memory for Pico system");
return TTS_FAILURE;
}
pico_Status ret = pico_initialize( picoMemArea, PICO_MEM_SIZE, &picoSystem );
if (PICO_OK != ret) {
ALOGE("Failed to initialize Pico system");
free( picoMemArea );
picoMemArea = NULL;
return TTS_FAILURE;
}
picoSynthDoneCBPtr = synthDoneCBPtr;
picoCurrentLangIndex = -1;
// was the initialization given an alternative path for the lingware location?
if ((config != NULL) && (strlen(config) > 0)) {
pico_alt_lingware_path = (char*)malloc(strlen(config));
strcpy((char*)pico_alt_lingware_path, config);
ALOGV("Alternative lingware path %s", pico_alt_lingware_path);
} else {
pico_alt_lingware_path = (char*)malloc(strlen(PICO_LINGWARE_PATH) + 1);
strcpy((char*)pico_alt_lingware_path, PICO_LINGWARE_PATH);
ALOGV("Using predefined lingware path %s", pico_alt_lingware_path);
}
return TTS_SUCCESS;
}
At this point the TTS engine is initialized.
Now for the call path. TTS is normally invoked through the speak() method of TextToSpeech, so let's follow its execution:
public int speak(final CharSequence text,
final int queueMode,
final Bundle params,
final String utteranceId) {
return runAction(new Action<Integer>() {
@Override
public Integer run(ITextToSpeechService service) throws RemoteException {
Uri utteranceUri = mUtterances.get(text);
if (utteranceUri != null) {
return service.playAudio(getCallerIdentity(), utteranceUri, queueMode,
getParams(params), utteranceId);
} else {
return service.speak(getCallerIdentity(), text, queueMode, getParams(params),
utteranceId);
}
}
}, ERROR, "speak");
}
The important part is runAction():
private <R> R runAction(Action<R> action, R errorResult, String method,
boolean reconnect, boolean onlyEstablishedConnection) {
synchronized (mStartLock) {
if (mServiceConnection == null) {
Log.w(TAG, method + " failed: not bound to TTS engine");
return errorResult;
}
return mServiceConnection.runAction(action, errorResult, method, reconnect,
onlyEstablishedConnection);
}
}
This delegates to the runAction method of mServiceConnection:
public <R> R runAction(Action<R> action, R errorResult, String method,
boolean reconnect, boolean onlyEstablishedConnection) {
synchronized (mStartLock) {
try {
if (mService == null) {
Log.w(TAG, method + " failed: not connected to TTS engine");
return errorResult;
}
if (onlyEstablishedConnection && !isEstablished()) {
Log.w(TAG, method + " failed: TTS engine connection not fully set up");
return errorResult;
}
return action.run(mService);
} catch (RemoteException ex) {
Log.e(TAG, method + " failed", ex);
if (reconnect) {
disconnect();
initTts();
}
return errorResult;
}
}
}
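The runAction pattern above (run a callback against the remote service under a lock, and return a caller-supplied error value when the service is missing or the call fails) can be sketched in plain Java. Everything here is a stand-in: RemoteFailure replaces android.os.RemoteException, Service replaces ITextToSpeechService, and the class name RunActionSketch is hypothetical. The reconnect handling of the original is left out.

```java
public class RunActionSketch {

    /** Stand-in for the checked RemoteException thrown by binder calls. */
    public static class RemoteFailure extends Exception {}

    /** Stand-in for the remote service interface. */
    public interface Service {
        int speak(String text) throws RemoteFailure;
    }

    /** A unit of work that needs a connected service. */
    public interface Action<R> {
        R run(Service service) throws RemoteFailure;
    }

    public static final int SUCCESS = 0;
    public static final int ERROR = -1;

    private final Object lock = new Object();
    private Service service; // null until connected

    public void connect(Service s) {
        synchronized (lock) {
            service = s;
        }
    }

    /** Runs the action, or returns errorResult if not connected or the call fails. */
    public <R> R runAction(Action<R> action, R errorResult) {
        synchronized (lock) {
            if (service == null) {
                return errorResult;
            }
            try {
                return action.run(service);
            } catch (RemoteFailure e) {
                return errorResult;
            }
        }
    }

    public static void main(String[] args) {
        RunActionSketch tts = new RunActionSketch();
        Action<Integer> speakHello = s -> s.speak("hello");
        System.out.println(tts.runAction(speakHello, ERROR)); // not connected yet
        tts.connect(text -> SUCCESS);
        System.out.println(tts.runAction(speakHello, ERROR)); // connected
    }
}
```

The benefit of the pattern is that every public method (speak, stop, setLanguage and so on) shares one place for the null check, the locking and the exception handling.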
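runAction ends by invoking action.run(mService), which in turn calls service.speak() or, when the text was mapped to a sound with addSpeech(), service.playAudio(). The service here is PicoService, which extends the abstract class CompatTtsService, which in turn extends the abstract class TextToSpeechService.
Both calls therefore land in TextToSpeechService, in its binder object mBinder. playAudio() is shown below; speak() follows the same pattern but wraps the request in a SynthesisSpeechItemV1 instead of an AudioSpeechItemV1: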
@Override
public int playAudio(IBinder caller, Uri audioUri, int queueMode, Bundle params,
String utteranceId) {
if (!checkNonNull(caller, audioUri, params)) {
return TextToSpeech.ERROR;
}
SpeechItem item = new AudioSpeechItemV1(caller,
Binder.getCallingUid(), Binder.getCallingPid(), params, utteranceId, audioUri);
return mSynthHandler.enqueueSpeechItem(queueMode, item);
}
Next, mSynthHandler.enqueueSpeechItem(queueMode, item) runs:
/**
* Adds a speech item to the queue.
*
* Called on a service binder thread.
*/
public int enqueueSpeechItem(int queueMode, final SpeechItem speechItem) {
UtteranceProgressDispatcher utterenceProgress = null;
if (speechItem instanceof UtteranceProgressDispatcher) {
utterenceProgress = (UtteranceProgressDispatcher) speechItem;
}
if (!speechItem.isValid()) {
if (utterenceProgress != null) {
utterenceProgress.dispatchOnError(
TextToSpeech.ERROR_INVALID_REQUEST);
}
return TextToSpeech.ERROR;
}
if (queueMode == TextToSpeech.QUEUE_FLUSH) {
stopForApp(speechItem.getCallerIdentity());
} else if (queueMode == TextToSpeech.QUEUE_DESTROY) {
stopAll();
}
Runnable runnable = new Runnable() {
@Override
public void run() {
if (isFlushed(speechItem)) {
speechItem.stop();
} else {
setCurrentSpeechItem(speechItem);
speechItem.play();
setCurrentSpeechItem(null);
}
}
};
Message msg = Message.obtain(this, runnable);
// The obj is used to remove all callbacks from the given app in
// stopForApp(String).
//
// Note that this string is interned, so the == comparison works.
msg.obj = speechItem.getCallerIdentity();
if (sendMessage(msg)) {
return TextToSpeech.SUCCESS;
} else {
Log.w(TAG, "SynthThread has quit");
if (utterenceProgress != null) {
utterenceProgress.dispatchOnError(TextToSpeech.ERROR_SERVICE);
}
return TextToSpeech.ERROR;
}
}
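The effect of the queue mode in enqueueSpeechItem can be shown with a small model. This is a simplified sketch, not the framework code: it tracks only pending items (the real stopForApp also stops the item that is currently playing if it belongs to the caller), callers are plain strings instead of binder identities, and the class SpeechQueueSketch is hypothetical. The two constants use the same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SpeechQueueSketch {
    public static final int QUEUE_FLUSH = 0;
    public static final int QUEUE_ADD = 1;

    private static final class Item {
        final String caller;
        final String text;

        Item(String caller, String text) {
            this.caller = caller;
            this.text = text;
        }
    }

    private final Deque<Item> pending = new ArrayDeque<>();

    /** QUEUE_FLUSH drops the caller's own pending items first; QUEUE_ADD only appends. */
    public void enqueue(int queueMode, String caller, String text) {
        if (queueMode == QUEUE_FLUSH) {
            pending.removeIf(item -> item.caller.equals(caller));
        }
        pending.addLast(new Item(caller, text));
    }

    /** Texts still waiting to be spoken, in playback order. */
    public List<String> pendingTexts() {
        List<String> texts = new ArrayList<>();
        for (Item item : pending) {
            texts.add(item.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        SpeechQueueSketch queue = new SpeechQueueSketch();
        queue.enqueue(QUEUE_ADD, "appA", "one");
        queue.enqueue(QUEUE_ADD, "appB", "two");
        queue.enqueue(QUEUE_ADD, "appA", "three");
        queue.enqueue(QUEUE_FLUSH, "appA", "four"); // removes "one" and "three" only
        System.out.println(queue.pendingTexts());
    }
}
```

The point to take away is that QUEUE_FLUSH is scoped to the calling app: items queued by other apps stay in place.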
The important call is speechItem.play():
/**
* Plays the speech item. Blocks until playback is finished.
* Must not be called more than once.
*
* Only called on the synthesis thread.
*/
public void play() {
synchronized (this) {
if (mStarted) {
throw new IllegalStateException("play() called twice");
}
mStarted = true;
}
playImpl();
}
protected abstract void playImpl();
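play() and playImpl() form a template method: the base class enforces the "play only once" rule, and the subclass decides what playing means. Which playImpl() runs therefore depends on which item type the binder method created. A plain-Java sketch follows; the classes Item, AudioItem and SynthesisItem are hypothetical stand-ins for the framework's speech item classes, and play() returns a String here only so the result can be inspected (the real method returns void).

```java
public class SpeechItemSketch {

    /** Base class: owns the once-only guard, delegates the actual work. */
    public abstract static class Item {
        private boolean started;

        public final String play() {
            synchronized (this) {
                if (started) {
                    throw new IllegalStateException("play() called twice");
                }
                started = true;
            }
            return playImpl();
        }

        protected abstract String playImpl();
    }

    /** Stand-in for an item that plays a pre-recorded sound. */
    public static class AudioItem extends Item {
        @Override
        protected String playImpl() {
            return "play pre-recorded audio";
        }
    }

    /** Stand-in for an item that asks the engine to synthesize text. */
    public static class SynthesisItem extends Item {
        @Override
        protected String playImpl() {
            return "synthesize text";
        }
    }

    public static void main(String[] args) {
        System.out.println(new AudioItem().play());
        System.out.println(new SynthesisItem().play());
    }
}
```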
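The actual playback is implemented in playImpl(). The playAudio() method in TextToSpeechService creates an AudioSpeechItemV1, while speak() on the same binder creates a SynthesisSpeechItemV1.
For text synthesis, the playImpl() that play() runs is therefore the one in SynthesisSpeechItemV1: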
@Override
protected void playImpl() {
AbstractSynthesisCallback synthesisCallback;
mEventLogger.onRequestProcessingStart();
synchronized (this) {
// stop() might have been called before we enter this
// synchronized block.
if (isStopped()) {
return;
}
mSynthesisCallback = createSynthesisCallback();
synthesisCallback = mSynthesisCallback;
}
TextToSpeechService.this.onSynthesizeText(mSynthesisRequest, synthesisCallback);
// Fix for case where client called .start() & .error(), but did not called .done()
if (synthesisCallback.hasStarted() && !synthesisCallback.hasFinished()) {
synthesisCallback.done();
}
}
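playImpl calls onSynthesizeText, which is an abstract method. Note that it is passed a synthesisCallback, through which the engine hands the synthesized audio back. Where is the method implemented? In CompatTtsService, the subclass of TextToSpeechService. Here is the implementation: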
@Override
protected void onSynthesizeText(SynthesisRequest request, SynthesisCallback callback) {
if (mNativeSynth == null) {
callback.error();
return;
}
// Set language
String lang = request.getLanguage();
String country = request.getCountry();
String variant = request.getVariant();
if (mNativeSynth.setLanguage(lang, country, variant) != TextToSpeech.SUCCESS) {
Log.e(TAG, "setLanguage(" + lang + "," + country + "," + variant + ") failed");
callback.error();
return;
}
// Set speech rate
int speechRate = request.getSpeechRate();
if (mNativeSynth.setSpeechRate(speechRate) != TextToSpeech.SUCCESS) {
Log.e(TAG, "setSpeechRate(" + speechRate + ") failed");
callback.error();
return;
}
// Set speech
int pitch = request.getPitch();
if (mNativeSynth.setPitch(pitch) != TextToSpeech.SUCCESS) {
Log.e(TAG, "setPitch(" + pitch + ") failed");
callback.error();
return;
}
// Synthesize
if (mNativeSynth.speak(request, callback) != TextToSpeech.SUCCESS) {
callback.error();
return;
}
}
Finally we are back in the Pico engine shipped with the system. mNativeSynth.speak() ends up in the speak function in com_android_tts_compat_SynthProxy.cpp:
static jint
com_android_tts_compat_SynthProxy_speak(JNIEnv *env, jobject thiz, jlong jniData,
jstring textJavaString, jobject request)
{
SynthProxyJniStorage* pSynthData = getSynthData(jniData);
if (pSynthData == NULL) {
return ANDROID_TTS_FAILURE;
}
initializeFilter();
Mutex::Autolock l(engineMutex);
android_tts_engine_t *engine = pSynthData->mEngine;
if (!engine) {
return ANDROID_TTS_FAILURE;
}
SynthRequestData *pRequestData = new SynthRequestData;
pRequestData->jniStorage = pSynthData;
pRequestData->env = env;
pRequestData->request = env->NewGlobalRef(request);
pRequestData->startCalled = false;
const char *textNativeString = env->GetStringUTFChars(textJavaString, 0);
memset(pSynthData->mBuffer, 0, pSynthData->mBufferSize);
int result = engine->funcs->synthesizeText(engine, textNativeString,
pSynthData->mBuffer, pSynthData->mBufferSize, static_cast<void *>(pRequestData));
env->ReleaseStringUTFChars(textJavaString, textNativeString);
return (jint) result;
}
This completes the TTS call path.
TTS strengths and weaknesses
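The implementation shows that Android ships with a built-in TTS engine. It also shows that we can write a custom TTS engine: extend the abstract class TextToSpeechService, implement its abstract methods, and declare the service with the android.intent.action.TTS_SERVICE intent filter. This is what makes third-party engines possible. Because the default system engine does not support Chinese, the better TTS products on the market generally integrate a third-party vendor such as iFlytek or Nuance.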
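The core of a custom engine is its onSynthesizeText implementation, which streams audio back through the callback it is given. The following plain-Java sketch models only that contract and is not Android code: Callback is a simplified stand-in for android.speech.tts.SynthesisCallback (the real start() also takes an audio format and a channel count), and the "engine" is hypothetical, producing 10 ms of silence per character instead of speech.

```java
import java.util.ArrayList;
import java.util.List;

public class CustomEngineSketch {

    /** Simplified stand-in for the framework's synthesis callback. */
    public interface Callback {
        int getMaxBufferSize();
        void start(int sampleRateHz);
        void audioAvailable(byte[] buffer, int offset, int length);
        void done();
        void error();
    }

    /** Hypothetical engine: emits 10 ms of 16-bit mono silence per character. */
    public static void synthesize(String text, Callback callback) {
        if (text == null || text.isEmpty()) {
            callback.error();
            return;
        }
        final int sampleRate = 16000;
        byte[] pcm = new byte[text.length() * (sampleRate / 100) * 2];
        callback.start(sampleRate);
        // Hand the audio over in chunks no larger than the callback accepts.
        int max = callback.getMaxBufferSize();
        for (int offset = 0; offset < pcm.length; offset += max) {
            callback.audioAvailable(pcm, offset, Math.min(max, pcm.length - offset));
        }
        callback.done();
    }

    /** Records the callback sequence so it can be inspected. */
    public static class Recorder implements Callback {
        public final List<String> events = new ArrayList<>();
        public int bytes;

        @Override public int getMaxBufferSize() { return 1024; }
        @Override public void start(int sampleRateHz) { events.add("start"); }
        @Override public void audioAvailable(byte[] buffer, int offset, int length) {
            events.add("audio");
            bytes += length;
        }
        @Override public void done() { events.add("done"); }
        @Override public void error() { events.add("error"); }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        synthesize("hello", recorder);
        System.out.println(recorder.events + ", bytes=" + recorder.bytes);
    }
}
```

A real engine would perform the same start, audioAvailable, done sequence inside its TextToSpeechService subclass, and call error() when synthesis fails, as CompatTtsService does.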
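From this we can summarize the strengths and weaknesses of Android TTS.
Strengths: the interface is well defined, with a complete set of API methods. It is also extensible: you can build your own TTS engine for your business needs while staying compatible with the native interface.
Weaknesses: the native system TTS engine supports only a limited set of languages, and it currently does not support multiple instances or multiple channels.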
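Evolution trends
As voice becomes the entry point for more IoT devices, TTS synthesis and playback technology will keep maturing, and the related native Android interfaces will become more capable. TTS should therefore continue to grow in importance.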
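Summary
In short, for any technical topic: start from the documentation, move on to hands-on practice, then optimize and summarize in practice to arrive at a best-practice approach. Of course we should not settle for knowing what works without knowing why, so we also need to study the implementation behind it: what the strengths and weaknesses are, which scenarios suit it and which do not, and how it is likely to evolve. Having gone through this whole process, you can say you have a solid grasp of the topic.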